Christopher Douce

Teaching, Learning and Assessment of Databases

Edited by Christopher Douce, Wednesday, 30 Jan 2019, 12:30

The 10th Teaching, Learning and Assessment of Databases (TLAD) workshop was held at the University of Hertfordshire on 9 July 2012. The University of Hertfordshire is one of those places that I have heard quite a lot about (from some friends and colleagues who have both visited and worked there), but until 9 July, I had never had the opportunity to visit. 

Although databases isn't my core subject, it is one that I do have an interest in, having been a software developer for quite a few years before joining the university. Plus, the subject of databases (and their development) certainly crosses over with another big interest of mine, which is the psychology of computer programming. Enough about me and my interests, and onto a summary of the event.

An effective higher education academy

Karen Fraser, who works in academic development within the HEA, kicked off the day. Karen once worked as a lecturer in computer science at the University of Ulster, before joining the HEA. Karen talked about the objectives of the HEA and its current areas of focus. These include the issue of employability amongst computing graduates and also supporting, promoting and developing teaching (and teacher) excellence.

Other areas of interest include flexible learning, understanding mobility centred learning (a term that I had not come across before), and sustainable development in the sector.  Another area of focus includes supporting institutional strategy and change.

There were two other key parts to Karen's introduction: funding opportunities that the HEA can offer both individuals and academics, and mechanisms to accredit the teaching and skills of individuals. In terms of funding, there are the teaching development grants, individual grants, departmental bids, collaborative bids and strategic development bids. Anyone who is interested in finding out more should, of course, visit their website.

In terms of accrediting or recognising individuals, the HEA runs what is called a fellowship scheme, where individuals can apply and submit evidence regarding their skills and practice. I didn't know this (or, I had forgotten), but there is also something called a senior fellow scheme. Karen also mentioned the National Teaching Fellowship Scheme (NTFS) and the UK Professional Standards Framework (UKPSF).

On the subject of teaching quality, Karen drew our attention to a report entitled Dimensions of Quality by Professor Graham Gibbs. Apparently one of the main conclusions was that who performs the teaching was considered to be a more important measure of quality than the number of contact hours.

Towards the end of her talk, Karen briefly mentioned something called the HEA's 2016 strategic plan. The key points I noted were the aims to provide effective support to teachers and those involved in teaching and learning, to increase capacity and reward excellence, and to influence national policy.

Analyzing the influence of SQL teaching and learning methods and approaches

The first paper presentation of the day was by Huda Al-Shuaily from Glasgow University. Huda presented what was a small section of her doctoral research. Huda drew our attention to earlier research by Ogden, who presented a three stage cognitive model of working with SQL. The three stages were query formulation, query translation and query writing. Huda considered that an additional category named query comprehension was perhaps necessary.

For each of these stages, Huda considered different issues. For successful query formulation an understandable context is necessary (or a set of appropriate examples or situations that are used to teach the concepts of databases) to help learners. For query translation, where students convert queries between English and SQL, the ambiguity of English can be a particular difficulty. For query writing, knowing something about the strategies that novices may adopt may be useful too; it was recommended that teachers emphasise the 'what' and 'how'. An important point was: it is perhaps a good idea to teach students to read SQL before teaching them how to write SQL.

One of the most interesting parts of her presentation was when she began to talk about patterns and SQL.  I have used generic programming patterns and had heard that they have been applied to other related areas such as usability, but never before databases.  Huda mentioned something called a 'self-join' pattern, which is one of a number of patterns that could be taught to students.
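
To give a flavour of what a 'self-join' pattern might look like, here is a minimal sketch using Python's built-in sqlite3 module. The staff table, its columns and the sample rows are my own assumptions for illustration; they are not taken from Huda's paper.

```python
import sqlite3

# Hypothetical table: each member of staff may report to another member of staff.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        manager_id INTEGER REFERENCES staff(id)
    );
    INSERT INTO staff VALUES (1, 'Ada', NULL);
    INSERT INTO staff VALUES (2, 'Ben', 1);
    INSERT INTO staff VALUES (3, 'Cat', 1);
    INSERT INTO staff VALUES (4, 'Dan', 2);
""")

# The self-join: the same table appears twice under two aliases,
# once in the role of 'employee' and once in the role of 'manager'.
SELF_JOIN = """
    SELECT employee.name, manager.name
    FROM staff AS employee
    JOIN staff AS manager ON employee.manager_id = manager.id
    ORDER BY employee.name
"""

reports = conn.execute(SELF_JOIN).fetchall()
for employee_name, manager_name in reports:
    print(f"{employee_name} reports to {manager_name}")
```

The part that a pattern would presumably draw attention to is the pair of aliases: a learner has to see one table playing two roles. Note that Ada, who has no manager, does not appear in the output because an inner join is used.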

The question and answer section immediately opened up a number of interesting debates. Regarding the subject of patterns there was some debate as to whether we ought to be teaching students general problem solving approaches rather than higher level abstractions such as patterns. Another debate related to the type of data that we have within our datasets that are used to teach the underlying concepts. Should we use real data (or, at least, real data that has been manipulated to avoid disclosure of sensitive records), or artificial made up data?

Temporal support in relational databases

Bernadette Marie Byrne from the University of Hertfordshire spoke about temporal support in relational databases whilst at the same time giving us some useful background information and presenting a case study. Temporal databases were described as databases that are capable of recording what data has changed and when. Apparently, debates were occurring in the SQL standards bodies about extending SQL to cater for temporal data when the focus of discussions changed due to the arrival of XML. Some database vendors such as Oracle, however, have implemented certain temporal extensions.

A case study that Bernadette describes centres on a motorcycle and cycle hire business.  It is necessary to record when items are hired (and when they are returned), as well as knowing when items are available for hire.  An added complication is that 'partial hires' can be performed: some bicycles can be hired for, say, two days, and then swapped for another to ensure that an original customer hire request is satisfied.

It was clear to me that such a scenario (which I understand was drawn from a real-life situation) was one that was pretty tough to implement and would clearly show the challenges of working with time-centric data. Another interesting consideration that sprang to my mind is the question of 'where do we write the code?' In some cases we should rely on the functions of the database to solve our problem, whereas on other occasions we might want to write more program logic to cater for all the different situations that we come across. Knowing where (and how) to write code is, of course, a part of the artistry of computer programming.
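
As a small illustration of the kind of time-centric query that a hire scenario calls for, here is a sketch of an availability check where the database does the work. It uses Python's sqlite3 module; the table names, columns and dates are my own assumptions rather than anything from Bernadette's case study, and it deliberately ignores the 'partial hire' complication.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, description TEXT NOT NULL);

    -- Dates are stored as ISO 8601 text so that string order matches date order.
    -- A hire covers the half-open period [start_date, end_date).
    CREATE TABLE hire (
        item_id    INTEGER NOT NULL REFERENCES item(id),
        start_date TEXT NOT NULL,
        end_date   TEXT NOT NULL
    );

    INSERT INTO item VALUES (1, 'Road bike');
    INSERT INTO item VALUES (2, 'Mountain bike');
    INSERT INTO item VALUES (3, 'Scooter');
    INSERT INTO hire VALUES (1, '2012-07-09', '2012-07-11');
    INSERT INTO hire VALUES (2, '2012-07-12', '2012-07-14');
""")


def available_items(conn, req_start, req_end):
    """Return descriptions of items with no hire overlapping [req_start, req_end)."""
    rows = conn.execute(
        """
        SELECT description FROM item
        WHERE NOT EXISTS (
            SELECT 1 FROM hire
            WHERE hire.item_id = item.id
              AND hire.start_date < :req_end    -- existing hire starts before the request ends
              AND :req_start < hire.end_date    -- and the request starts before that hire ends
        )
        ORDER BY description
        """,
        {"req_start": req_start, "req_end": req_end},
    ).fetchall()
    return [description for (description,) in rows]


print(available_items(conn, "2012-07-10", "2012-07-12"))
```

Here the overlap test lives in the query. Satisfying one request by swapping between several items is the sort of rule that might be easier to express in program logic instead, which is the 'where do we write the code?' question again.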

Roadmap for modernising database curricula

Jagdev Bhogal and Kathleen Maitland from Birmingham City University gave a very thought provoking presentation about what we need to do, or could do, to enhance the current database curricula. Kathleen argued that databases are ubiquitous. On one hand, you might be accessing a server hosted database through a call from a mobile app. On the other hand, your mobile app may contain its own database or data store of some kind.

One of the perceived problems is that databases are taught in bite size chunks in isolation from other modules. Kathleen also argued that ideally modules should be connected together in some way and emphasised the need for different members of faculty to talk to each other. Getting staff to work together has the potential to help students create a portfolio of work (perhaps even functioning applications) that can be demonstrated to employers.

Employability is, of course, very important and curriculum design should directly address employability skills. One such skill is professional writing and communication. One approach to developing professional skills is to teach using substantial case studies such as those relating to the retail, banking, and government sectors. Using case studies opens up the possibility of making use of very large databases and understanding the contexts in which they are situated.

Topics that may be included in modules are data modelling, data acquisition, approaches for data storage (including different ways of using mass storage devices, as well as saving data to the cloud), data searching (of both structured and unstructured data), processing, performance and security (which can include addressing subjects such as authentication and defence in depth).

The final conclusions that I've noted are that employability skills are necessarily important and that it is also important to get employers involved.  It is also important to consider how to improve the student experience by creating realistic scenarios. It also helps students to create assessment portfolios which can be used to demonstrate technical skills and abilities.

Research-informed and enterprise-motivated: Data mining and warehousing for final year undergraduates

Jung Lu from Southampton Solent University gave a presentation that focused on the teaching of data mining. Jung highlighted that students had to consider a number of advanced research topics, including XQuery, Weka (data mining), databases in the cloud, Oracle Apex, distributing and replicating data, accessing and manipulating data programmatically, and PL/SQL (Wikipedia) (stored procedures).

I made a note of a key point that related to the importance of practice.  It is necessary to ensure that students have sufficient time and resources to engage with practice activities and tasks before moving onto formally assessed activities.  'Screen time', as I call it, can give students confidence as well as experience that can stand them in good stead when it comes to the work place.

Subjects such as data warehousing and OLAP (Wikipedia) were said to be taught using a case study and a guest lecture (the importance of case studies being an issue that featured again later in the workshop). Towards the end of the presentation, professional certifications were also mentioned. Finally, a connection to employability skills, particularly SFIA, the Skills Framework for the Information Age, was mentioned. This framework may be able to offer some guidance about which skills may be particularly relevant or useful.

The teaching of relational on-line analytical processing (ROLAP) in advanced database courses

Bernadette Marie Byrne and Iftikhar Ansari, both from the University of Hertfordshire, talked about how to teach ROLAP, which is a database extension that I had never heard of before. They began by referring to a very large dataset which had just under a million rows. Other important considerations included that of performance.

As well as ROLAP being a new term to me, I was also introduced to a second one, which was 'star schema design'. I think my unfamiliarity with these terms relates more to my background of using small to medium sized databases, rather than large and extensive data sets. One point was very clear: having hands-on practical experience was something that was considered to be both important and necessary for students.
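
For anyone else who is new to the term, here is a minimal sketch of what a star schema might look like: a central 'fact' table of measures surrounded by 'dimension' tables, queried with ordinary relational joins and a GROUP BY (which is, as I understand it, the essence of the relational approach to OLAP). The sales example, table names and figures are entirely made up for illustration, and the sketch uses Python's sqlite3 module rather than anything shown in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the 'what' and 'when' of each fact.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT NOT NULL);
    CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year     INTEGER NOT NULL);

    -- The fact table sits at the centre of the 'star' and holds the measures.
    CREATE TABLE fact_sales (
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
        amount     INTEGER NOT NULL
    );

    INSERT INTO dim_product VALUES (1, 'Books');
    INSERT INTO dim_product VALUES (2, 'Music');
    INSERT INTO dim_date    VALUES (1, 2011);
    INSERT INTO dim_date    VALUES (2, 2012);
    INSERT INTO fact_sales  VALUES (1, 1, 10);
    INSERT INTO fact_sales  VALUES (1, 2, 20);
    INSERT INTO fact_sales  VALUES (2, 2, 5);
    INSERT INTO fact_sales  VALUES (2, 2, 7);
""")

# Summarise the facts along two dimensions: product category and year.
ROLLUP = """
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON f.product_id = p.product_id
    JOIN dim_date    AS d ON f.date_id    = d.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
"""

totals = conn.execute(ROLLUP).fetchall()
for category, year, total in totals:
    print(category, year, total)
```

With four rows this is trivial; with just under a million, the performance considerations mentioned in the talk presumably start to matter.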

Introducing NoSQL into the database curriculum

The first ever database systems I used were based around the xBase language: early PC based databases such as dBase, Clipper and FoxPro (which was back in the very early nineties). From there I was introduced to the rigours of SQL, which is one of those languages that I've used off and on throughout my programming career.

Clare Stanier from Staffordshire University introduced what was to me a set of new database developments and innovations that had passed me by, namely NoSQL (or, perhaps post-SQL) databases: systems that enable users to more readily store unstructured data, perhaps in the form of documents. Clare reminded us that in the early days of databases there were many different types. Over time the SQL-based relational model approach became dominant. Clare argued that we're now living in a database environment which is increasingly diverse.

The relational approach requires us to clearly structure our data. Whilst this can allow us to carry out complex queries, it can be difficult to create databases which can readily accommodate changing types of data. NoSQL databases (NoSQL.org) permit weaker concurrency models and (I guess) you might also argue that some of them are more weakly typed.
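
To make the point about changing types of data a little more concrete, here is a toy sketch of the document idea in plain Python. It is emphatically not how MongoDB, CouchDB or any real NoSQL product works; the collection, the insert and find functions and the sample documents are all hypothetical. It only shows that documents stored side by side need not share the same fields.

```python
import json

# A toy 'collection': each document is stored as a JSON string, and documents
# in the same collection need not share the same fields.
collection = []


def insert(document):
    """Store a document (a dict) without checking it against any schema."""
    collection.append(json.dumps(document))


def find(field, value):
    """Return every document whose top-level field equals the given value."""
    matches = []
    for raw in collection:
        document = json.loads(raw)
        if document.get(field) == value:
            matches.append(document)
    return matches


insert({"type": "advert", "title": "Bicycle for sale", "price": 50})
insert({"type": "advert", "title": "Flat to rent", "rooms": 2, "tags": ["london"]})
insert({"type": "user", "name": "chris"})

adverts = find("type", "advert")
print([advert["title"] for advert in adverts])
```

In a relational design, adding the 'rooms' or 'tags' fields would mean changing a table definition (or adding new tables). Here nothing changes, but equally nothing stops a document from being stored with a misspelt or missing field.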

Clare introduced us to a number of different databases.  Two notable ones include MongoDB which is apparently used to drive Craigslist, and CouchDB.  Apparently these two database projects have similar underlying objectives but there is a healthy rivalry between the two groups (which is no bad thing).

Another database (again, one that I had not heard of before) is Cassandra.  NoSQL databases have clearly made it into the mainstream.  Amazon have developed a database called SimpleDB, which can be used as a part of their cloud services.  Of course, cloud based databases have their own advantages and disadvantages, and developers always need to be mindful of these. Another aspect of NoSQL databases is that they have the potential to more readily (and perhaps easily) integrate with internet applications.  With some systems it might be easy to issue queries over REST (Wikipedia), for instance.

Clare made a very good point, which was that the TLAD community and lecturers who are involved in teaching databases and related subjects need to have a debate about what is taught in the database curriculum and the extent to which NoSQL databases need to feature. 

The distinctions between NoSQL and SQL databases remind me of a simplistic distinction between programming languages. On one hand there are strictly typed languages, such as Java, which require you to define everything. On the other there are languages such as Perl which are weakly typed and allow developers to get into all kinds of muddles (whilst at the same time permitting certain categories of problems to be solved quickly and effectively, when such tools are placed in skilled hands). There are, of course, other languages (and language mechanisms) in between. I have little doubt that SQL and NoSQL databases may influence each other, but it remains the programmer's and designer's challenge to choose the most appropriate tool for the task in hand.

A ten-year review of workshops

David Nelson from University of Sunderland and Renuga Jayakumar from University of Wales Trinity Saint David presented an analysis of papers presented at TLAD over the last ten years.  David also attempted to present his view of what we might have to teach in the future (whilst also accepting that predicting the future is always a dangerous thing to try and do!)

Some of the broad themes that are covered in the workshop have included database design methods, e-learning tools, curriculum research, student diversity and assessment methods.  Some of the very early papers presented techniques for the automated assessment of database designs.  Over the years, technologies such as OpenMark (Open University) have matured.

Since the inception of TLAD, a range of new technologies, such as XML, have emerged and have been increasingly applied in different situations. With XML it is necessary to understand the fundamentals before fully appreciating its significance within the world of databases. Papers regarding e-learning have included presentations about games, class participation, recording of lectures and how to best facilitate 'out of hours' learning.

Looking towards the future, we might see curriculum changes to take further account of transaction processing, system and data recovery, security, cloud computing and physical aspects of system design.  Mobility and non-relational databases as well as subjects such as data warehousing are considered to be significant subjects.

During the closing discussion, I also noted down the name of a resource that was new to me, namely, the Database Disciplinary Commons which is hosted by the University of Kent.

Reflections

I think this is my second TLAD workshop; the previous one that I attended was held at the University of Greenwich. I enjoyed my first one and I enjoyed this one too. I remain of the opinion that databases is a tough subject to teach, but one that is fundamentally very important to computer science education. Lecturers need to convey fundamental concepts which, to some, may be significantly difficult to grasp. The challenge becomes even more acute when we move on to more advanced subjects where issues such as software and hardware architecture need to be considered. Security, of course, is another topic that is very important and there is a necessary connection between databases and the teaching of programming.

One point that I remember from my own database education (much of it acquired 'on the job' whilst working in industry) was that there were so many different ways to solve a problem. I remember being presented with different techniques and having to make a decision about how to apply them. Should I create a database abstraction layer for my application or use stored procedures, for example? In my programming career I've even seen the horror of SQL intertwined with HTML tags! Thankfully, the prevalence of design patterns, particularly MVC, has gone a long way to emphasise the importance of separating out different aspects of an application.
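
As a rough sketch of the kind of separation I have in mind (not full MVC, and with a made up course table and function names that are purely illustrative), the following keeps the SQL in one function and the HTML in another, using Python's sqlite3 and html modules:

```python
import sqlite3
from html import escape


def fetch_course_titles(conn, level):
    """Data access layer: the only place that knows about SQL."""
    rows = conn.execute(
        "SELECT title FROM course WHERE level = ? ORDER BY title", (level,)
    ).fetchall()
    return [title for (title,) in rows]


def render_course_list(titles):
    """View: the only place that knows about HTML."""
    items = "".join(f"<li>{escape(title)}</li>" for title in titles)
    return f"<ul>{items}</ul>"


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (title TEXT NOT NULL, level INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?)",
    [("Databases", 3), ("Web & Mobile", 3), ("Programming", 1)],
)

page_fragment = render_course_list(fetch_course_titles(conn, 3))
print(page_fragment)
```

The query uses a placeholder rather than string concatenation, and the view escapes whatever it is given, so each half can be changed (or tested) without touching the other.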

All these ruminations suggest an important subject, which is how to most effectively convey best practice to our students.  Understanding the most appropriate ways to design systems and databases comes after acquiring fundamental skills.  This again connects to the view that teaching databases is a tough thing to do.

For me, there were two highlights of this TLAD. The first was becoming aware of more on-line resources relating to learning and teaching (and being introduced to new technical terms), and the second was being introduced to the concept of NoSQL. My next challenge is to try to find some time to explore these new software technologies. I hope I will be able to find the time and opportunity to do this.

Addendum

A few years after publishing this post, I was contacted by a reader, who mentioned that they had a website about the teaching of PL/SQL that contained a number of useful tutorials. If anyone is interested, here's a link to Ben Brumm's PL/SQL tutorials (Databasestar website).

Christopher Douce

TLAD (Teaching, Learning and Assessment of Databases)

Edited by Christopher Douce, Tuesday, 15 Nov 2011, 11:36

I always enjoy visiting Manchester.  Having spent many years there (both as an undergraduate and a postgraduate) visiting Manchester almost feels as if I'm coming home.  Plus, travelling directly to the computer science building (as a computer scientist) feels as if I'm returning to a 'spiritual home'!

The reason for my most recent visit was to attend the 9th TLAD workshop, which was all about the teaching, learning and assessment of databases.  Below is a summary of the event.  I hope it is useful for someone.

Paper Session 1 - Data Mining

The first paper that was presented during the day was entitled 'Teaching Oracle Data Miner using Virtual Machines', by Qicheng Yu and Preeti Patel, both from London Metropolitan University, and expertly presented by Preeti.

Preeti made the point that employers are demanding up to date practical and technical skills. Employability was, of course, an issue that was discussed within an earlier HEA event entitled Enhancing the Employability of Computing Students. To address the important issue of technical skills, educators are necessarily faced with the challenge of how to enable learners to make use of industrial tools and products. One of the solutions (in the area of database and data mining education) was to make use of virtual machine technology, such as Microsoft Virtual PC, Oracle VirtualBox and a product called VMware Player.

Some of the challenges that have to be addressed are technical, i.e. how to share drives through a host computer, and others are financial or legal, such as how to make sure that any solutions are correctly licensed. Preeti pointed us to a product known as WEKA, which I had never heard of, but it seemed that many of the audience had!

The second paper of the morning was by Hongbo Du from the University of Buckingham. His paper, entitled 'Data mining project: a critical element in teaching learning and assessment of a data mining module', won the 'best paper' prize of the workshop. Hongbo mentioned what seemed to be an important paper in the area, namely, something called the CRISP-DM standard (wikipedia), which enables students to gain an understanding of the data mining process. Hongbo also presented a general framework for assessment (within his paper) which drew upon this methodology before presenting three different case studies.

The key points that I took away from his presentation were that group work is important (but very often students don't want to do this since they might want to own their own scores completely), and that the domain of application is very important too, representing a substantial area of complexity that students necessarily need to grapple with.

The final presentation of the first session, entitled, 'Making data warehousing accessible' was by Tony Valsamidis from the University of Greenwich.  Tony began by presenting some of the problems, such as the availability of large data sets (an issue that I shall return to later), unwieldy query languages and unfamiliar domain and data models. 

Tony spoke of a number of different technologies and techniques, some I had used in anger (such as Visual Basic for Applications), others I had heard of but had forgotten the meaning of (such as OLAP).  An important issue that was raised was that of data sanitization: when you move data between different systems, things might not be directly compatible, so database practitioners might have to design some magic data transformations.

Paper Session 2 - Database and the Cloud

The fourth presentation was rather different. Mark Dorling from Langley Grammar School described his teaching practice and association with a project called Digital School House.

Mark trained as a primary school teacher but he is now working within the secondary sector.  He described how he helps students to understand the key concepts of data, information and logical operators, sometimes applying kinaesthetic learning techniques (i.e. movement).  Mark also described the use of something called independent learning videos (allowing students to remember how to use elements of applications).  This reminded me of the industrial term of 'on-demand e-learning', where learners can call up screen casts or bite sized presentations about how to carry out or complete different tasks.

Mark also showed how secondary school pupils could make use of a cloud application, such as Google Spreadsheet, to enable an entire class to enter data (and for the data to magically appear on an interactive whiteboard). Mark's presentation (and paper) got me thinking about how I might potentially adopt some of the pedagogic techniques that he described into my own teaching practices.

Mark's description of how he makes use of Google Spreadsheet led us directly to Clare Stanier's presentation, which was entitled 'Teaching Database for the Cloud'. One of the things that I really liked about Clare's presentation was that she addressed a question I was already mulling over, namely, 'how is it possible to define cloud databases?' She gave an answer that related to some NIST definitions (pdf).

A cloud database can be considered in terms of Infrastructure as a Service, Platform as a Service and Software as a Service (such as Google Spreadsheet).  Depending on their task, developers will interface with different products (and different levels of abstraction).  Clare pointed us to interesting sounding products such as Microsoft Azure and Oracle on Demand, neither of which I had ever heard of before. 

This is obviously a rapidly changing field!  If you require any further references to either cloud related papers (or products), Clare might be able to share a useful URL.

Paper Session 3 - Embedding Technology

Jackie Campbell from the University of Leeds began the final session by presenting a paper entitled 'Inquiry based learning database oriented applications for computing and computing forensic students'. Two activities or tools were described. The first was an SQL Quiz application, where students were challenged to compose correct SQL statements. The second was more of a software maintenance task, where students were asked to investigate and carry out a number of fixes to an existing application developed using something called Oracle Apex.
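
I don't know how the SQL Quiz application decided whether a statement was 'correct', but one plausible approach (and this is only my own sketch, with a hypothetical module table and function name) is to run the student's query and a reference query against the same sample data and compare the rows that come back:

```python
import sqlite3
from collections import Counter


def same_result(conn, student_sql, reference_sql):
    """Return True if both queries produce the same rows, ignoring row order."""
    try:
        student_rows = conn.execute(student_sql).fetchall()
    except sqlite3.Error:
        return False  # a statement that fails to run cannot be correct
    reference_rows = conn.execute(reference_sql).fetchall()
    # Counter compares the rows as multisets, so duplicates still count.
    return Counter(student_rows) == Counter(reference_rows)


# Hypothetical sample data for a single quiz question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE module (code TEXT PRIMARY KEY, credits INTEGER NOT NULL);
    INSERT INTO module VALUES ('DB1', 30);
    INSERT INTO module VALUES ('DB2', 60);
    INSERT INTO module VALUES ('WEB', 30);
""")

REFERENCE = "SELECT code FROM module WHERE credits = 30"

print(same_result(conn, "SELECT code FROM module WHERE credits < 60", REFERENCE))
print(same_result(conn, "SELECT code FROM module", REFERENCE))
print(same_result(conn, "SELEKT code FROM module", REFERENCE))
```

There are obvious limits to this. Two queries that agree on one small data set are not necessarily equivalent (the first student answer above only matches because of the sample rows chosen), and a real quiz would also need to stop students from submitting statements that change or delete the data.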

I personally consider maintenance activities to be really valuable for a number of reasons. Firstly, maintenance is a substantial on-going challenge. Database designs are likely to be particularly affected if software applications or businesses merge. Businesses, of course, continually change and evolve (as must the software systems that support them). Understanding how (and where) to change or correct database queries may require students to navigate their way through unfamiliar systems. This, in itself, is likely to be an intellectually challenging task.

The presentation by Craig McCreath (supported by his supervisor Petra Leimich) reminded me of an earlier presentation about the JISC WILD project that was held at the HEA mobile event a number of weeks ago. The underlying ideas were very similar: using mobile devices to elicit responses from students to (attempt to) assess their understanding of materials. Craig did a fabulous job at presenting, and I had a sense that the application he had developed for his final year project was easy to use.

The final presentation was 'Using Video to Provide Richer Feedback to Database Assessment' by Howard Gould, Leeds Metropolitan University. Howard addressed three important questions: what kind of feedback would most effectively benefit our students, how do we make the best use of our time to provide this feedback, and how might we practically provide video feedback?

Howard's paper was, in essence, a practice paper; I think this type of paper can be really valuable.  One of the challenges that lecturers face is to offer effective and useful feedback on entity relationship diagrams, showing students alternative database designs.  I sense that providing feedback on any kind of non-written notation is something that is intrinsically very difficult (and can include notational systems such as mathematics and music).  Howard solves the problem by recording screen captures after having spent some time initially looking at a student submission.  Practical challenges include file sizes (the videos themselves can be very big), and on-screen flicker (since an inexpensive digital camera is used as opposed to an expensive licence for Camtasia).

Discussion

After all the paper sessions a discussion was initiated by Alistair Monger (Southampton Solent), David Nelson (University of Sunderland) and Charles Boisuert (Sheffield Hallam).

Alistair pointed us towards a number of useful assessment resources, beginning by mentioning that the QAA code of practice requires formative assessment. He also mentioned REAP, an abbreviation for Re-Engineering Assessment Practices in Higher Education, followed by TESTA, Transforming the Experience of Students through Assessment (JISC). Alistair mentioned the importance of having an audit trail of feedback, and mentioned a system called GradeMark.

David raised the perennial issue of collusion and plagiarism and made the point that assessment should always be at such a level that it is difficult for students to quickly 'find answers'.  One solution might be to have time constrained assessments, perhaps in a computer lab (something that I remember from my days as a computing undergraduate), and the production of a portfolio which shows evidence of understanding.

Charles pointed us towards the idea of Nifty assessments.  Charles mentioned the point that markers mark in different ways and that there will be variability.  Another point was how to help those students who always seem to struggle with the subject.

The ensuing discussion took us into issues such as the importance of good feedback (and explaining the importance of why certain subjects are assessed), differences between both individual students and cohorts, and a reference to something called the Database Commons Initiative.

Summary

Although I'm not directly involved with the teaching of databases and database technologies I found this to be a very interesting event for a number of different reasons.  The first is the difference in the variety of database related topics that are now taught; things are rather different from my undergraduate days when database education began (and ended) with SQL syntax and learning about different types of joins.  The domain is now a lot richer than it ever was. 

There are now different levels of 'cloud' databases, small embedded databases, huge data warehouses, object-oriented databases and XML databases.  Other themes that might have been perhaps discussed are the connections between database teaching and software design, alerting students to issues such as making database abstraction layers and stored procedures (but perhaps these issues have been explored in earlier workshops).

The second reason also relates to this issue of richness. Software engineers and system designers are now faced with a myriad of different choices in terms of architectures (how to set up large systems) as well as products, both commercial and open source. Understanding what different products do and how they might support business objectives are issues that software professionals always need to bear in mind. These 'industrially connected' points relate to the issue of certification. The teaching and learning of databases is, perhaps, now an endeavour that is shared between academia and industry. Perhaps the role of higher education teaching and learning in this area is to provide a useful context to help students to get to grips with the more practical dimension of professional certifications.

Another thought that came to mind was that there was a degree of useful cross over with other HEA events, particularly the joint employability and computing forensics event that I attended. During the forensics part of that day I remember a fair amount of discussion about the sharing of 'data sets' which could be used to enable students to hone their computing forensic skills. It struck me that database technology educators are faced with a similar challenge.

A final comment is that I personally consider the subject of databases to be a pretty fundamental area of computing education. I have to mention, of course, the Open University's own database course, M359 Relational databases: theory and practice.

Database education is an area that is necessarily rich: students will be exposed to different languages, problem domains and wider system architectures and designs as soon as they set to work on 'real world' applications. Databases represent a domain of software technology that ultimately helps 'to get jobs done'. Where would we be without them?
