Christopher Douce

A sketch of M813 Software Development and M814 Software Engineering

Edited by Christopher Douce, Tuesday, 26 Nov 2024, 18:20

After becoming the module chair of TM354 Software Engineering, I had a look at two related postgraduate modules, M813 Software Development and M814 Software Engineering.

These two modules sit alongside a number of other modules that make up the MSc in Computing programme. My intention was to see what related topics and subjects are taught, and whether there were any notable differences about how they were taught. 

This blog aims to highlight some of the key elements of these modules. To prepare this post, I had a good look through the module materials, including the assessment materials, and spoke with each of the module chairs. My intention in looking at these modules is to identify what themes and topics might potentially feed into a future replacement of TM354, or another related module. This summary is by no means comprehensive; the points I pick up on do, of course, reflect my interests.

I hope these notes are useful to anyone who is interested in either software engineering, or postgraduate computing, or both. Towards the end of the blog, I share a quick compare and contrast between the two modules and share some links to resources for anyone who might be interested.

M813 Software Development

M813 aims “to provide the skills and knowledge necessary to develop software in accordance with current professional practice, approaches and techniques”.

The key module learning aims are to:

  • teach you a variety of fundamental techniques for software development across the software lifecycle, and to provide practice in the use of these techniques
  • give you enough knowledge to be able to choose between different development techniques appropriate for a software development context
  • make you aware of design and technology trade-offs involved in developing enterprise software systems
  • enable you to evaluate current software development practices
  • give you an understanding of current and emerging issues in software development
  • give you the research skills needed to stay at the leading edge of software development.

The module description suggests that students “will have an opportunity to engage with an organisational problem of your choice, working towards a fit-for-purpose software solution” and students “will also have an opportunity to carry out some independent research into issues in software development, including analysing, evaluating and presenting results”.

It makes use of a set text, Head First Design Patterns, accessed through the university library. To help students with the more technical bits, it shares some resources about a graphical tool, Visual Paradigm, which enables students to create diagrams using the Unified Modelling Language (UML).

The module has 10 units of study, which are spread over four blocks. The module’s assessment strategy is summarised below, followed by each of the blocks.

Assessment strategy

Like many other modules, there are two parts of assessment: tutor marked assessments (TMAs), and an examinable component, which is an end of module assessment (EMA). Interestingly, the TMAs adopt a more practical and software development skills perspective, whereas the EMA is more about carrying out research which is applied to a study context. To pass the module, students need to gain an average score of 50% in both of the components.

TMAs 1 and 3 each account for 30% of the continuously assessed part of the module. Due to the practical focus of TMA 2, this assessment accounts for 40% of the overall TMA score.

Block 1: Software development and early lifecycle

This block is described as helping to “learn the principles and techniques of early software lifecycle, from requirements and domain analysis to software specification. You will engage with a number of practices, including capturing and validating requirements, and UML (Unified Modelling Language) modelling with activity and class diagrams.”

The module opens with a research activity which involves finding and reading academic articles. There are three other research activities which build on this first searching activity. These activities help students to understand what the academic study of software engineering looks like. Plus, when working as a practising software engineer, it’s important to know how to find and evaluate information about methods, approaches, and frameworks.

This unit begins to introduce students to a tool that they will use during the module: Visual Paradigm. Throughout the module, students will learn more about different UML diagrams, such as use cases, class diagrams, and activity diagrams.

Unit 1, introducing software development, shares a couple of perspectives: a philosophical perspective and a historical perspective (history is always useful), before mentioning risk, quality and then moving onto starting to look at UML.

Unit 2, requirements and use cases, covers the characteristics of requirements and the forms in which they can be presented. Unit 3, from the context to the system, starts with activity diagrams (which are all about representing a context) and moves through to class diagrams, which are all about beginning to realise a design of software using abstractions. Finally, unit 4, specifying what the system should do, touches on more formal aspects of software specification.

Block 2: Design and code

This next block explores “principles and techniques of software design, construction, testing and version control”. Other topics include design patterns, UML modelling with state diagrams and the creation of software using the Java language. Out of all the blocks in the module, this is the one that has a really practical focus.

In addition to links to further video tutorials about Visual Paradigm, there’s some guidance about how to start to use Microsoft Visual Studio Code, and some initial development activities.

Unit 5, design, introduces some basic design principles, and new forms of diagram: communication diagrams and object diagrams. Unit 6, from design to code, shares a bit more detail about the principles of object-oriented programming, and goes on to introduce the topic of configuration management. Unit 7, design patterns, continues the theme of object-oriented programming by introducing a set of patterns from the Gang of Four text, which is complemented by a software development activity.
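
To give a flavour of what a Gang of Four pattern looks like in code, here is a minimal sketch of the Strategy pattern. It is written in Python for brevity, whereas the module itself works in Java, and every name in it (`ShippingCost`, `flat_rate`, `by_weight`) is invented for illustration rather than taken from the module materials.

```python
from dataclasses import dataclass
from typing import Callable

# A strategy is any function that maps a parcel weight (kg) to a cost.
# (Hypothetical example domain, not from the module materials.)
PricingStrategy = Callable[[float], float]


def flat_rate(weight_kg: float) -> float:
    """Charge the same amount whatever the weight."""
    return 5.0


def by_weight(weight_kg: float) -> float:
    """Charge a base fee plus an amount per kilogram."""
    return 2.0 + 1.5 * weight_kg


@dataclass
class ShippingCost:
    """Context object: delegates the calculation to the chosen strategy."""
    strategy: PricingStrategy

    def quote(self, weight_kg: float) -> float:
        return self.strategy(weight_kg)


if __name__ == "__main__":
    for strategy in (flat_rate, by_weight):
        print(strategy.__name__, ShippingCost(strategy).quote(4.0))
```

The point of the pattern is that `ShippingCost` never needs to change when a new pricing rule is added; a new function is passed in instead.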

Block 3: Software architectures and systems integration

Block 3 goes up a level to explore how to “develop software solutions based on software architectures and frameworks”. 

Unit 8, software architectures, introduces the notion of architectural patterns, and how to model patterns using UML. Another useful topic introduced is state machines. An important theme that is highlighted is the idea of layers of software which, in turn, is linked to the notion of persistence (which means ‘how data can be saved’). This is complemented by unit 9, component-based architectures, which offers a specific example. The module concludes with unit 10, service-oriented architectures.

Block 4: EMA preparation

This fourth block relates to the module’s end of module assessment (EMA), where students have to carry out some applied research into a software context with which they are familiar. To help students to prepare, there are some useful preparatory resources.

Reflections

I really liked that this module brings in a bit of history, describing the history of object-oriented programming. I also liked that it shared some really useful descriptions about the differences between scholarship and research. There are some common elements between M813 and TM354, such as requirements and the use of UML, but I’ll say more about this in a later section.

M814 Software Engineering

M814 is “about advanced concepts and techniques used throughout the software life cycle” and replaces two earlier 15 point modules: M882 Software Project Management and M883 Requirements Engineering.

The module aims are to:

  • develop your ability in the critical evaluation of the theories, practices and systems used in a range of areas of Computing
  • provide you with a specialised area of study in order that you can experience and develop the frontiers of practice and research in focused aspects of Computing and its application
  • encourage you, through the provision of appropriate educational activities, to develop study and transferable skills applicable to your employment and continuing professional development
  • enable you to develop a deeper understanding of a specialist area of Computing and to contribute to future developments in the field.

Although this module is less ‘applied’ than M813, there are some important practical elements. Students make use of Git and GitHub, and use a simulation and modelling tool, InsightMaker.

The module has four study blocks, containing 26 study units; a lot more than M813. These are summarised in the following sections. Students are also required to consult a set text, Mastering the requirements process by Robertson and Robertson, which is also available through the OU Library.

Assessment strategy

The module has three TMAs and an end of module exam, which is taken remotely (as opposed to an EMA). TMAs 1 and 3 have a weighting of 30% each, with TMA 2 being slightly more substantial, accounting for 40%. Students have to pass both the TMAs and the exam, gaining an average of 50% in each.
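
The weighting described above can be sketched as a small calculation. This is only an illustration of the arithmetic: the function names are my own, and I am assuming that ‘an average of 50%’ means a weighted TMA score of at least 50 and an exam score of at least 50, with no rounding rules applied.

```python
# Weights for TMA 1, TMA 2 and TMA 3, expressed as whole percentages.
TMA_WEIGHTS = (30, 40, 30)


def weighted_tma_score(tma_scores):
    """Return the weighted average of three TMA scores (each out of 100)."""
    if len(tma_scores) != len(TMA_WEIGHTS):
        raise ValueError("expected one score per TMA")
    total = sum(weight * score for weight, score in zip(TMA_WEIGHTS, tma_scores))
    return total / 100


def passes_module(tma_scores, exam_score, threshold=50):
    """Assumed pass rule: both components must reach the threshold."""
    return weighted_tma_score(tma_scores) >= threshold and exam_score >= threshold


if __name__ == "__main__":
    print(weighted_tma_score([60, 50, 40]))
    print(passes_module([60, 50, 40], exam_score=55))
```

Note that a strong TMA average does not compensate for a weak exam under this reading, since each component is checked separately.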

The exam covers all module learning outcomes and is split into two sections. For the second section students would have needed to be familiar with a research article.

Block 1: Software engineering context

The first two units, unit 1, software in the information society and unit 2, the organisational and business context, introduce software engineering. This is followed by an introduction to the organisational context through unit 3, organisational context, codes and standards. The title of this unit refers to professional codes, and professional and technical standards. Accompanying topics include software and the law, which includes intellectual property, trademarking, patents, and data protection (GDPR) legislation. The final unit, unit 4, addresses ethics and values in software engineering.

Block 2: Software engineering methods and processes

Block 2 concerns software engineering methods and processes. The first two units highlight the notion of the process model, project management, and quality management, which includes the ISO 9001 standard and the Capability Maturity Model (CMMI). These are presented in unit 6, software activities and unit 7, software engineering processes.

The module then covers unit 8, agile processes and unit 9, managing resources, which includes materials about Scrum, Kanban, and something called the SAFe framework, which is a set of workflow patterns for implementing agile practices. There is also a case study which describes how agile is used in practice. I remember seeing some photographs that show how developers have been sharing information about project status using whiteboards and other displays. The block concludes with unit 10, managing uncertainty and risk, and unit 11, software quality.

A part of this block makes use of simulation, introducing a ‘simulation modelling tool’ which can be used to experiment with the concept of Brooks’ law. As an aside, this reminds me of a short article https://ppig.org/papers/2002-ppig-14th-hales/ that touched on a similar topic. In the context of M814, I like how the idea of simulation has been applied in an interesting and pedagogically helpful way.
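
To show the kind of experiment that a simulation of Brooks’ law makes possible, here is a deliberately crude sketch. It is not the model used in M814 or InsightMaker; the function names and the numbers are assumptions of mine. It captures only one of the effects Brooks described, the growth in communication paths as a team gets bigger, and ignores others such as the time needed to bring new people up to speed.

```python
def communication_channels(team_size: int) -> int:
    """Number of distinct pairs of people who may need to talk to each other."""
    return team_size * (team_size - 1) // 2


def effective_output(team_size: int,
                     output_per_person: float = 10.0,
                     cost_per_channel: float = 1.0) -> float:
    """Toy model: raw output minus an overhead for every communication channel.

    Both parameters are invented values chosen to make the effect visible.
    """
    raw = team_size * output_per_person
    overhead = cost_per_channel * communication_channels(team_size)
    return raw - overhead


if __name__ == "__main__":
    for size in (1, 5, 10, 15, 20):
        print(size, effective_output(size))
```

With these made-up numbers, output rises as the team grows from 1 to 10 people, then falls again, because the overhead grows with the square of the team size while the raw output only grows linearly.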

Block 3: Software deployment and evolution

Block 3 concerns software deployment and evolution. In other words, what happens after implementation. It includes some materials about DevOps (the integration of development with the operation of software), and continuous integration and delivery. There are three units: unit 12, software configuration management, which introduces Git and GitHub, unit 13, software deployment, and unit 14, software maintenance and evolution.

This block returns to simulation, specifically exploring Lehman’s 2nd law (Wikipedia), which means that software complexity increases unless something is done to reduce it. Students are also directed to a textbook, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, by Humble and Farley.
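
The idea behind Lehman’s 2nd law can be sketched in a few lines. Again, this is a toy of my own making rather than the module’s simulation: ‘complexity’ is treated as a single number, and the growth and refactoring figures are arbitrary assumptions.

```python
def simulate_complexity(releases: int,
                        growth_per_release: float = 5.0,
                        refactoring_effort: float = 0.0,
                        initial: float = 100.0) -> list:
    """Return the complexity after each release, starting with the initial value.

    Each release adds complexity; refactoring effort removes some of it.
    Complexity is not allowed to fall below zero.
    """
    history = [initial]
    for _ in range(releases):
        next_value = history[-1] + growth_per_release - refactoring_effort
        history.append(max(next_value, 0.0))
    return history


if __name__ == "__main__":
    print(simulate_complexity(3))
    print(simulate_complexity(3, refactoring_effort=5.0))
```

With no refactoring the number climbs with every release; when the refactoring effort matches the growth, it stays level.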

Block 4: Back to the beginning

The final block returns to the beginning by looking at requirements engineering, extensively drawing on the set text, Mastering the Requirements Process. It introduces what is meant by requirements engineering, a subtopic within software engineering. Unit titles for this block include scoping the business problem, functional and non-functional requirements, fit criteria and rationale, ensuring quality of requirements, and reusing requirements. The block concludes with a useful section: unit 26, current trends in software engineering.

Reflections

I really liked the introductory sections to this module; they adopt a philosophical tone. I also really like how it uses case studies. What is notable is that there are a lot of materials to get through, but all the topics and units are certainly appropriate and are needed to cover the module in a good amount of depth.

Similarities and differences

There is understandably some crossover between M813 and M814; they complement each other. M813 is more of an ‘applied’ module than either M814 or TM354, but M814 does contain a few practical elements. Its use of simulations is particularly interesting. In comparison to the undergraduate software engineering module, TM354, the two postgraduate modules do clearly require the application of higher academic skills, such as understanding what it means to carry out scholarship.

In my opinion, there appear to be more similarities between M813 and TM354 than with M814. It is worth noting that TM354 introduces topics that can be found in both postgraduate modules.

TM354 and M813 both emphasise design patterns. An important difference is that in M813, students are required to demonstrate how patterns might be applied, whereas on TM354 students have to demonstrate their understanding of design patterns that have been chosen by the module team. Both modules also explore the notions of software architectures and state machines.

There are differences between TM354 and M813 in terms of tools. TM354 steers away from the use of diagramming tools, but by way of contrast, M813 makes extensive use of Visual Paradigm. TM354 makes use of NetBeans for the design patterns task, whereas M813 introduces students to Visual Studio Code.

By way of contrast, M814 covers a wider variety of concepts which are important to the building of ‘software in the large’, such as the importance of software maintenance and the characteristics of software quality.

UML is featured in all three modules. They all refer to software development methods and requirements engineering. Significantly, they all use the Robertson and Robertson text. They differ in terms of the depth in which they explore the topic.

To conclude, software development and software engineering are huge subjects. The three modules that are mentioned in this blog can only begin to scratch the surface. Every problem will have a unique set of requirements, and every problem will require different methods. There are two key elements: people and technology. Software is designed by people and used by people. Where there are people, there’s always complexity. Adding technology into the mix adds an additional dimension of complexity.

Resources

The following links take you to some useful OpenLearn resources:

Acknowledgements

Many thanks to Arosha Bandara who spent some time introducing me to some of the key elements of M814. I also extend thanks to Yujin Yu. Both Arosha and Yujin are professors of software engineering. The current chair of M814 is Professor Andrea Zisman, who is also a professor of software engineering. Thanks are also extended to the TM354 module team: Michael Ulman, Richard Walker, Petra Wolf and Andrea Zisman.


Generative AI and the future of the OU

Edited by Christopher Douce, Tuesday, 20 June 2023, 10:24

On 15 June 2023 I attended a computing seminar about generative AI, presented by Michel Wermelinger.

In some ways the title of his seminar is quite provocative. I did feel that his presentation relates to the exploration of a very specific theme, namely, how generative AI can play a role in the future of programming education; a topic which is, of course, being explored by academics and students within the school.

What follows is a brief summary of Michel's talk. As well as sharing a number of really interesting points and accompanying resources, Michel did a lot of screensharing, where he demonstrated what I could only describe as witchcraft.

Generative AI tools

Michel showed us Copilot, which draws on code submitted through GitHub. Copilot is said to use something called OpenAI Codex. The witchcraft bit I mentioned was this: Michel provided a couple of comments in a development environment, which were parsed by the Copilot, which generated readable and understandable Python code. There was no messing about with internet searches or looking through instruction books to figure out how to do something. Copilot offered immediate and direct suggestions.
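
To make this concrete, here is the sort of exchange I mean. This is a reconstruction written by hand for illustration; it is not what Michel typed, and it is not actual Copilot output. The first comment stands in for the prompt, and the function beneath it is the kind of suggestion such a tool might plausibly offer.

```python
# Prompt-style comment: return the n most common words in a text, ignoring case.
import re
from collections import Counter


def most_common_words(text: str, n: int) -> list:
    """Return the n most frequent words in text, most frequent first."""
    # Lower-case the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter.most_common returns (word, count) pairs; keep only the words.
    return [word for word, _ in Counter(words).most_common(n)]


if __name__ == "__main__":
    print(most_common_words("The cat and the hat and the bat", 2))
```

Even with something this small there are decisions hiding in the suggestion that the comment never asked for, such as what counts as a word, which is exactly why the result needs to be read critically.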

Copilot isn’t, of course, the only tool that is out there. There are now a bunch of different types of AI tools, or a taxonomy of tools, which are emerging. There are tools where you pay for access. There are tools that are connected with integrated development environments (IDEs) that are available on the cloud, and there are tools where the AI becomes a pair programmer chatbot. There are other tools, such as learning environments that offer both documentation and the automated assessment of programming assignments.

The big tech companies are getting involved. Amazon has something called CodeWhisperer. Apparently Google has something called AlphaCode, which has participated in competitive programming competitions, leading to a paper in Nature which asks whether ChatGPT and AlphaCode are going to replace programmers. There’s also something called StarCoder, which has also been trained on GitHub sources.

AI can, of course, be used in other ways. It could be used to offer help and support to students who have additional requirements. AI could be used to transcribe lectures, and help students navigate across and through learning materials. The potential of AI being a useful learning companion has been a long-held dream, and one that I can certainly remember from my undergraduate days, which were in the last century.

Implications

An important reflection is that Copilot and all these other AI tools are here to stay. It wouldn’t be appropriate to try to ban them from the classroom since they are already being used, and they already have a purpose. Michel also mentioned there is already a textbook which draws on Generative AI: Learn AI-assisted Python programming.

Irrespective of what these tools are and what they do, everyone still needs to know the fundamentals. Copilot does not replace the need to understand language syntax and semantics and know the principles of algorithmic thinking. Developers and engineers need to know what is meant by thorough testing, how to debug software, and to write helpful documentation. They need to know how to set breakpoints, use command prompts, and also know things about version and configuration management.

An important question to ask is: how do we assess understanding? One approach is an increasing use of technical interviews, which can be used to assess understanding of technical concepts. This won’t mean an academic viva, but instead might mean some practical discussions which both help to assess students’ knowledge, and help them to prepare for the inevitable technical interviews which take place in industry.

New AI tools may have a real impact on not only what is taught but how teaching is carried out, particularly when it comes to higher levels of study. This might mean the reformulation of assignments, perhaps developing less explicit requirements to expose learners to the challenge of working with ambiguity, which students must then intelligently resolve.

Since these tools have the potential to give programmers a performance boost, assignments may become bigger and more substantial. Irrespective of how assignments might change, there is an imperative that students must learn how to critically assess and evaluate whatever code these tools might suggest. It isn’t enough to accept what is suggested; it is important to ask the question: “does the code that I see here make sense, or offer any risks, given what I’m trying to do?”

A term that is new to me is: prompt engineering. This is the need to communicate in a succinct and precise way to an AI to get results that are practical and useful within a particular context. To get useful results, you need to be clear about what you want.

What is the university doing?

To respond to the emergence of these tools the university has set up something called the Generative AI task and finish group. It will be producing some interim guidance for students and will be offering some guidance to staff, which will include the necessity to be clear about ethical and transparent use of AI. It is also said to highlight capabilities and limitations. There will also be guidance for award boards and module results panels. The point here is that Generative AI is being looked at.

Michel suggested the need for a working group within the school; a group to look at what papers are coming out, what the new tools are, and what is happening across the sector at other institutions. A further thought was that it might be useful to widen it out to other schools, such as the School of Physical Sciences, and any others which make use of any aspect of coding and software development.

Reflections

Michel’s presentation was a very quick overview of a set of tools that I knew very little about. It is now pretty clear that I need to know a lot more about them, since there are direct implications for the practice of teaching and learning, implications for the school, and implications for the university. There is a fundamental imperative that must be emphasised: students must be helped to understand that a critical perspective about the use of AI is a necessity.

Although I described Michel’s demonstration of Copilot as witchcraft, all he did was demonstrate a new technology.

When I was a postgraduate student, a lecturer once told me that one of the most fundamental and important concepts in computing was abstraction. When developers are faced with a problem that becomes difficult, they can be said to ‘abstract up’ a level, to get themselves out of trouble, and towards another way of solving a problem. In some senses, AI tools represent a higher level of abstraction; it is another way of viewing things. This doesn’t, of course, solve the problem that code still needs to be written.

I have also heard that one of the fundamental characteristics of a good software developer or engineer is laziness. When a programmer finds a problem that requires solving time and time again, they invariably develop tools to do their work for them. In other words, why write more code than you need to, when you can develop a tool that solves the problem for you?

My view is that both abstraction and laziness are principles that are connected together.

Generative AI tools have the potential to make programmers lazy, but programmers must gain an appreciation about how and why things work. They also need to know how to make decisions about what bits of code to use, and when. 

It takes a lot of effort to become someone who is effective at being lazy.


TM354 Software Engineering: briefing

Edited by Christopher Douce, Monday, 11 Sept 2023, 16:27

On Saturday 27 September I went to a briefing for a new OU module, TM354 Software Engineering.   I have to secretly confess that I was quite looking forward to this event for a number of reasons: I haven’t studied software engineering with the OU (which meant that I was curious), I have good memories of my software engineering classes from my undergraduate days and I also used to do what was loosely called software engineering when I had a job in industry.  A big question that I had was: ‘to what extent is it different to the stuff that I studied as an undergrad?’  The answer was: ‘quite a bit was different, but then again, there was quite a bit that was the same too’.

I remember my old undergrad lecturer introducing software engineering by saying something like, ‘this module covers all the important computer stuff that isn’t in any of the other modules’.   It seemed like an incredibly simple description (and one that is also a bit controversial), but it is one that has stuck in my mind.  In my mind, software engineering is a whole lot more than just being other stuff.

This blog post summary of the event is mostly intended for the tutors who came along to the day, but I hope it might be useful for anyone else who might be interested in either studying or tutoring the module.  There’s information about the module structure, something about the software that we use, and also something about the scheduling of the tutorials.

Module structure

TM354 has three blocks, which are also printed books.  These are: Block 1 – from domain to requirements, Block 2 – from analysis to design, and Block 3 – from architecture to product.  An important aspect to the module is a set of case studies.  The module is also supported by a module website and, interestingly, a software tool called ShareSpace that enables students to share different sketches or designs.  (This is a version of a tool that has been used in other modules such as U101, the undergraduate design module, and T174, an introduction to engineering module).

Block 1 : from domain to requirements

Each block contains a bunch of units.  The first unit is entitled ‘approaches to software development’, which, I believe, draws a distinction between plan driven software development and agile software development.  I’ve also noted down the phrase ‘modelling with software engineering’.  It’s great to see agile mentioned in this block, as well as modelling.  When I worked in industry as a developer, we used bits of both.

The second unit is called requirements concepts.  This covers functional requirements, non-functional (I’m guessing this is things like ‘compatibility with existing systems’ and ‘maintainability’ – but I could be wrong, since I’ve not been through the module materials yet), testing, and what and how to document.  Another note I’ve made is: ‘perspectives on agile documentation’.

Unit three is from domain modelling to requirements.  Apparently this is all about business rules and processes, and capturing requirements with use cases.  Prototyping is also mentioned.  (These are both terms that would be familiar to students who have taken the M364 Interaction Design module).  Unit four is all about the case study (about which I have to confess that I don’t know anything!)

Block 2: from analysis to design

Unit five is about structural modelling of domain versus the solution.  Unit six is about dynamic modelling, which includes design by contract.  Unfortunately, my notes were getting a bit weak at this point, but I seem to remember thinking, ‘ahh… I wonder if this relates to the way that I used to put assertions in my code when I was a coder’.  This introduction was piquing my interest.
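
For anyone who hasn’t come across the idea, here is a minimal sketch of what I mean by putting assertions in code, which is my own loose reading of design by contract rather than anything taken from the module. The `withdraw` function and its rules are made up for illustration: the first two assertions are preconditions (what the caller must guarantee) and the last is a postcondition (what the function promises in return).

```python
def withdraw(balance: float, amount: float) -> float:
    """Return the new balance after taking amount out of balance."""
    # Preconditions: the caller's side of the contract.
    assert amount > 0, "amount must be positive"
    assert amount <= balance, "cannot withdraw more than the balance"

    new_balance = balance - amount

    # Postcondition: the function's side of the contract.
    assert 0 <= new_balance < balance, "balance must fall but stay non-negative"
    return new_balance


if __name__ == "__main__":
    print(withdraw(100.0, 30.0))
```

One caveat: Python removes `assert` statements when it is run with the `-O` flag, so assertions like these document and check a contract during development rather than enforce it in production.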

Unit seven was entitled, ‘more dynamic modelling’, specifically covering states and activities, and capturing complex interactions.  Apparently the black art of ‘state machines’ is also covered in this bit.  (In my undergrad days, state machines were only covered in the completely baffling programming languages course.)  Unit eight then moves onto the second part of the case study which might contain domain modelling, analysis and design.
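
State machines are less of a black art than they first appear. Here is a minimal sketch using the classic coin-operated turnstile, which is a standard textbook example and not one I know to be in the module materials; the names `TRANSITIONS`, `next_state` and `run` are my own.

```python
# Each (current state, event) pair maps to the state that follows it.
TRANSITIONS = {
    ("locked", "coin"): "unlocked",    # paying unlocks the turnstile
    ("locked", "push"): "locked",      # pushing a locked turnstile does nothing
    ("unlocked", "push"): "locked",    # walking through locks it again
    ("unlocked", "coin"): "unlocked",  # an extra coin changes nothing
}


def next_state(state: str, event: str) -> str:
    """Look up the state that follows the given state and event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def run(events, start: str = "locked") -> str:
    """Feed a sequence of events through the machine and return the final state."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    print(run(["coin", "push", "push"]))
```

Keeping the transitions in a table, rather than in nested `if` statements, means the code reads much like the state diagram it was drawn from.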

Block 3: from architecture to product

This block jumped out at me as being the most interesting (but this reflects my own interests).  Unit nine was about ‘architecture, patterns and reuse’.  Architecture and requirements, I’ve noted, ‘go hand in hand’.  In this section there’s something about architectural views and reuse in the small and reuse in the large.  During the briefing there was a discussion about architectural styles, frameworks and software design patterns.

When I was an undergrad, software patterns hadn’t been discovered yet.  It’s great to see them in this module, since they are a really important subject.  I used to tell people that patterns are like sets of abstractions that allow people to talk about software.  I think everyone who is a serious software developer should know something about patterns.

Unit ten seems to take a wider perspective, talking about ‘building blocks and enterprise architectures’.  Other topics include component based development, services and service oriented architectures (which is a topic that is touched upon in another module, and also potentially the forthcoming TM352 module that covers cloud computing).

Unit eleven is about quality, verification, metrics and testing.  My undergrad module contained loads of material on metrics and reliability, and testing was covered only in a fairly theoretical way, but I understand that test-driven development is covered in this module (which is a topic that is linked to agile methods).  I’ll be interested to look at the metrics bit when this bit of the module is finalised.
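
As a reminder of what test-driven development looks like in miniature, here is a small sketch. The leap year example is my own choice, not something from the module. In TDD the test function would be written first and would fail, and then just enough code would be written to make it pass.

```python
def test_is_leap_year():
    """Written first: these checks describe the behaviour we want."""
    assert is_leap_year(2024)       # divisible by 4
    assert not is_leap_year(2023)   # not divisible by 4
    assert not is_leap_year(1900)   # a century year not divisible by 400
    assert is_leap_year(2000)       # a century year divisible by 400


def is_leap_year(year: int) -> bool:
    """Written second: the simplest code that makes the test pass."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


if __name__ == "__main__":
    test_is_leap_year()
    print("all tests passed")
```

In practice a test runner such as `unittest` or `pytest` would find and run the test function; calling it directly keeps the sketch self-contained.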

The final unit takes us back to the case study.  Apparently we look at architectural views and patterns.  Apparently there is also a set of further topics that are looked at.  I’m guessing that students might well have to go digging for papers in the OU’s huge on-line library.

Software

I’ve mentioned ShareSpace, which is all about sharing of software models with other students (modelling is an important skill), to enable students to gain experience of group work and to see what other students are doing and creating: software development invariably happens in teams.  Another important bit of software is an open source IDE (integrated development environment) called NetBeans.  I’m not sure how NetBeans is going to be used in this module, but it is used across a number of different OU modules, so it should be familiar to some TM354 students.

Assessment

TM354 comprises three tutor marked assignments, a formative quiz at the end of every unit (that students are strongly encouraged to complete), and an end of module exam.  The exam comprises two parts: a part that has questions about concepts, and a second bit that contains longer questions (I can’t say any more than this, since I don’t know what the exam looks like!)

Tutorials

Each tutor is required to deliver two hours of face to face tuition, and eight hours of on-line sessions through OU Live (as far as I understand).  In the London region, we have three tutors, so what we’re doing is we’re having all the groups come to the same events and we’re having each tutor deliver a face to face session to support students through every block and every TMA. 

We’re also planning on explicitly scheduling six hours of OU Live time, leaving two hours that the tutor can use at his or her discretion throughout the module (so, if there is a group of students who struggle with concepts such as metrics, design by contract, or patterns, a couple of short ad-hoc sessions can be scheduled).

All the OU Live sessions will be presented through a regional OU Live room.  This means that students in one tutor group can visit a session that is delivered by another London tutor.  The benefit of explicitly scheduling these sessions in advance is that all these events are presented within the student’s module calendar (so they can’t say that they didn’t know about them!)  All these plans are roughly in line with the new tuition strategy policy that is coming from the higher levels of the university.  A final thought regarding the on-line sessions is that it is recommended that tutors record them, so students can listen to the events (and potentially go through subjects that they find difficult) after an event has taken place.

A final note that I’ve made in my notebook is ‘tutorial resources sharing (thread to share)’.  This is connected to a tutor’s forum that all TM354 tutors should have access to.  I think there should be a thread somewhere that is all about the sharing of both on-line and off-line (face to face) tutorial resources.

Christopher Douce

Considering Middleware and Service Oriented Architecture

Visible to anyone in the world
Edited by Christopher Douce, Wednesday, 21 July 2010, 18:25

4815941514_0b0c87dda6_m.jpg

I wrote the following notes some time ago as a way to share information about the subject of middleware and service-oriented architecture.  I think I began by asking the questions 'what is middleware?' and 'what can it do for us?', explicitly in the context of making information systems that can help to support the delivery of useful services to support learning.

I should add a disclaimer: some of the stuff that is presented here is quite technical and seems quite a long way away from my earlier posts that relate to accessibility, but there are connections in terms of understanding how to build information systems that can help an organisation to manage the delivery of accessibility services (such as the loan of assistive technology).

Beginning my search

I began by exploring a number of definitions.  I first attacked the notion of workflow (Wikipedia).  What does workflow mean?  Is it one of those terms that can have different meanings to different people?  I rather like the Wikipedia definition, which goes:

  • A workflow is a reliably repeatable pattern of activity enabled by a systematic organization of resources, defined roles and mass, energy and information flows, into a work process that can be documented and learned. Workflows are always designed to achieve processing intents of some sort, such as physical transformation, service provision, or information processing.

I then asked myself, 'how does the idea of workflow relate to the notion of middleware?' (I had heard they were connected, but wasn't quite sure how).  Again, the Wikipedia definition of middleware proved to be useful:

  • Middleware is the software that sits 'in the middle' between applications ... stretched across multiple systems or applications. ... The software consists of a set of enabling services that allow multiple processes running on one or more machines to interact across a network. This technology evolved to provide for interoperability in support of the move to client/server architecture. It is used most often to support complex, distributed applications. ... Middleware is especially integral to modern information technology based on XML, SOAP, Web services, and service-oriented architecture.

So, these two ideas are connected.  Carrying out workflow may involve making use of a number of different services, which might be called through some sort of middleware...

Further links

A little more digging pointed me in a number of other directions.  Clever people have proposed something called BPEL, an abbreviation for Business Process Execution Language.  Wikipedia is again useful:

  • WS-BPEL (or BPEL for short) is a language for specifying business process behavior based on Web Services. Processes in WS-BPEL export and import functionality by using Web Service interfaces exclusively.

On this page, there is a link to a blog post which is a very good primer and introduction.  It is much clearer than the Wikipedia page.

I found the following text to be useful:

  • In BPEL, a business process is a large-grained stateful service, which executes steps to complete a business goal. That goal can be the completion of a business transaction, or fulfilling the job of a service. The steps in the BPEL process execute activities (represented by BPEL language elements) to accomplish work. Those activities are centered on invoking partner services to perform tasks (their job) and return results back to the process.

Interestingly, it also contained the following:

  • As for limitations, BPEL does not account for humans in a process, so BPEL doesn't provide workflow-there are no concepts for roles, tasks and inboxes.

We are almost at the point where the same terms may be used to mean different things.  Perhaps there is a difference between what workflow is and what business processes are?  Michelson (the blog author) seems to equate workflow with 'things that people do'.  The point is that a wide definition of workflow can include things that BPEL does not.

At this point, I was wondering, 'if I have a process (say, a task that I have to complete), where half of the task has to be completed by a machine and the other half has to be completed by a person, then what technologies should I use?'.  All is not lost.  The blog mentions there is something called  BPEL4People (Wikipedia), and contains a link to an IBM whitepaper.

I've extracted some fragments that caught my eye:

  • The BPEL specification focuses on business processes ... But the spectrum of activities that make up general purpose business processes is much broader. People often participate in the execution of business processes ...

Following this, I stumbled across the following scenario:

  • Consider a service that takes place out-of-sight of the initiating process. In many circumstances, it may be immaterial as to whether the service is performed with or without user interaction, for example, a document translation service.

This made me wonder about my own involvement in the EU4ALL project, which is exploring processes that enable lecturers to order alternative formats, such as tactile maps or other kinds of materials, for instance.

Application Servers

BPEL is represented using something called XML (Wikipedia), which is, of course (more or less) a text file that has lots of structure (created by the enthusiastic use of angled brackets).

BPEL is not the only way to represent or describe business processes (or workflow).  Another approach might be to use something called State Chart XML (SCXML), for instance.   There are probably loads more other data structures or standards you might use.

At this point, you might be asking, "okay, so there are these magic XML data structures that allow you to describe entire processes but how do you make this stuff real so people can use it?".  The answer is to use something called an Application Server (Wikipedia).

Here, I am again lazy and quote from Wikipedia:

  • Application server products typically bundle middleware to enable applications to intercommunicate with dependent applications, like web servers, database management systems ...

Although an application server may be able to run middleware (and potentially sequence the order in which activities are carried out), we need to add interfaces so people can interact with it.

Always being the pragmatist, I asked myself another question, 'all this sounds like good fun, but where can I find one of these application servers that does all this magic stuff to manage our workflow and processes?'  I don't have a precise answer to this question, but I did find something called Apache ODE.

To quote from the project website,

  • Apache ODE (Orchestration Director Engine) executes business processes written following the WS-BPEL standard. It talks to web services, sending and receiving messages, handling data manipulation and error recovery as described by your process definition. It supports both long and short living process executions to orchestrate all the services that are part of your application.

Another distinction (as opposed to long and short running processes) is between processes that require human intervention (or actions) and those that can run on their own, such as executing a database query or sending messages to another part of a large organisation to request the availability of resources.
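To make the distinction between automated steps and steps that need a person a little more concrete, here is a minimal sketch in Python.  It is not BPEL and it is not how Apache ODE works; every name in it (Step, Process, the equipment loan example) is a hypothetical one made up for illustration.  The idea is simply that a process runs its automated steps in order and pauses whenever it reaches a step that a person has to complete.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One step in a process. The names and fields are hypothetical."""
    name: str
    action: Callable[[Dict], None]
    needs_human: bool = False


@dataclass
class Process:
    """A sequence of steps plus the data they share."""
    steps: List[Step]
    position: int = 0
    data: Dict = field(default_factory=dict)

    def run(self) -> str:
        """Run automated steps until the end, or until a person is needed."""
        while self.position < len(self.steps):
            step = self.steps[self.position]
            if step.needs_human:
                return f"waiting for a person: {step.name}"
            step.action(self.data)
            self.position += 1
        return "finished"

    def complete_human_step(self) -> None:
        """Record that a person has carried out the current step."""
        if self.position >= len(self.steps):
            raise ValueError("the process has already finished")
        step = self.steps[self.position]
        if not step.needs_human:
            raise ValueError("the current step is automated")
        step.action(self.data)
        self.position += 1


# Hypothetical steps for the loan of some assistive technology.
def check_stock(data: Dict) -> None:
    data["in_stock"] = True


def approve_loan(data: Dict) -> None:
    data["approved"] = True


def send_confirmation(data: Dict) -> None:
    data["confirmed"] = True


def build_loan_process() -> Process:
    return Process([
        Step("check stock", check_stock),
        Step("approve loan", approve_loan, needs_human=True),
        Step("send confirmation", send_confirmation),
    ])
```

Calling run() on the process returned by build_loan_process() carries out the stock check and then stops at the approval step; once complete_human_step() has been called, a second call to run() sends the confirmation and finishes.  A real engine would also have to store the state of long running processes, which this sketch ignores.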

All this sounds great!  All I have to do now is to find some time to study this stuff further.

Other approaches

Whilst reading all this stuff, the purpose of other products that never made sense to me started to become clear.  A couple of years ago, I had heard something called Biztalk mentioned, but never properly understood what it was.  Again, Wikipedia is useful, describing Biztalk (Wikipedia) as

  • a business process management (BPM) server. Through the use of "adapters" which are tailored to communicate with different software systems used in a large enterprise, it enables companies to automate and integrate business processes.

I've not looked into this very deeply, but it also seems that the House of Microsoft might have concocted something of their own called the Windows Workflow Foundation (Wikipedia) which I understand also connects to the topic of BPEL.

Of course, there's a whole other set of terms and ideas that I haven't even looked at.  These include technologies and ideas such as an enterprise service bus (ESB), message queues, message-oriented middleware (MOM), the list goes on and on...

A summary (of sorts)

The issue of service-oriented architecture design goes a lot deeper than simply creating a set of solitary web services running on different systems.  Designers need to consider how to ensure that messages are received successfully, how to consider or address redundancy and how to measure or ensure performance.  The ultimate choice of architectural components and elements depends very much on your requirements, the boundaries of your organisation, your needs for communication and who you need to communicate with.

What I found surprising was the number of technologies that could be potentially used within the project that I'm currently working on.  The ultimate choice of technologies is likely to boil down to two key issues: 'what do we know about right now', and 'what is the best thing we can do'.

Footnote

I was going to add a footnote to one of the earlier sections, but because my notes have turned into a blog post, I've decided to put it here.

I like this stuff because it reminds me of two areas that always fight with each other: software maintenance and business process re-engineering.  Business practices can change more quickly than software systems.  The need for process flexibility (and abstraction) is one motivation that has driven the development of things like SOA and BPEL.

This stuff is also interesting because workflow is where the world of 'work' and the world of software nearly combine.  There is another dimension: would you like a computer telling you what to do?  Also, no matter how much we try to be comprehensive in our understanding of a particular institution there will always be exceptions.  Any resulting architecture should ideally try to accommodate change efficiently.

Middleware (in some senses) can have a role to play in terms of gathering information about the performance of services, i.e. how long it takes for certain actions to be carried out in response to a certain kind of request, and has the potential to manage the delivery of interventions (such as issue escalation to supervisors) should service quality be at risk.
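As a rough illustration of this last point, the following Python sketch wraps a service call, records how long each request takes, and calls an escalation hook when a request is slower than an agreed limit.  The class name, the threshold and the escalation callback are all assumptions made for this example; a real piece of middleware would do a great deal more.

```python
import time
from typing import Any, Callable, List


class TimingMiddleware:
    """Wraps a service call, records its duration and flags slow calls.

    The class name, the limit and the escalation hook are hypothetical.
    """

    def __init__(self, service: Callable[..., Any], max_seconds: float,
                 escalate: Callable[[str], None]) -> None:
        self.service = service
        self.max_seconds = max_seconds
        self.escalate = escalate
        self.durations: List[float] = []

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            # Pass the request straight through to the wrapped service.
            return self.service(*args, **kwargs)
        finally:
            # Record the timing even if the service raised an exception.
            elapsed = time.perf_counter() - start
            self.durations.append(elapsed)
            if elapsed > self.max_seconds:
                self.escalate(
                    f"request took {elapsed:.3f}s, limit is {self.max_seconds}s"
                )
```

The escalate argument could be anything that accepts a message: appending to a list during testing, or sending an email to a supervisor in a real deployment.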

Acknowledgements

Blog image is a modified version of the one found on the Wikipedia SOA page.  I also cheekily consulted an O'Reilly book when I was preparing an earlier version of these notes, but I've long since returned it to the library (and I can't remember its title).

Christopher Douce

Green Code

Visible to anyone in the world
Edited by Christopher Douce, Wednesday, 23 Oct 2024, 20:48

It takes far too long for my desktop PC to finish booting up every morning.  From the moment I throw the power switch of my aging XP machine to the on position and click on my user name, I have enough time to walk to the kitchen, brew a cup of tea, do some washing and tidying up and drink half my cup of tea (or coffee), before I can begin to load all the other applications that I need to open before settling down to do some work.

I would say it takes over fifteen minutes from the point of power up to being able to do some 'real stuff'.  All this hanging around inevitably sucks up quite a bit of needless energy.  Even though I do have some additional software services installed, such as a database and a peer-to-peer TV application, I don't think my PC is too underpowered (it's a single core running just over a gigahertz with half a gig of memory).

Being of a particular age, I have fond memories of the time when you turned on a computer and the operating system (albeit a much simpler one) was almost instantly available. Ignoring the need to load software from cassettes or big floppy disks, you could start to issue commands and do useful stuff within seconds of powering up.

This is one of the reasons why I like my EEE netbook (Wikipedia): if I have an idea for something to write or want to talk to someone or find something out, then I can turn it on and within a minute or two it is ready for use. (As an aside, I remember reading in Insanely Great by Steven Levy (Amazon) the issue of boot up time was an important consideration when designing the original Macintosh).

Green Code

These musings make me wonder about the notion of 'green code': computer software that is designed in such a way that it supports necessary functionality by demanding a minimal amount of processor or memory resources. Needless to say, this is by no means an original idea. It seems that other people are thinking along similar lines.

In a post entitled, Your bad code is killing my planet, Alistair Croll writes, 'Once upon a time, lousy coding didn't matter. Coder Joel and I could write the same app, and while mine might have consumed 50 percent of the machine's CPU whereas his could have consumed a mere 10 percent, this wasn't a big deal. We both paid for our computer, rackspace, bandwidth, and power.'

Croll mentions that software is often designed in terms of multiple levels of abstraction. He states that there can be a lot of 'distance and computing overhead between my code and the electricity of each processor cycle'. He goes on to write, 'Architecture choices, and even programming language, matter'. Software architecture choices do matter and abstractions are important.

Green Maintenance

Making code that is efficient is only part of the story. Abstractions allow us to hide complexity. They help developers to compartmentalise and manage the 'raw thought stuff' which is computer code. Well designed abstractions can give software developers who are charged with working with and maintaining existing systems a real productivity boost.

Code that is easier to read and work with is likely to be easier to maintain. Maintenance is important since some researchers report that maintenance accounts for up to 70% of the costs of a software project.

In my opinion, clean code equals green code. Green code is code that should be easy to understand, maintain and adapt.

Green Challenges

Croll, however, does have a point. Software engineers need to be aware of the effect that certain architectural choices may have on final system performance.

In times when IT budgets may begin to be challenged (even though IT may be perceived as technology that can help to create business and information process efficiencies), the request for an ever more powerful server may be frowned upon by those who hold the budgetary purse strings. You may be asked to do more with less.

This challenge exposes a fundamental computing dilemma: code that is as efficient as it could be may be difficult to understand and work with. Developers have to consider such challenges carefully and walk a careful path of compromise. Just as there is an eternal trade-off between the speed of a system and how much power is consumed, there are also difficult trade-offs to consider in terms of efficiency and clarity, along with the dimensions of system flexibility and functionality.

One of the reasons why Microsoft Vista is not considered to be popular is the issue of how resource hungry it is in terms of memory, processor speed and disk drive space. Microsoft, it seems, is certainly aware of this issue (InfoWorld).

Turning off some of the needless eye candy, such as neatly shaded three dimensional buttons, can help you to get more life out of your PC. This is something that Ted Samson mentions, before edging towards discussing the related point of power management.

Ted also mentions one of those well known laws of computing. He writes, 'just because there are machines out there that can support enormous system requirements doesn't mean you have to make your software swell to that footprint'. In other words, 'your processor and disk space needs expand to the size of your machine' (another way of saying 'your project expands to the amount of time you have available'!)

Power Budgets

Whilst I appreciate my EEE PC in terms of its quick boot up time, it does have an uncomfortable side effect: it also acts as a very effective lap warmer. Even more surprisingly, its batteries are entirely depleted within slightly over two hours of usage. A mobile device should not be tethered to a mains power supply. It also makes me wonder about whether its incessant demand for power is going to cut short the life of its batteries (which represent their own environmental challenge).

When working alongside electrical engineers, I would occasionally overhear them discussing power budgets, i.e. how much power would be consumed by components of a larger electrical system. In terms of software, both laptop and desktop PCs offer a range of mysterious software interfaces that provide 'power management' functionality. This is something that I have not substantially explored or studied. For me, this is an area of modern PCs that remains a perpetual mystery. It is certainly something that I need to do something about.

Sometimes, the collaboration between software developers and hardware engineers can yield astonishing results. I again point towards the One Laptop per Child project. I remember reading some on-line discussions that described changes that were made to the Linux operating system kernel to make the OLPC device more power efficient. A quick search throws up an Environmental Impact page.

The OLPC device, whether you agree with the objective of the OLPC project or not, has had a significant impact on the design of laptop systems. A second version of the device raises the possibility of netbooks using the energy efficient ARM processor (Wikipedia) - the same processor that is used (as far as I understand) in the iPhone and iPod. I, for one, look forward to using a netbook that doesn't unbearably heat up my lap and allows me to do useful work without needlessly wasting time searching for power sockets.

My desktop computer (which was assembled by my own fair hands) produces a side effect that is undeniably useful during the winter months: it perceptibly heats up my room almost allowing me to totally dispense with other forms of heating completely (but I must add that a chunky jumper is often necessary). When I told someone else about this phenomenon, I was asked, 'big computer or small room?' The answer was, inevitably, 'small room' (and small computer).

Google

On a related note, I was recently sent a link to a YouTube video entitled Google container data centre tour. It was astonishing (and very interesting!) It was astonishing due to the sheer scale of the installation that was presented, and interesting in terms of the industrial processes and engineering that were described. It reminded me of a news item that was featured in the media earlier this year that related to the carbon cost of carrying out a Google search.

The sad thing about the Google data centre (and, of course, most power plants) is that most of the heat that is generated is wasted. I recently came across this article, entitled Telehouse to heat homes at Docklands. Apparently there are other schemes to use data centres for different kinds of heating.

Before leaving Google alone, you might have heard of a site called Blackle. Blackle takes the Google homepage and inverts it. The argument is that if everyone uses a black search page, large power savings can be made.

Mark Ontkush describes the story of Black Google and others in a very interesting blog post which also mentions other useful ideas, such as the use of Firefox extensions. Cuil is another search engine (pronounced 'cool') that embodies the same idea.

Carbon Cost of Spam

I recently noticed a news item entitled Spam e-mails killing the environment (ITWorld). Despite the headline having a passing resemblance to headlines that you would find on the Daily Mail, I felt the article was worth a look. It references a more sensibly titled report, The carbon footprint of email spam, published by McAfee.

The report is interesting, pointing towards the fact that we may spend a lot of time both reading and processing junk emails that end up in our inbox. The article has an obvious agenda: to sell spam filters. An effective spam filter, it is argued, can reduce the amount of time that email users spend processing spam, thus helping to save the planet (a bit). Spam can fill up email servers, causing network administrators to use bigger disks. To be effective, email servers need to spend time (and energy) filtering through all the messages that are received. I do sense that more research is required.

Invisible Infrastructures

There is a further connection between the challenge of spam and the invisible infrastructure of the internet. Messages to your PC, laptop or mobile device pass through a range of mysterious switches, routers and servers. At each stage, energy is mysteriously consumed and paid for by an invisible set of financial transactions.

My own PC, I should add, is not as power friendly as it could be. It contains two hard disk drives: a main drive that contains the operating system, and a secondary drive that contains backup files and also a 'swap' area. The main reason for the second drive is to gain a performance boost.

Lower power PCs

After asking the question, 'how might I create an energy efficient PC', I discovered an interesting article from Ars Technica entitled It's easy being green. It describes each of the components of a PC and considers how much power they can draw. The final page features a potential PC setup in the form of 'an extreme green box'.

It is, however, possible to go further. The Coding Horror blog presents one approach: use kit that was intended for embedded systems - a domain where power consumption is high on the design agenda. An article, entitled Building Tiny, Ultra Low Power PCs is a fun read.

Both articles are certainly worth a view. One other cost that should be considered, however, is the cost of manufacturing (and also recycling) your existing machine. I don't expect to change my PC until the second service pack for Windows 7 is released. It's going to be warming my room for quite some time, but perhaps there are carbon consumption stats relating to PC manufacture and disposal out there somewhere that may help me to make a decision.

Concluding thoughts

Servers undeniably cost a lot of money not only in terms of their initial purchase price, but also in terms of how much energy they consume over their lifetime.

Efficient software has the potential to reduce server count, allowing more to be achieved with less. Developers should aspire to write code that is as efficient as possible, and take careful account of the underlying software infrastructures (and abstractions) that they use. At the heart of every software development lies a range of challenging compromises. It often takes a combination of experience and insight to figure out the best solution, but it is important to take account of change since the majority of the time spent on any software system is likely to be during the maintenance phase of a software project.

The key to computing energy reduction doesn't only rest with computer scientists, hardware designers and software engineers. There are wider social and organisational issues at play, as Samson hints at in an article entitled No good excuses not to power down PCs. The Open University has a two page OU Green computing guide that makes a number of similar points.

One useful idea is to quantify computer power in terms of megahertz per milliwatt (MPMs) instead of millions of instructions per second (MIPS) - I should add that this isn't my idea and I can't remember where it came from. It might be useful to try to establish a new aspirational computing 'law'. Instead of constantly citing Moore's law, which states that the number of transistors on a chip doubles roughly every two years, perhaps we need to edge towards trying to propose a law that proposes a reduction in power consumption whilst maintaining 'transactional performance'. In this era of multi-core multi-function processors, this is likely to be a tough call, but I think it's worth a try.

One other challenge is whether it might be possible to crystallise what is meant by 'green code', and whether we can explore what it means by constructing good or bad examples. The good examples will run on low powered slower hardware, whereas the bad examples are likely to be sluggish and unresponsive. Polling (constantly checking to see whether something has changed) is obviously bad. Ugly, inelegant, poorly structured and hard to read (and hard to change) code could also be placed in a box named 'bad'.
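As a first attempt at a good and a bad example, here is a small Python sketch that contrasts polling with waiting to be told.  The function names are made up for illustration, and the sketch says nothing about how much energy either approach actually uses; it only shows that the polling version keeps waking up to do checks that achieve nothing, whereas the event-based version stays blocked until something has changed.

```python
import threading
import time


def wait_by_polling(flag: dict, interval: float = 0.01) -> int:
    """Keep waking up to check a flag. Returns the number of wasted checks."""
    checks = 0
    while not flag["ready"]:
        checks += 1
        time.sleep(interval)
    return checks


def wait_by_event(event: threading.Event) -> None:
    """Block until signalled; the thread does no checking while it waits."""
    event.wait()


if __name__ == "__main__":
    # Bad example: something else sets the flag after a short delay.
    flag = {"ready": False}
    threading.Timer(0.2, lambda: flag.update(ready=True)).start()
    print("wasted checks while polling:", wait_by_polling(flag))

    # Better example: the waiting thread is woken only when there is news.
    event = threading.Event()
    threading.Timer(0.2, event.set).start()
    wait_by_event(event)
    print("woken once by the event")
```

The number of wasted checks will vary from machine to machine, which is rather the point: the cost of polling depends on how long you happen to wait, not on how much useful work there is to do.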

A final challenge lies with whether it might be possible to explore what might be contained within a sub-discipline of 'green computing'. It would be interesting to see where this might take us.

Christopher Douce

Source code accessibility through audio streams

Visible to anyone in the world
Edited by Christopher Douce, Wednesday, 28 June 2023, 10:28

A screenshot of some source code being edited by a software developer

One of my colleagues volunteers for the Open University audio recording project.  The audio recording project takes course material produced by course teams and makes audio (spoken) equivalents for people with visual impairments.  Another project that is currently underway is the digital audio project which aims to potentially take advantage of advances in technology, mobile devices and international standards.

Some weeks ago, my colleague tweeted along the lines of 'it must be difficult for people with visual disabilities to learn how computer programs are written and structured' (I am heavily paraphrasing, of course!)  As soon as I read this tweet I began to think about two questions.  The first question was: how do I go about learning how a fragment of source code works? and secondly, what might be the best way to convert a function or a 'slice' of programming code into an audio representation that helps people to understand what it does and how it is structured?

Learning from source code

How do I learn how a fragment of source code works?  More often than not I view code through an integrated development environment, using it to navigate through the function (or functions) that I have to learn.  If I am faced with some code that is really puzzling I might reach for some search tools to uncover the connections between different parts of the system.

If the part of the code that I am looking at is quite small and extremely puzzling, I might go as far as to grab a pen and paper and begin to sketch out some notes, taking down some of the expressions that may appear to be troubling and maybe split these apart into their constituent components.  I might even try to run the various code fragments by hand.  If I get really confused I might use the 'immediate' window of my development environment to ask my computer to give me some hints about the code I am currently examining.

When trying to understand some new source code my general approach is to try to have a 'conversation' with it, asking it questions and looking at it from a number of different perspectives.  In the psychology of programming literature some researchers have written about developers using 'top down' and 'bottom up' strategies.  You might have a high level hypothesis about what something does on one hand, but on the other, sections of code might help you to understand the 'bigger picture' or the intentions behind a software system.

In essence, I think understanding software is a really hard task.  It is harder and more challenging than many people seem to imagine.  Not only do you have to understand the language that is used to describe a world, but you also have to understand the language of the world that is described.  The world of the machine and the world of the problem are intrinsically and intimately connected through what can sometimes seem an abstract collection of words and symbols.  Your task, as a developer, is to make sense of two hidden worlds.

I digress slightly... If learning about computer programming code is a hard task, then it is likely to be harder still for people with visual impairments.  I cannot imagine how difficult it must be to be presented with a small computer program or a function that has been read out to you.  Much of the 'secondary notation', such as tabbing and white space, can be easily lost if there are no mechanisms to enable it to be presented through another modality.  There is also the danger that your working memory may become quickly overwhelmed with names of identifiers and unfamiliar sounding functions.

Assistive technology for everyone

The tasks of learning the fundamentals of programming and learning about a program are different, yet related.  I have heard it said that people with disabilities are given real help if technologies are created that are useful for a wide audience.  A great example of this is optical character recognition.  Whilst OCR technology can save a great deal of the cost of typing, it has also created tools that enable people with low vision to scan and read their post.

Bearing the notion of 'a widely applicable technology' in mind, could it be possible to create a system that creates an interactive audio description that could potentially help with the teaching of some of the concepts of computer programming for all learners?

Whenever I read code I immediately begin to translate the notion of code into my own 'internal' notation (using different types of memory, both internal and external - such as scraps of paper!) to iteratively internalise and make sense of what I am being presented with.  Perhaps equivalents of programming code could be created in a form that could be navigated.  Code is not something that you read in a linear fashion - code is something you work with.

If an interesting and useful (and interactive) audio equivalent of programming code could be created, then there might be the potential for these alternative forms to be useful to all students, not only to learners who necessarily require auditory equivalents.

Development directions

There are a number of tools that could help us to create what might amount to 'interactive audio descriptions of programming code'.  The first is the idea of plan or schema theory (wikipedia) – the notion that your understanding of something is drawn from previous experience.  Some theorists from the Psychology of Programming have extended and drawn upon these ideas, positing ideas such as beacons (key lines of code).

Another is Green's Cognitive Dimensions framework (wikipedia).  Another area to consider looking at is the interesting sub-field of Computer Science Education research.  There must be other tools, frameworks and ideas that can be drawn upon.

Have you got a sec?

Another approach that I sometimes take when trying to understand something is to ask other more experienced people for help.  I might ask the question, 'what does this section represent?' or, 'what does this section do?'  The answers from colleagues can be instrumental in helping me to understand the purpose behind fragments of programming code.

Considering browsing

I can almost imagine what could be an audio code browser that has some functionality that allows you to change between different levels of abstraction.  At one level, you may be able to navigate through sets of different functions and hear descriptions of what they are intended to do and hope to receive by way of parameters (which could be provided through comments).  On another level there may be summaries of groups of instructions, like loops, with descriptions that might sound like, 'a foreach loop that contains four other statements and a call to two functions'.  Finally, you may be able to tab into a group of statements to learn about what variables are manipulated, and how.
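To make the middle level of abstraction a little more concrete, here is a minimal sketch of how a loop summary might be generated.  It is only an illustration: it uses Python's standard ast module on Python source (not PHP), and the function name describe_loops and the sample code are my own assumptions rather than part of any existing tool.

```python
import ast


def describe_loops(source):
    """Return one spoken-style summary for each loop found in the source."""
    summaries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.While)):
            kind = "for" if isinstance(node, ast.For) else "while"
            statements = len(node.body)
            # Count every function call anywhere inside the loop body.
            calls = sum(
                isinstance(inner, ast.Call)
                for statement in node.body
                for inner in ast.walk(statement)
            )
            summaries.append(
                f"a {kind} loop that contains {statements} statements "
                f"and {calls} function calls"
            )
    return summaries


# Hypothetical sample code to be 'read out'.
SAMPLE = """
for item in basket:
    total = total + price(item)
    count = count + 1
    log(item)
"""

for sentence in describe_loops(SAMPLE):
    print(sentence)
```

The sentences produced could then be handed to a screen reader or a speech synthesiser; deciding when to summarise and when to read the statements themselves is the harder design question.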

Of course this is all very technical stuff, and it could be stuff that has already been explored before.  If you know of similar (or related) work, please feel free to drop me a line!

Acknowledgement: random image of code by elliotcable, licensed under creative commons, discovered using Flickr.

Christopher Douce

Exploring Moodle forums

Edited by Christopher Douce, Wednesday, 21 July 2010, 18:08

A set of spanners loosely referring to moodle tools and debugging utilities

Following on from the previous post, this post describes my adventures into the Moodle forums source code.

Forums, I understand, can be activities (a Moodle term) that can be presented within individual weeks or topics. I also know that forums can be presented through blocks (which can be presented on the left or right hand side of course areas).

To begin, and remembering the success that I had when trying to understand how blocks work, I start by looking at what the database can tell me and quickly discover quite a substantial number of tables.  These are named: forum (obviously), forum_discussions, forum_posts, forum_queue, forum_ratings (ratings is not something that I have used within the version of Moodle that I am familiar with), forum_read, forum_descriptions, forum_subscriptions and forum_track_prefs.

First steps

Knowing what some of the data tables are called, I put aside my desire to excitedly eyeball source code and sensibly try to find some documentation.

I begin by having a look at the database schema introduction page (Moodledocs), but find nothing that immediately helps.  I then discover an end user doc page that describes the forum module (and the different types of forum that are on offer in Moodle).  I then uncover a whole forum documentation category (Moodledocs) and I'm immediately assaulted by my own lack of understanding of the capabilities system (which I'll hopefully blog about at some point in the future – one page that I'll take note of here is the forum permissions page).

From the forums category page I click on the various 'forum view pages', which hint that there are some strong connections with user settings.

Up to this point, what have I learnt?

I have learnt that Moodle permits only certain users to carry out certain actions on Moodle forums.  I have also learnt that Moodle forums have different types.  These, I am led to believe (according to this documentation page) are: standard, single discussion, each person posts one discussion, and question and answer.  I'm impressed:  I wasn't expecting so much functionality!

So, can we discover any parallels with the database structures?

The forum table contains fields which are named: course, type, name, description, followed by a whole bunch of other fields I don't really understand.  The course field associates a forum with a course (I'm assuming that somewhere in the database there will be some data that connects the forum to a particular part or section of a course).  The type field (which is, interestingly, an enumerated type) can hold data values that roughly represent the forum types that were mentioned earlier.

A brief look at the code

I remember that the documentation that I uncovered told me that 'forums' was a module.  In the 'mod' directory I notice a file called view.php.  Other interesting files are named: post.php, lib.php, search.php and discuss.php.  View.php seems to be one big script which contains a big case statement in the middle.  Post.php looks similar, but has a beguiling sister called post_form which happens to be a class.  Lib, I discover, is a file of mystery that contains functions and fragments of SQL and HTML.  Half of the search file seems to retrieve input parameters, and discuss is commented as, 'displays a post, and all the posts below it'.

Creating test data

To learn more about the data structures I decide to create some test data by creating a forum and making a couple of posts.  I open up an imaginatively titled course called 'test' and add an equally imaginatively titled forum called 'test forum'.  When creating the forum I'm asked to specify a forum type (the options are: single simple discussion, Q and A forum, standard forum for general use).  I choose the standard forum and choose the default values for aggregate type and time period for blocking.  The aggregate type appears to be related to functionality that allows students to grade or rate posts.

When the forum is live, I then make a forum post to my test forum that has the title 'test post'.

Reviewing the database

The action of creating a new forum appears to have created a record in the forum table which is associated to a particular course, using the course id.  The act of adding a post to the test forum has added data to forum_discussions, where the name field corresponds to the title of my thread: 'test post'.  A link is made with the forum table through a foreign key, and a primary key keeps track of all the discussions held by Moodle.

The forum_posts table also contains data.  This table stores the text that is associated with a particular post.  There is a link to the discussion table through a discussion id number.  Other tables that I looked at included forum_queue (not quite sure what this is all about yet), forum_ratings (which probably stores stuff depending on your forum settings), and forum_read, which simply stores an association between user id, forum id, discussion id and post id.

One interesting thing about forums is that they can have a recursive structure (you can send a reply to a reply to a reply and so on).  To gain more insight into how this works, I send a reply to myself which has the imaginative content, 'this is a test post 2'.

Unexpectedly, no changes are made to the forum_discussions table, but a new entry is added to the forum_posts table.  To indicate hierarchy a 'parent' field is populated (where the parent relates to an earlier entry within the forum_posts table).  I'm assuming that the sequence of posts is represented by the 'created' field which stores a numerical representation of the time.
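The way a 'parent' field can represent a reply to a reply can be sketched with a tiny in-memory database.  This is not the real Moodle schema: the column names other than name, parent and created, the use of 0 for 'no parent', the timestamps and the message text are all assumptions made for the purposes of illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forum_discussions (id INTEGER PRIMARY KEY, forum INTEGER, name TEXT);
CREATE TABLE forum_posts (
    id INTEGER PRIMARY KEY,
    discussion INTEGER,   -- link to forum_discussions.id
    parent INTEGER,       -- id of the post being replied to (0 assumed for none)
    created INTEGER,      -- numerical representation of the time
    message TEXT
);
""")
conn.execute("INSERT INTO forum_discussions VALUES (1, 7, 'test post')")
conn.executemany(
    "INSERT INTO forum_posts VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 0, 1000, "first message"),
        (2, 1, 1, 1060, "this is a test post 2"),
        (3, 1, 2, 1120, "a reply to the reply"),
    ],
)


def thread(conn, discussion, parent=0, depth=0):
    """Return the posts of a discussion as indented lines, replies under parents."""
    rows = conn.execute(
        "SELECT id, message FROM forum_posts "
        "WHERE discussion = ? AND parent = ? ORDER BY created",
        (discussion, parent),
    ).fetchall()
    lines = []
    for post_id, message in rows:
        lines.append("  " * depth + message)
        lines.extend(thread(conn, discussion, post_id, depth + 1))
    return lines


print("\n".join(thread(conn, 1)))
```

Adding a reply only ever touches forum_posts, which matches what I saw: the discussion record stays as it is and the hierarchy lives entirely in the parent field.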

Tracing the execution flow

These experiments have given me three questions to explore:

  1. What happens within the world of Moodle code when the user creates a new forum?
  2. What happens when a user adds a new discussion to a forum?
  3. What happens when a user posts a reply?

Creating a new forum

Creating a new forum means adding an activity.  To learn about what code is called when a forum is added, I click on 'add forum' and capture the URL.  I then give my debugger the same parameters that are passed (id, section, sesskey and add) and then begin to step through the course/mod.php script.  The id number seems to relate to the id of the course, and the add parameter seems to specify the type of the activity or resource that is to be added.

I quickly discover a redirect to a script called modedit.php, with the parameters add=forum, type= (empty), course=4, section=1, return=0.  To further understand what is going on, I stop my debugger and start modedit.php with these parameters.
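As an aside, captured parameters like these are easy to pull apart and put back together again when you want to replay a request.  The following is a small sketch using Python's standard urllib.parse module; the helper name query_params is my own and has nothing to do with Moodle.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The redirect target, written out with the parameters listed above.
redirect = "modedit.php?add=forum&type=&course=4&section=1&return=0"


def query_params(url):
    """Return the query string of a URL as a simple name-to-value dictionary."""
    query = urlsplit(url).query
    # keep_blank_values preserves the empty 'type' parameter.
    parsed = parse_qs(query, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}


params = query_params(redirect)
print(params)

# Rebuilding the query string gives the same request back again.
rebuilt = "modedit.php?" + urlencode(params)
print(rebuilt)
```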

There is a call to the database to check the validity of the course parameter, the fetching of a course instance, something about the capability, and the fetching of an object that corresponds to a course section (a call to get_course_section in the course/lib code).  Data items are added to a $form variable (which my debugger tells me is a global).  There is then the instantiation of a class called mod_forum_mod_form (which is defined within mod/forum/mod_form.php).  The definition function within mod_forum_mod_form defines how the forum add or modification form will be set out.  There is then a connection between the data held within $form and the form class that stores information about what will be presented to the user.

After the forum editing interface is displayed, clicking the 'save and return to course' button (for example) causes a postback to the same script, modedit.php.  Further probing around reveals a call to forum_add_instance within forum/lib.php (different activities will have different versions of this function) and forum_update_instance.  At the end of the button clicking operation there is then a redirect to a script that shows any changes that have been made.

The code to add a forum to a course will be similar (in operation) to the code used to add other activities.  What is interesting is that I have uncovered the classes and script files that relate to the user interface forms that are presented to the user.

Adding a new discussion

A new discussion can be added by clicking on the 'Add a new discussion topic' button once you are within a forum.  The action of clicking on this button is connected to the forum/post.php script.  The main parameter associated with this action is the forum number (forum=7, for example).

It's important to note the use of the class mod_forum_post_form, contained within post_form.php, which represents the structure of the form that the user enters discussion information into.

The code checks the forum id and then finds out which course it relates to.  It then creates the form class (followed by some further magic code that I quickly stepped through).

The action of clicking on the 'post to forum' button appears to send a postback (along with all of the contents of the form) to post.php (the same script used to create the form).  When this occurs, a message is displayed and then a redirect occurs to the forum view summary.  But where in the code is the database updated?  One way to find out is to begin with a search for the redirect.  Whilst browsing through the code I stumble across a comment that says 'adding a new discussion'.  The database appears to be updated through a call to forum_add_discussion.

Posting a reply to a discussion

The post.php script is also used to save replies to discussions (as well as adding new discussions) to the database.  When a user clicks on a discussion (from a list of discussions created by discuss.php), the links to send replies are represented by calls to post.php with a reply parameter (along with a post number, i.e. post.php?reply=4).  The action of clicking on this link presents the previous message, along with the form where the user can enter a response.

Screen grab of user sending a reply to a forum discussion

To learn more about how this code works, I browse through the forums lib file and uncover a function called forum_add_new_post.  I then search for this in post.php and discover a portion of code that handles the postback from the HTML form.  I don't explore any further, having learnt (quite roughly) where various pieces of code magic seem to lie.

Summary

The post.php script does loads of stuff.  It weighs in at around seven hundred lines in length and contains some huge conditional statements.

Not only does post appear to manage the adding of new discussions to a forum but it also appears to manage the adding, editing and deletion of forum messages.  To learn about how this script is structured I haven't been able to look at function definitions (because it doesn't contain any) but instead I have had to read comments.  Comments, it has been said, can lie, whereas code always tells the truth.  More functions would have helped me to more quickly learn the structure of the post.php script.
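To show what I mean by 'more functions', here is a minimal sketch of the same kind of script with one named function for each job and a small dispatcher in place of the huge conditional.  It is written in Python rather than PHP, and it is not how post.php is actually organised: the forum and reply parameters come from the URLs I captured, whereas the edit and delete parameter names and all of the function names are assumptions.

```python
def add_discussion(params):
    return f"add a new discussion to forum {params['forum']}"


def add_reply(params):
    return f"add a reply to post {params['reply']}"


def edit_post(params):
    return f"edit post {params['edit']}"


def delete_post(params):
    return f"delete post {params['delete']}"


# Each request parameter is paired with the one function that handles it.
HANDLERS = (
    ("forum", add_discussion),
    ("reply", add_reply),
    ("edit", edit_post),      # parameter name assumed
    ("delete", delete_post),  # parameter name assumed
)


def dispatch(params):
    """Call the handler for the first recognised action parameter."""
    for name, handler in HANDLERS:
        if name in params:
            return handler(params)
    raise ValueError("no recognised action parameter")


print(dispatch({"forum": "7"}))
print(dispatch({"reply": "4"}))
```

With this shape the list of function definitions becomes a table of contents for the script, which is exactly what I was missing.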

The creation of the user interfaces is partially delegated to the mod and post form classes.  Database updates are performed through the forum/lib.php file.  I like some of the function abstractions that are beginning to emerge but any programming file that contains both HTML and SQL indicates there is more work to be done.  The reason for this aesthetic (and personal) opinion is simple: keeping these two types of code separate has the potential to help developers to become quickly familiar with where certain types of software actions are performed.  This, in turn, has the potential to save developer time.
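A small sketch of the kind of separation I have in mind is given below: one function that only knows about SQL and another that only knows about HTML.  It uses Python and an in-memory SQLite database rather than Moodle's own database functions, and the subject column and both function names are assumptions made for the example.

```python
import sqlite3
from html import escape


def fetch_post_subjects(conn, discussion):
    """Data access only: return the subjects of a discussion, oldest first."""
    rows = conn.execute(
        "SELECT subject FROM forum_posts WHERE discussion = ? ORDER BY created",
        (discussion,),
    )
    return [row[0] for row in rows]


def render_subject_list(subjects):
    """Presentation only: turn a list of subjects into an HTML list."""
    items = "".join(f"<li>{escape(subject)}</li>" for subject in subjects)
    return f"<ul>{items}</ul>"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forum_posts "
    "(id INTEGER PRIMARY KEY, discussion INTEGER, created INTEGER, subject TEXT)"
)
conn.executemany(
    "INSERT INTO forum_posts VALUES (?, ?, ?, ?)",
    [(1, 1, 1000, "test post"), (2, 1, 1060, "Re: <test>")],
)

print(render_subject_list(fetch_post_subjects(conn, 1)))
```

A developer looking for a query knows to look in the first kind of function; a developer chasing a layout problem knows to look in the second.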

One of the central areas of functionality that forum developers need to understand is how Moodle works and uses forms.  This remains an area of mystery to me, and one that I hope to continue to learn about.  Another area that I might explore is how PHP has been used to implement different forum systems so I can begin to get a sense of how PHP is written by different groups of developers.

Acknowledgements: Photograph licensed under creative commons by ciaron, liberated from Flickr.

Christopher Douce

Forums 2.0

Edited by Christopher Douce, Tuesday, 20 May 2014, 09:52

I like forums, I use them a lot.  I can barely remember when I didn’t know what one was.  I think my first exposure to forums might have been through a dial-up bulletin board system (used in the dark ages before the internet, of course).  This was followed by a brief flirtation with usenet news groups.

When trying to solve some programming problems, I more often than not would search for a couple of keywords and then stumble across a multitude of different forums where tips, tricks and techniques might be debated and explored.  A couple of years ago I was then introduced to the world of FirstClass forums (wikipedia) and then, more recently, to Moodle forums.  Discussions with colleagues have since led me towards the notion of e-tivities.

I have a confession to make: I use my email account for a whole manner of different things.  One of the things that I incidentally use my email account for is sending and receiving email!  I occasionally use email as a glorified ‘todo’ list (albeit one that has around a thousand items!)  If something comes in that is interesting and needs attention, I might sometimes click on an ‘urgent’ tick box so that I remember to look at the message again at a totally unspecified time in the future.  If it is something that must be bounded by time, I might drag the item into my calendar and ask my e-mail client to remind me about it at a specified time in the future (I usually ponder over this for around half a minute before choosing one of two options: remind me in a week’s time, or remind me in a fortnight).

I have created a number of folders within my email client where I can store interesting stuff (which I very often subsequently totally forget about).  Sometimes, when working on a task, I might draft out some notes using my email editor and then store them in a vaguely titled folder.

The ‘saving of draft’ email doesn’t only become something that is useful to have when the door knocks or the telephone rings: email, to me, has gradually become an idea and file storage (and categorisation) tool that has become an integral part of how I work and communicate.  I think I have heard it said that e-mail is the internet’s killer application (wikipedia).  For me, it is a combined word processor, associative filing cabinet, ideas processor and general communications utility.

Returning to the topic of forums… Forums are great, but they are very often nothing like email.  I can’t often click and drag forum messages from one location into a folder or to a different part of the screen.  I can’t add my own comments to other people’s posts that only I can see (using my mail client I can save copies of email that other people send me).  On some forum systems I can’t sort the messages using different criteria, or even search for keywords or phrases that I know were used at some point.

My forum related gripes continue: I cannot delete (or at least hide) the forum messages that I don’t want to see any more.  On occasions I want to change the ‘read status’ from ‘read’ to ‘unread’ if I think that a particular subject that is being discussed might be useful to remember when I later turn to an assessment that I have to submit.  I might also like to take fragments of different threads and group them together in a ‘quotation set’, building a mini forum centric e-portfolio of interesting ideas (this said, I can always copy and paste to email!)  If a forum were like a piece of paper where you could draw things at any point I might want to put some threads on the left of the page (those points that I was interested in) and others on the right of the page (or vice versa).

I might want to organise the threads spatially, so that the really interesting points might be at the top, or the not so interesting points at the bottom – you might call this ‘reader generated threading!’  When one of my colleagues makes a post, there might be an icon change that indicates that a contribution has been made against a particular point.

I might also be able to save a thread (or posting) layout, depending on the assignment or topic that I am currently researching.  It might be possible to create a ‘thread timeline’ (I have heard rumours that Plurk might do something like this), where you see your own structured representation of one or more forums change over time.  Of course, you might even be able to share your own customised forumscape with other forum users.

An on-line forum is undoubtedly a space where learning can occur.  When we think about how we might further develop the notion of a forum we soon uncover the dimension of control.

Currently, the layout and format of a forum (and what you can do with it) is ultimately constrained by the design of the forum software and a combination of settings assigned by an administrator.  Allowing forum users to create their own customised view of a forum communication space may give learners tools to make sense of different threads of communication.  Technology can then be used to enable an end user to formulate a display that most effectively connects new and emerging discussions with existing knowledge.

This display (or forumscape) might also be considered as a mask.  Since many different discussions can occur on a single forum at the same time, choosing the right mask may help salient information become visible.

The FirstClass system, with its multiple discussion areas and the ability to allow the end user to change the locations of forum icons on a ‘First Class’ desktop begins to step toward some of these ideas.

Essentially, I would like discussion forums to become more like my email client: I would like them to do different things for me.  I would like forum software to not only allow users to share messages.  I would like forum software to become richer and permit the information they display to the users be more malleable (and manageable).  I know this would certainly be something that would help me to learn!

Acknowledgements: Picture from Flickr taken by stuckincustoms, licensed under creative commons.

Permalink 1 comment (latest comment by Sam Marshall, Thursday, 5 Feb 2009, 12:30)
Christopher Douce

How Moodle block editing works: database (part 2)

Edited by Christopher Douce, Wednesday, 21 July 2010, 18:05

Pattern of old computer tapes intended to represent databases

This is a second blog entry about how Moodle manages its blocks (which can be found either at a site level or at a course level).  In my previous post I wrote about the path of execution I discovered within the main Moodle index.php file.  I discovered that the version of Moodle that I was using presented blocks using tables, and that blocks made use of some interesting object-oriented features of PHP to create the HTML code that is eventually presented to the end user.

This post has two objectives.  The first is to present something about the database structures that are used to store information about which blocks are stored where; the second is to explore what happens when an administrator clicks on the various block editing functions.  The intention behind this post is to understand Moodle in greater detail and to uncover a little more of how it has been designed.

Blocks revisited

Screen grab of the latest news block with moving and deletion editing icons

Blocks, as mentioned earlier, are pieces of functionality that can sit on the left hand or right hand borders of courses (or the main Moodle site page).  Blocks can present a whole range of functions ranging from news items through to RSS feeds.

Blocks can be moved around within a course page with relative ease by using the Moodle edit button.  Once you click on ‘edit’ (providing it is there and you have the appropriate level of permissions), you can begin to add, remove and move blocks around using a couple of icons that are presented.  Clicking on the left icon moves the block to the left hand margin, clicking the down arrow icon changes its vertical position and so on.

One of my objectives with this post is to understand what happens when these various buttons are clicked on.  What I am hoping to see are clearly defined functions which will be called something like moveBlockUp, moveBlockDown or deleteBlock.

Perhaps with future versions it might be possible to have a direct manipulation interface (wikipedia) where rather than having buttons to press, users will be able to drag blocks around to rapidly customise course displays.  Proposing ideas and problems to be solved is a whole lot easier than going ahead and solving them.  Also, to happily prove there’s no such thing as an original thought, I have recently uncovered a Moodle documentation page.  It seems that this idea has been floating around since 2006.

Before I delve into trying to uncover how each of the Moodle block editing buttons work, it is worthwhile spending some time to look at how Moodle remembers what block is placed where.  This requires looking at the database.

Remembering block location

I open up my database manipulation tool (SqlYog) and begin to browse through the database tables that are used with Moodle.  I quickly spot a bunch of tables that contain the name block.  One that seems to be particularly relevant is a table called block_instance.

The action of creating a course (and adding blocks to it) seems to create a whole bunch of records in the block_instance table.  Block_instance appears to be the table that Moodle uses to remember what block should be displayed and when.

The graphic below is an excerpt from the block_instance data table:

Fragment of the block_instance datatable showing a number of different fields

The field weight seems to relate to the vertical order of blocks on the screen (I initially wondered whether it related to, in some way, some kind of graphical shading, thinking of the way that HTML uses the term weight).  Removing a block from a course seems to change the data within this table.

The blockid seems to link each entry within block_instance to data items held within the block table:

Fragment of the blocks table, showing the field headings and the data items

The names held within the name field (such as course_summary) are connected to the programming code that relates to a particular block.  The cron (and the lastcron) relate to regular processes that Moodle must execute.  With the default installation of Moodle everything is visible, and at the time of writing I have no idea what multiple means.

Returning to block_instance, does the pageid field relate to the id used in the course?  Looking at the course table seems to add weight to this hypothesis.

I continue my search for truth by rummaging around in the Moodle documentation, discovering a link to the database schema and uncover some Block documentation that I haven’t seen before (familiarity with material is a function of time!)  This provides a description of the block development system as described by the original developer.

Now knowing that these two tables are used to store block location, my question from this point onwards is: how does the block_instance table get updated?

Database updates

To answer this question I applied something that I have called ‘the law of random code searching’: if you don’t know what to look for and you don’t know how things work, carry out a random code search to see what the codebase tells you.  Using my development environment I search to find out where the block_instance datatable is updated.

Calls to the database seem to be spread out over a number of files: blocks, lib, accesslib, blocklib, moodlelib, and chat/lib (amongst others).  This seems to indicate that there is quite a lot of coupling between the different sections of code (which is probably a bad thing when it comes to understanding the code and carrying out maintenance).

Software comprehension is sometimes an inductive process.  Occasionally you just need to read through a code file to see if it can yield any clues about its design, its structure and what it does.  I decided to try this approach for each of the files my search results window pointed to:

Accesslib
Appears to handle access control (or permission management) for parts of Moodle.  The comments at the top of the file mention the notion of a ‘context’ (which is a badly overloaded word).  The comments provide me no clue as to the context in which context is used.  The only real definition that I can uncover is the database description documentation which states, ‘a context is a scope in Moodle, for example the whole system, a course, a particular activity’.  In AccessLib, there are some hardcoded definitions for different contexts, i.e. CONTEXT_SYSTEM, CONTEXT_USER, CONTEXT_COURSECAT and so on.

The link to the block_instance table lies within a huge function called create_context which updates a database table of the same name.  I’ve uncovered a forum explanation that sheds a little more light onto the matter, but to be honest, the purpose of these functions is going to take some time to uncover.  There is a clue that the records held within the context table might be cached for performance reasons.  Moving on…

Moodlelib

Block_instance is mentioned in a function named remove_course_contents which apparently ‘clears out a course completely, deleting all content but don’t delete the course itself’.  When this function is called, modules and blocks are removed from the course.  Moodlelib is described as ‘main library file of miscellaneous general-purpose Moodle functions’ (??), but there is a reference towards another library called weblib which is described as ‘functions that provide web output’.

Blocks
A comment at the top of the blocks.php file states that it ‘allows the admin to configure blocks (hide/show, delete and configure)’.  There is some code that retrieves instances of a block and then deletes the whole block (but in what ‘context’ this is done, at the moment it’s not clear).

Blocklib
The file contains the lion’s share of references to the block_instance table.  It is said to include ‘all the necessary stuff to use blocks in course pages’ (whatever that means!)  At the top there are some constants for actions corresponding to moving a block around a course page.  Database calls can be found within blocks_delete_instance, blocks_have_content, blocks_print_group and so on.  The blocks_move_block function seems to adjust the contents of the database to take account of movement.  There also appears to be some OO type magic going on that I’m not quite sure about.  Perhaps the term ‘instance’ is being used in too many different ways.  I would agree with the coder: blocklib does all kinds of ‘stuff’.

Lib files
Reference to block_instance can be found in lib files for three different blocks: chat, lesson and quiz.  The functions that contain the call to the database relate to the removing of an ‘instance’ of these blocks.  As a result, records from the block_instance table are removed when the functions are called.

So, what have I learnt by reading all this stuff?  I’ve seen how the database stores stuff, that there is a slippery notion of a course context (and mysterious paths), and know the names of some files that do the block editing work, but I’m not quite sure how.  There is quite a lot of complexity that has not yet been uncovered and understood.

Digressions

I have a cursory glance through the lib folder to see what else I can discover and find an interestingly named script file entitled womenslib.php.  Curious, I open it and see a redirect to a wikipedia page.  The Moodle developers obviously have a sense of humour but unfortunately mine had failed!  This minor diversion was unwelcome (humour failure exception), costing me both time and ‘head’ space!

Bizarrely I also uncover a seemingly random list of words (wordlist.txt) that begins: ‘ape, baby, camel, car, cat, class, dog, eat …’ etc.  Wondering whether one of the developers had attended the famous Dali school of software engineering, I searched for a file reference to this mysterious ‘wordlist’.

It appeared that our mysterious list of words was referenced in the lib\setup.php file, where a path to our wordlist was stored in what I assumed to be a Moodle configuration variable.  How might this file be used?  It appears it is used within a function called generate_password.

Thankfully the developers have been kind enough to say where they derived some of their inspiration from.   The presence of the wordlist is explained by the need to construct a function to create pronounceable automatically generated passwords (but perhaps only in English?)
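To get a feel for how a word list can lead to pronounceable passwords, here is a minimal sketch in Python.  It is emphatically not Moodle's generate_password function: the recipe (two words, a separator character and two digits), the separator characters and the function name are all assumptions of mine, and only the first few words of the list are used.

```python
import secrets

# The first few entries of the word list, as quoted above.
WORDS = ["ape", "baby", "camel", "car", "cat", "class", "dog", "eat"]


def pronounceable_password(words=WORDS, separators="-_.!"):
    """Join two randomly chosen words with a separator and add two digits."""
    first = secrets.choice(words)
    second = secrets.choice(words)
    separator = secrets.choice(separators)
    number = secrets.randbelow(100)
    return f"{first}{separator}{second}{number:02d}"


print(pronounceable_password())
```

A password built this way is much easier to read out over the telephone than a random string of characters, although with a list this short it would be far too easy to guess.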

This was all one huge digression.  I pulled myself together just enough to begin to uncover what happens when a user clicks on either the block move up, down, or delete buttons when a course is running in edit mode.

Button click action

Returning to the task in hand, I add two blocks (both in the right hand column, and one situated on top of the other) to my local Moodle site with a view to understanding the code that contributes to the moveBlockUp and deleteBlock functionality.

I take a look at the links that correspond to the move up and the delete icons.  I notice that the action of clicking sends a bunch of parameters to the main Moodle index.php.  The parameters are sent via get (which means they are sent as a part of the hypertext link).  They are: instanceid (which comes straight out of the block_instance table), sesskey (which reminds me, I really must try to understand how Moodle handles sessions (wikipedia) at some point), and a blockaction parameter (which is either moveup or delete in the case of this scenario).

The question here is: what happens within index.php?  Luckily, I have a debugger that will be able to tell me (or, at least, help me!)

I log in as an administrator through my debugger.  When I have established a session, I then add some breakpoints on my index.php code and launch the index.php code using the parameters for ‘move activity upwards’.

Index.php begins to execute, and a call to page_create_object is made. It looks like a new object is created. An initialisation function within the page_base class is called (contained within pagelib). A blocks_setup function is called and the block positions are retrieved from the block_instance table. After some further tracking I end up at a function called blocks_execute_url_action. The instanceid is retrieved and a call is made to blocks_execute_action, where the block action (moveup or delete) is passed in as a parameter along with the block instance record that has just been retrieved from the database.

In blocks_execute_action a 'mother of all switch statements' makes a decision about what should be done next. After some checks, two update commands are issued to the database through the update_record function to update the weight values (which changes the order of the respective blocks). With all the database changes complete, a page redirect occurs to index.php. Now that the database has the correct representation of where each block should be situated, index.php can go ahead and display them.
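The heart of this step, as I understand it, is a dispatch on the action name followed by a swap of two weight values. Here is a small Python sketch of that idea. The dictionary layout, the function name and the renumbering after a delete are my own simplifications, not a transcription of the Moodle function.

```python
def execute_block_action(blocks, instanceid, action):
    """Apply 'moveup' or 'delete' to one block within a single column.

    blocks is a list of dicts with 'id' and 'weight' keys (a hypothetical,
    cut-down stand-in for rows of the block_instance table). A new list,
    ordered by weight, is returned; the input is left untouched.
    """
    ordered = sorted((dict(b) for b in blocks), key=lambda b: b["weight"])
    index = next(
        (i for i, b in enumerate(ordered) if b["id"] == instanceid), None
    )
    if index is None:
        raise KeyError(f"no block with id {instanceid}")

    if action == "moveup":
        if index > 0:
            # Swap weights with the block directly above: two 'row updates'.
            above, target = ordered[index - 1], ordered[index]
            above["weight"], target["weight"] = target["weight"], above["weight"]
    elif action == "delete":
        del ordered[index]
        # Assumption: close the gap so that weights stay consecutive.
        for weight, block in enumerate(ordered):
            block["weight"] = weight
    else:
        raise ValueError(f"unknown block action: {action}")

    return sorted(ordered, key=lambda b: b["weight"])
```

With two blocks in a column, moving the lower one up simply exchanges the two weights, which matches the two update commands I saw in the debugger.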

Is the same mechanism used for course pages?

A very cursory understanding tells me that the course/view.php script has quite a lot to do with the presentation of courses, and at this point gathering an understanding of it is proving to be elusive.  Let’s see what I can find.


Initially it does seem that the index.php script controls the display of a Moodle site and that the course/view.php script controls the display of a course. Moving the mouse over the ‘move block up’ icons reveals a hyperlink to the view.php script with GET parameters of: id (which corresponds to the course number held within the course data table), instanceid (which corresponds to a record within the block_instance table) and sesskey and blockaction parameters (as with index.php).

To get a rough understanding of how things work, I do something similar to before: open up a session through my debugger and launch view.php with this bunch of parameters. The view.php script is striking. It isn’t very long, nor does it produce any HTML, so it looks like there’s something subtle going on.

In view.php, there are some parameter safety checks, followed by some context_instance magic and a check of the session key, followed by a call to the familiar page_create_object (mentioned in the earlier section). A call to blocks_setup is then made, followed by calls to blocks_get_by_page_pinned and blocks_get_by_page, which ask the database which blocks are associated with this particular page (which is a course page).

Like earlier, there is a call to blocks_execute_url_action, which updates the database to carry out the action that the administrator clicked on. At the end of the database update there is a redirect. Instead of going to index.php, the redirect is to view.php along with a single parameter which corresponds to the course id.

This raises the question: what happens after the view.php redirect?

Redirect to view.php

After the redirect, view.php makes a call to the database to get the data that corresponds to the course id number it has been given. There is then a check to make sure that the user who is requesting the page is logged into Moodle, and eventually our old friends page_create_object and blocks_setup are called. This time, since no buttons have been clicked on, there is no database update and no redirect to another page.
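The two passes through view.php (one that carries out the action and redirects, one that simply displays the course) can be sketched in Python as follows. The function name, the tuple return values and the apply_action callback are all hypothetical; the sketch only shows the 'act, then redirect, then display' flow and is not how Moodle structures its code.

```python
def handle_view_request(params, expected_sesskey, apply_action):
    """Decide what a view.php-style script should do with one request.

    Returns a (kind, detail) tuple, where kind is 'redirect', 'render'
    or 'error'. apply_action is a callback standing in for the database
    update; it receives the block instance id and the action name.
    """
    course_id = params["id"]
    action = params.get("blockaction")

    if action is not None:
        # First pass: a button was clicked, so check the session key,
        # update the database, then redirect to a clean course URL.
        if params.get("sesskey") != expected_sesskey:
            return ("error", "invalid session key")
        apply_action(int(params["instanceid"]), action)
        return ("redirect", f"view.php?id={course_id}")

    # Second pass: no action in the request, so just display the course.
    return ("render", f"course {course_id}")
```

One benefit of redirecting after the update is that refreshing the course page in the browser does not repeat the move or delete.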

Towards the end of view.php we can begin to see some magic that begins to produce the HTML that will be presented to the user. There is a call to print_header. There is then a script include (using the PHP keyword ‘require’) which creates the bulk of the page that is presented to the user, building the HTML to present the individual blocks. When running within my debugger, the script course/format/weeks/format.php was included. The script that is chosen depends on the format that has been chosen for the course. When complete, view.php adds the footer and the script ends.
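The choice of format script can be pictured as building a path from the name of the course format. The Python sketch below assumes the path pattern implied by the one script I saw in the debugger (course/format/weeks/format.php); the function name and the validation rule are my own inventions.

```python
from pathlib import PurePosixPath


def format_script_path(course_format):
    """Return the relative path of the script for a given course format.

    Assumption: format names are simple alphanumeric directory names.
    Rejecting anything else keeps a stray value such as '../config'
    from escaping the format directory.
    """
    if not course_format.isalnum():
        raise ValueError(f"unexpected course format name: {course_format!r}")
    return str(PurePosixPath("course") / "format" / course_format / "format.php")


if __name__ == "__main__":
    print(format_script_path("weeks"))
```

Whatever the real lookup is, the effect is the same: changing the format of a course changes which script builds the main body of the page.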

Summary

So, what have I learnt from all this messing about?

It seems that (broadly speaking) the code used to move blocks around on the main Moodle site is also used to move blocks around on a course page. Perhaps this isn’t too surprising, but it is reassuring. I still have no idea what ‘pinned blocks’ means or what the corresponding data table is for, but I’m sure I’ll figure it out in time!

Another thing that I have learnt is that the view course and the main index.php pages are built in different ways.  As a result, if I ever need to change the underlying design or format of a course, I now know where to look (not that I ever think this is something that I’ll need to do!)

I have seen a couple of references to AJAX (MoodleDocs) but I have to confess that I am not much wiser about what AJAX style functionality is currently implemented within the version of Moodle I have been playing with.  Perhaps this is one of those other issues that will become clearer with time (and experience).

One thing, however, does strike me: the database and the user interface components are very closely tied together (or closely coupled) which may make, in some cases, change difficult.  One of the things that I have on my perpetual ‘todo’ list is to have a long hard look at the Fluid Project, but other activities must currently take precedence.

This pretty much concludes my adventure into the world of Moodle blocks. There’s a whole load of Moodle related stuff that I hope to look at (and hopefully describe) at some point in the future: groups, roles, contexts, and forums.  Wish me luck!

Acknowledgements: Image from lifeontheedge, licensed under Creative Commons.
