Personal Blogs

BCS Lecture: The Power of Abstraction

Monday 27 May 2013 at 16:56

Visible to anyone in the world

Edited by Christopher Douce, Friday 10 August 2018 at 14:41

When I was a graduate student at the University of Manchester (or the bit of it that was once known as UMIST) I was once asked to show some potential computer science students around the campus. At the end of the tour I ushered them to lecture which was intended to give the students a feel for what things would be like if they came to the university.

The lecture, given by one of the faculty, was all about the notion of abstraction. We were told that this was a fundamental concept in computing. In some respects, it felt less of a lecture about computing, but more of a lecture about philosophy. I had never been to a lecture quite like it and it was one that really stuck in my mind. When I left the lecture, I thought, 'why didn't I have this kind of lecture when I was an undergraduate?' As an undergrad I had spent many a hour creating various kinds of computer programs without really being told that there was an essential and fundamental idea that underpinned what I was doing.

When I saw the British Computer Society (BCS) advertising a lecture that was about the 'power of abstraction', I knew that I had to try to make time to come along. The lecture, by Professor Barbara Liskov, was an annual BCS lecture (the Karen Spärck Jones lecture) that honours women in computing research.

All this sounds great, right? But what, fundamentally, is abstraction? An 'abstract' at the top of a formal research paper says, in essence, what it contains. Abstraction, therefore, can be thought of as a process of creating a representation of something, and that something might well be a problem of some kind. Admittedly, this sounds both confusing and vague...

Barbara began her lecture by stating that abstraction is the basis of how we implement computer software. The real world is, fundamentally, a messy place. Since computers are ultimately mathematical machines, we need a way to represent problems (using, ultimately, numbers) so that a computer can work with them. As a part of her lecture, Barbara said that she was going to talk through some developments in the way that people (or computer programmers) could create and work with abstractions. I was intrigued; this talk wasn't just about a history of programming languages, it was also a history of thought.

So, what history was covered? We were immediately taken back to the 1970s. This was a period in computing history where the term 'software crisis' gained currency. One of the reasons was that it was becoming increasingly apparent that creating complex software systems was a fundamentally difficult thing to do. It was also apparent that projects were started, became excruciatingly late and then abandoned, costing astronomical amounts of money. (It might be argued that this still happens today, but that's a whole other debate which goes beyond this pretty short blog post).

One of the reasons why software is so fundamentally hard to create is that it is 'mind stuff'. Software isn't like a physical artefact or product that we can see. The relationships between components can easily become incredibly complicated which can, in turn, make things unfeasibly difficult. Humans, after all, have limited brain capacity to deal with complexity (so, it's important that we create tools and techniques that help us to manage this).

We were introduced to a number of important papers. The first paper was by Dijkstra, who wrote a letter to the Communications of the ACM entitled, 'Goto considered harmful'. 'Goto' is an instruction that can help to create very complicated (and unfathomable) software very quickly. Barbara described the difficulty very clearly. One of the reasons why software is so hard is that there is a fundamental disconnect between how the program text might be read by programmers and how it might be processed or executed by a machine. If we can create a program representation that tries to bridge the difference between the static (what is described should happen) and the dynamic (what actually happens when software does its stuff), then things would be a whole lot easier.

Another paper that was mentioned was Wirth's 'program development by stepwise refinement'. Wirth is famous for the design of two closely related languages: Pascal and Modula-2. It certainly is the case that it's possible to write software without the 'goto' instruction, but Barbara made the interesting point that it's also possible to write good, well-structured software in bad languages (providing that you're disciplined enough). The challenge is that we're always thinking about trade-offs (in terms of program performance and code economy), so we can easily be lured into doing clever things in incomprehensible ways.

Barbara spoke about the importance of modules whilst mentioning a paper by Parnas entitled, 'information distribution aspects of design methodology'. One of the great things about modules, other than that they can be used to group bits of code together, is that they enable the separation of the implementation and the interface. This has reminded me of some stuff from my undergrad days and time spent in industry: modules are connected to the term 'cohesion'. Cohesion is, simply, the idea that something should do only one thing. A function that has one name and does two or more things (that are not suggested in its name) is a recipe for confusion and disaster. But I fear I'm beginning to digress from the lecture and onto one of my 'coding hobbyhorses'.

Through a short mention of a language called Simula-67 (Wikipedia) we were then introduced to a paper by Liskov and Zilles entitled, 'programming with abstract data types'. We were told that this paper represented a sketch of a programming language which eventually led to the creation of a language called CLU (Wikipedia), CLU being short for Clusters.

There is one question Barbara clearly answered, which is: why go to all the trouble of writing a programming language? It's to understand whether an idea works in practice and to understand some of the barriers to performance. Also, whenever a language designer describes a language in natural language there are always going to be some assumptions that the compiler writer must make. Only by going through the process of creating a working language are language designers able to 'smoke out' any potential problems.

Just diverting into programming language speak for a moment, CLU implemented static type checking, used a heap, and doesn't support concurrency, the goto statement or inheritance. What it does implement is polymorphism (or the use of generics), iterators and exception handling.

Barbara also mentioned a very famous language called Smalltalk, developed by Alan Kay. Different developments at different times and at different places have all influenced the current generation of programming languages. Our current object-oriented languages enable programmers to define abstractions, or a representation of a problem in a way that wasn't possible during the earlier days of software.

Research directions

Barbara mentioned two research topics that continue to be of interest. The first was the question of what might be the most appropriate design of a programming language for novices? In various years, these have been BASIC (which introduced the dreaded Goto statement), Pascal, and more recently Java. Challenges of creating a language that helps learners develop computational thinking skills (Wikipedia) include taking account of programming language design trade-offs, such as ease of use vs. expressive power, and readability vs. writeability, and how to best deal with modularity and encapsulation.

Another research subject is languages for massively parallel computers. These days, PCs and tablets, more often than not, contain multiple processor cores (which means that they can, quite literally, be doing more than one calculation at once). You might have up to four cores, but how might you best design a programming language that more efficiently allows you to define and solve problems when you might have hundreds of processors working at the same time? This immediately took me back to my undergrad days when I had an opportunity to play with a language called Occam (Wikipedia).

There was one quote from Barbara's lecture that stood out (for me), and this was when she said, 'you don't get ideas by not working on things'.

Reflections

I should say at the point that I haven't done Barbara's speech justice. There were a whole lot of other issues and points that were mentioned but I haven't touched on. I really enjoyed being taken on a journey that described how programming languages have changed. I liked the way that the challenges of coding (and the challenge of using particular instructions) led to discussions about modules, abstract data types and then to, finally, object-oriented programming languages.

It's also possible to take a broader perspective to the notion of abstraction, one that has been facilitated by language design. During Barbara's lecture, I was mindful of two related subjects that can be strongly connected to the notion of abstraction. The first of these is the idea of design patterns.

Design patterns (Wikipedia) take their inspiration from architecture. Rather than design a new building from scratch every time you need to make one, why not buy a pre-existing design that has already solved some of the problems that you might potentially come up against? There is a strong parallel with software: developers often have to solve very similar problems time and time again. If we have a template to work from, we might arguably get things done more quickly and cheaply.

Developers can use patterns to gain inspiration about how to go about solving common problems. By using well understood and defined patterns, the communication between programmers and developers can be enhanced since abstract concepts can be readily named; they permit short-cuts to understanding.

In some cases, patterns can be embedded into pre-existing code that can be used by developers to kick-start a development. This can take the form of a framework, software code that solves well known problems that ultimately enables developers to get on and solve the problems that they really need to solve (as opposed to dealing with stuff such as reading and writing to databases).

Abstraction has come a long way in my own very short career as a developer. One of the biggest challenges that developers face is how to best break down a problem into structures that can be represented in a language that a machine can understand. Another challenge lies with understanding the various tools that developers now have at their disposal to achieve this.

Note: The logo at the top of the blog is used to indicate that this blog relates to a BCS event and this post is not connected with the BCS in any other way. All mistakes and opinions are my own, rather than that of the OU or the BCS.