On 4 June, I went to an event that was all about something called OpenStack. OpenStack is an open source software framework that is used to create cloud computing systems. The main purpose of this blog post is to share my notes with some of my colleagues, and also with some of the people I met during the conference. It might well be of interest to others too.
Cloud computing is, as far as I understand it, a broad term that relates to the consumption and use of computing resources over a network. There are a few different types of cloud: there are public clouds (which are run by large companies such as Amazon and Google), private clouds (which are run by a single organisation), and hybrid clouds (which are a combination of public and private clouds). There’s also the concept of a community cloud - this is where different organisations come together and share a cloud, or resources that are delivered through a cloud.
This is all very well, but what kind of computing resources are we talking about? As far as I know, there are three. There’s Software as a Service (or SaaS), there’s Platform as a Service (PaaS), and there’s Infrastructure as a Service (IaaS). Software as a Service is where you offer software through a web page, and you don’t ever touch the application code underneath. Infrastructure as a Service is where you might be able to manage a series of ‘computers’ or servers remotely through the cloud. More often than not, these computers are virtual machines rather than physical ones.
These concepts were pretty much prerequisites for understanding what on earth everyone was talking about during the day. I also picked up on a whole bunch of terms that were new to me, and I’ll mention these as I go.
Opening Keynote: The OpenStack Foundation
Mark Collier opened the conference. Mark works for the OpenStack Foundation (OpenStack website). During his keynote he introduced us to some of the parts that make up OpenStack (a storage part, a compute part and a networking part), and said that there is a new software release every six months. To date there are approximately 1.2k developers. The community was said to comprise approximately 350 companies (such as Red Hat, IBM, HP and Rackspace) and 16k individual members.
Mark asked the question: ‘what are we trying to solve?’ He then went on to quote Marc Andreessen, who said, ‘software is eating the world’. Software, Mark said, is transforming the economy and disrupting industries.
One of the most important tools in computer science is abstraction. OpenStack represents a way to create a software defined data centre (a whole new level of abstraction), which allows you to engineer flexibility to enable organisations to move faster and software systems to scale more quickly.
Mark mentioned a range of different companies who are using OpenStack. These could be considered to be superusers (and there’s a corresponding superuser page on the OpenStack website which presents a range of different case studies). Superusers include organisations such as Sony, Disney and Bloomberg, for example.
I remember that Mark said that OpenStack is a combination of open source software and cloud computing. Another link that I noted down was to something called the OpenStack marketplace (OpenStack website). Looking on this website shows a whole range of different Cloud distributions (many of which come from companies that offer Linux distributions).
Keynote: Canonical, Ubuntu and OpenStack
Mark Shuttleworth from Canonical (Canonical website) offered an industry perspective. Canonical develops and supports Ubuntu, which is a widely used Linux distribution. (It is used, as far as I can remember, in the TM129 Technologies in Practice module.) As well as running on the desktop, Ubuntu is widely used on the server side, running within data centres. A statistic I’ve noted down is that Ubuntu accounts for ‘70% of guest workloads’. What this means is that we’re talking about instances of the Linux operating system that have been configured and packaged as Ubuntu (and that are running on a server within a data centre, somewhere).
A competitor to Ubuntu is another Linux distribution called CentOS. There is, of course, also Microsoft Windows Server. When you use public cloud networks, such as those provided by Amazon, I understand that you’re offered a choice of the operating system that you want to ‘host’ or run.
An interesting quote is, ‘building your cloud is a bit like building your own mainframe – users will always want it to be working’. We also heard of something called OpenStack Interoperability Laboratory. Clouds can be built hundreds of times a day, we were told – with different combinations of technology from different vendors. ‘Iteration is the only way to understand the optimal architecture for your use case’.
A really important aspect of cloud computing is the way that a configuration can dynamically adapt to changing circumstances (and user demands). The term for how this is achieved (in the cloud computing world) seems to be ‘orchestration’. Canonical has a tool for this called Juju (Wikipedia), which can be used with OpenStack. Juju enables (through a dashboard interface) different combinations of components to be defined. There is a concept of a ‘charm’ (which was described as a script that contains some operational code). If you would like to look at what it is all about, there’s a website called Juju Charms that I’ve yet to spend time exploring.
I’ve also noted down something called a Service Orchestration Framework, which lets you choose which services you want and where (on which servers) to place them. There are some reference installations for certain types of cloud installations (which reminds me of the idea of ‘design patterns’ in software).
Mark referred to a range of different technologies during his talk, some of which I had only very briefly heard of. One technology that was referred to time and time again was the hypervisor (Wikipedia). I understand this to be a layer (either hardware or software) that creates and runs one or more virtual machines. Other terms that he mentioned or introduced include KVM (Kernel-based Virtual Machine), Ceph (a way to offer shared storage), and MaaS, or Metal as a Service (Ubuntu), which ‘brings the language of the cloud to physical servers’.
A further bunch of mind-boggling technical terms that were mentioned include ‘lightweight hypervisors’ such as LXC (LinuX Containers), Hadoop, which is a framework for storing and processing large amounts of data across many machines, and TOSCA (Wikipedia), which is an abbreviation for Topology and Orchestration Specification for Cloud Applications. In terms of databases, some new (and NoSQL) technologies that were mentioned included MongoDB and Cassandra.
At this point, it struck me how much technologies have changed in such an incredibly short time, reminding me that we live in interesting times.
Keynote: Agile infrastructure built in OpenStack
The second keynote of the day was by John Griffith, Project Technical Lead, SolidFire. John’s presentation had the compelling subtitle: ‘building the next generation data centre with OpenStack’.
A lot of people started using Amazon, which I understand to be the most successful public cloud provider, to use IT resources more efficiently. There are, of course, other providers such as Google Compute Engine (Google), Windows Azure (Microsoft), and SoftLayer (which appears to be an IBM company).
A number of years ago, at an OU postgrad event, I overheard a discussion between two IT professionals that began with the question, ‘so, what are the latest developments in servers?’ The reply was something about server consolidation: putting multiple services on a single machine, so you can use that one machine (a physical computer or server) more efficiently. This could be achieved by using virtual machines, but you can only do so much with virtual machines. What happens if you run out of processing power? You need to either get a faster machine, or move one of your virtual machines to another machine that might be under-utilised.
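The idea of moving a virtual machine to an under-utilised server can be sketched in a few lines of Python. This is only a toy model: the `Host` class, the `place_vm` and `migrate_vm` functions and the use of virtual CPU counts as the only measure of load are all assumptions made for illustration, and this is not how OpenStack itself schedules machines.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A physical server (hypothetical model) that runs virtual machines."""
    name: str
    cpu_capacity: int                         # virtual CPUs the host can offer
    vms: dict = field(default_factory=dict)   # VM name -> virtual CPUs in use

    def free_cpu(self) -> int:
        return self.cpu_capacity - sum(self.vms.values())


def place_vm(hosts, vm_name, vcpus):
    """Put the VM on the host with the most spare capacity, or return None."""
    candidates = [h for h in hosts if h.free_cpu() >= vcpus]
    if not candidates:
        return None
    target = max(candidates, key=lambda h: h.free_cpu())
    target.vms[vm_name] = vcpus
    return target


def migrate_vm(hosts, source, vm_name):
    """Move one VM off a busy host onto the least loaded of the other hosts."""
    vcpus = source.vms[vm_name]
    others = [h for h in hosts if h is not source]
    target = place_vm(others, vm_name, vcpus)
    if target is not None:
        del source.vms[vm_name]
    return target


# server-a is full (consolidation taken as far as it will go); server-b is idle.
hosts = [Host("server-a", 8, {"web": 4, "db": 4}), Host("server-b", 8)]
moved_to = migrate_vm(hosts, hosts[0], "db")
print(moved_to.name, hosts[0].free_cpu())
```

After the call, the ‘db’ machine lives on server-b and server-a has spare capacity again.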
The next generation data centre will be multi-tenant (which means multiple customers or organisations using the same hardware), have mixed workloads (I don't really know what this means), and have shared infrastructure. A key aspect is that an infrastructure can become software defined, as opposed to hardware defined, and the capacity of a cloud configuration or setup can change depending upon local demand.
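To give a feel for what ‘capacity that changes with demand’ could mean in practice, here is a minimal Python sketch. It assumes a very simple rule (enough instances to cover the current request rate, within fixed limits); the function name, the parameters and the numbers are made up for illustration and do not come from the talk.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance,
                     minimum=1, maximum=10):
    """Return how many virtual machines to run for the current demand.

    Assumes each instance handles a fixed number of requests per second,
    and clamps the answer between a minimum and a maximum.
    """
    wanted = math.ceil(requests_per_second / capacity_per_instance)
    return max(minimum, min(maximum, wanted))


quiet = instances_needed(0, 100)       # never drop below the minimum
busy = instances_needed(250, 100)      # two instances would not be enough
flooded = instances_needed(5000, 100)  # capped at the maximum
print(quiet, busy, flooded)
```

A real cloud would base this decision on measured load over time rather than a single number, but the principle of letting software decide the capacity is the same.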
There were a number of attributes of cloud systems. I think there were: agility, predictability, scalability and automation.
In the cloud world applications can span many virtual machines, and data can be stored in scalable databases that are structured in many tiers. The components (that make up a cloud installation) can be configured and managed through sets of predefined interfaces (or APIs). I also made a note of a mobile app that can be used to manage certain OpenStack clouds. One example of this is the Cloud mobile app from Rackspace.
Another interesting quote was, ‘[the] datacentre is one big computer and OpenStack is the operating system’. Combining servers together has potential benefits in terms of power consumption, cooling and the server footprint.
One thing that developers need to bear in mind is how to create applications for this kind of environment: consider scalability and plan for failure. A big challenge lies in uncovering and deciphering what all the options are. Should you use, for example, block storage services, or object storage? What are the relative advantages and disadvantages of each?
Parts of this presentation started to demystify some of the terms that have baffled me from the start. Cinder, for example, is OpenStack’s block storage. Looking outwards from the operating system, a block storage device could be a hard disk, or a USB drive. Cinder, in effect, mimics what a hard drive looks like, and you can store stuff to a Cinder service as if it was a disk drive. Swift is an object store where you can store objects. So, you might think of it in terms of sets of directories, the contents of which are replicated over different hard drives to ensure resilience and redundancy.
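The difference between the two kinds of storage can be sketched with a toy Python example. To be clear, this is not the Cinder or Swift API: the `BlockDevice` and `ObjectStore` classes, the block size and the way replicas are spread over ‘drives’ are all assumptions chosen only to illustrate the idea.

```python
import hashlib

BLOCK_SIZE = 512  # bytes per block (an arbitrary choice for this sketch)


class BlockDevice:
    """Block storage: fixed-size blocks addressed by number, like a disk."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data is larger than one block")
        # Pad short writes with zero bytes so every block stays the same size.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, index):
        return self.blocks[index]


class ObjectStore:
    """Object storage: whole objects addressed by name, copied to several drives."""

    def __init__(self, num_drives=3, replicas=2):
        self.drives = [dict() for _ in range(num_drives)]
        self.replicas = replicas

    def _drive_indexes(self, name):
        # Hash the name to pick a starting drive, then use its neighbours.
        digest = hashlib.sha256(name.encode()).hexdigest()
        start = int(digest, 16) % len(self.drives)
        return [(start + i) % len(self.drives) for i in range(self.replicas)]

    def put(self, name, data):
        for i in self._drive_indexes(name):
            self.drives[i][name] = data

    def get(self, name):
        for i in self._drive_indexes(name):
            if name in self.drives[i]:
                return self.drives[i][name]
        raise KeyError(name)


disk = BlockDevice(num_blocks=4)
disk.write_block(0, b"hello")

store = ObjectStore()
store.put("photos/cat.jpg", b"image bytes")
# Simulate losing one drive: the object can still be read from its replica.
store.drives[store._drive_indexes("photos/cat.jpg")[0]].clear()
recovered = store.get("photos/cat.jpg")
```

The block device knows nothing about files or names, only numbered blocks; the object store knows nothing about blocks, only named objects and their copies.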
There is a difference between a service that is an abstraction to store and work with data, and how physical data is actually stored. To make these components work with actual devices, there are a range of different plug-ins.
I have to admit that I found this presentation thoroughly baffling. I had no idea what was being presented until I finally picked up on the word ‘firewall’, and the penny dropped: if a system architecture is defined in software, the notion of a firewall as a physical device suddenly becomes very old fashioned, if not a little bit quaint.
In the cloud world, it’s possible to have something called a ‘software firewall’. A term that I noted down was ‘software defined security’ (SDS). Through SDS, you can define what traffic is permissible between nodes and what isn’t, but in the ‘real world’ of physical servers, I’m assuming that physical ‘top layer’ firewalls are important too.
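A rough Python sketch of what ‘defining permissible traffic in software’ might look like is given below. It is not based on any particular product: the `Rule` class, the role names, the port numbers and the ‘deny unless explicitly allowed’ policy are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Allow traffic from one role to another on a given port (hypothetical)."""
    source_role: str
    dest_role: str
    port: int


# Rules are written against roles rather than individual machines.
ALLOWED = {
    Rule("web", "app", 8080),
    Rule("app", "db", 5432),
}

# Which role each virtual machine currently has; this can change at any time.
NODE_ROLES = {"vm-1": "web", "vm-2": "app", "vm-3": "db"}


def is_permitted(source_node, dest_node, port):
    """Deny by default: traffic passes only if a matching rule exists."""
    rule = Rule(NODE_ROLES.get(source_node), NODE_ROLES.get(dest_node), port)
    return rule in ALLOWED


print(is_permitted("vm-1", "vm-2", 8080))   # web -> app is allowed
print(is_permitted("vm-1", "vm-3", 5432))   # web -> db has no rule

# A new web server is created to cope with demand; no rule needs rewriting.
NODE_ROLES["vm-4"] = "web"
print(is_permitted("vm-4", "vm-2", 8080))
```

Because the rules refer to roles, a newly created machine is covered as soon as it is given a role, which is one way such a policy could keep up with a configuration that changes.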
I also came across two new terms (or metaphors) that seem to make a bit of sense in the ‘cloud world’. Data could, for example, move in a north-south direction, meaning it goes up and down through various layers. If you’ve got east-west movement of data, it means you’re dealing with a situation where you might have a number of different virtual machines (that might have been created to respond to end user demand), which may share data between each other. The question is: how do you maintain security when the nature of a configuration might dynamically change?
Another dimension to security which crossed my mind was the need for auditability and disaster recovery, and both were subjects that were touched upon by other presenters.
In essence, I understood vArmour to be a commercial software defined security product that works akin to a firewall that can be used within a cloud system.
Presentation: The search for the cloud’s ‘God Particle’
Chris Jackson, who works for Rackspace (a company which has the tagline ‘the open cloud company’), gave the final presentation before we all broke for lunch. Chris confessed to being a physicist (as well as a geek) and referred to research at CERN to find ‘the God particle’. I also seem to remember him mentioning that OpenStack was used by CERN; there’s an interesting superuser case study (OpenStack website), for those who might be interested.
Here’s the question: if there is a theory that can describe the nature of matter, is there a theory that might explain why a cloud solution might not be adopted? (He admitted that this was a bit of fun!) He presented three different theories and asked us to vote on which were, perhaps, the most significant.
The first was: application. Some applications can be rather fragile, and might need a lot of cosseting, whereas other forms of application might be very robust; they’re all different. Cloud applications, it is argued, embrace chaos and are built to expect failure. Perhaps the precise character of certain applications might not lend itself to being run in the cloud?
Theory two: integration. There could be the challenge of integration and connection with existing systems, which might themselves have different characteristics.
The third theory is all about operations. This is more about the culture of an organisation.
So, which theory is the reason why organisations don’t adopt a cloud solution? The answer is: quite possibly all of them.