OU blog

Personal Blogs

Christopher Douce

OpenStack conference, June 2014 (part 2 of 2)

Visible to anyone in the world
Edited by Christopher Douce, Saturday, 7 June 2014, 13:21

This blog post is the second of two that summarises an OpenStack conference that I attended on 4 June in London. 

This second half of the conference had two parallel sessions.  Delegates could either go to the stream that was intended for novices (which is what I did), or go to a more technical session. 

I was quite tempted by the technical session, especially by a presentation that was all about what it means to be an OpenStack developer.  One of the key points that I did pick up on was that you need to know the Python language to be an OpenStack developer, which is a language that is used within the OU’s data structures and algorithms module, M269 Algorithms, data structures and computability

Introduction to OpenStack

The first session of the afternoon was by Kevin Jackson who works at Rackspace.

Kevin said that OpenStack and Linux are sometimes spoken about in similar terms.  Both can be created from distributions, and both are supported by companies that can offer consultancy support and help to move products forward. ‘OpenStack is like a pile of nuts’, said Kevin, and the nuts represent different components.

So, what are the nuts?  Nova is a compute engine, which hosts a virtual machine running in a Hypervisor.  I now understand that a hypervisor can host one or more virtual machine.  You might have a web server and your application code running within this bit of OpenStack.

Neutron is all about networking.  In some respects, Neutron is a virtual network that has been all written in code.  There is more about this in later presentations.  If you have different parts of an OpenStack implementation, Neutron allows the different bits to talk to each other; it pretends to be a physical network.

Swift is an object store, which is something that was spoken about during an earlier presentation.  Despite my earlier description, Swift isn’t really like a traditional file system.  Apparently, it can be ‘rack or cabinet aware’, to take account of the design of your physical data centre.

Cinder is another kind of storage; block storage.  As mentioned earlier, to all intents and purposes, Cinder looks like a ‘drive’ to a virtual machine.  I understand a situation where you might have multiple virtual machines accessing the same block storage device.

Ceilometer is a component that was described as telemetry.  This is a block which can apparently say how much bandwidth is being used.  (I don’t know how to describe what ‘bandwidth’ is in this instance – does it relate to the network, the available capacity within a VM, or the whole installation?  This is a distinct gap in my understanding).

Heat is all about orchestration.  Heat monitors ‘the cloud’, or its environment.  Kevin said, ‘if it knows all about your environment and suddenly you have two VMs and not three, it creates a third one’. This orchestration piece was described as a recipe for how your system operates.

All these bits and pieces are controlled by a web interface called Horizon, which I assume makes calls to the APIs of each of these components.  You can use Horizon to look at the components of the network, for example.  I have to confess to being a bit confused about the distinction between JuJu and this standard piece of OpenStack – this is another question that I need to ask myself.

At the end of Kevin’s presentation, I’ve made a note of a question from the floor which was: ‘why should I go open source and not go for a proprietary solution?’  The answer was interesting: you can get locked into a vendor if you choose a proprietary solution.  If you use an open source solution, such as OpenStack you can move your ‘cloud’ different providers, say, from Rackspace to HP.  With Amazon web services, you’re stuck with using Amazon web services.  In some respects, these arguments echo arguments that are given in favour of Linux and other open source products.  The most compelling arguments are, of course, freedom and choice.

A further question was, ‘how mainstream is this going to go?’  The answer was, ‘there’s many companies around the globe who are using OpenStack as a solution’, but I think it was also said that OpenStack is just one of many different solutions that exist.

OpenStack and Storage made easy at Lush Cosmetics

The second presentation of the day was made by Jim Liddle who works for a cloud consultancy called Storage Made Easy.

Jim presented a case study about his work with Lush Cosmetics.  I’ve made note of a number of important requirements: the data that is stored to the cloud should be encrypted, and there should be ways to help facilitate auditing and governance (of the cloud). 

It’s interesting that the subject of governance was explicitly addressed in this case study.  The importance of ‘institutional control’ and the need to carry out checks and balances is one of reasons why organisations might choose private clouds over public clouds. In the case of Lush, two key drivers included the cost per user, and the need to keep their data within the UK.

A new TLA that I heard was OVF (Wikipedia), an abbreviation for Open Virtualization Format, and is a way to package virtual machines in a way that is not tied to any particular hypervisor (VM container), or architecture.  Other technologies and terms that were referred to included: MySQL, which is touched on in TT284 Web Technologies (OU), Apache, MemCached (Wikipedia) and CentOS.

Deploying Windows Workloads into OpenStack using JuJu

A lot of the presentations had a strong Linux flavour to them.  Linux, of course, isn’t the only platform that can be used to power clouds. Alessandro Pilotti from Cloudbase solutions spoke on the connections between Windows and OpenStack.

Terms that cropped up during his presentation included Hyper-V (a hypervisor from Microsoft), KVM (Kernel based virtual machine, which is Linux hypervisor), MaaS (metal as a service, an Ubuntu term), GRE Tunnels (GRE being an abbreviation for Generic Routing Encapsulation), NVGRE (Network Virtualization using Generic Routing Encapsulation), and RDP (Remote Desktop Protocol).

It was all pretty complicated, and even though I’m reasonably technical, this was at a whole other level of detail.  Clicking through some of the above links soon takes me into a world of networking and products that are pretty new to me.  This clearly suggests that there is a whole lot of ‘new technology’ out there that I need to try to make a bit of sense of, and this, of course, takes time.

Alessandro also treated us to a live video link that showed a set of four small computers that were all hooked up together (I have no idea what these small desktop computers without screens were called; they used to have a special name).  The idea was to show LEDs flashing to demonstrate some remote rebooting going on.

This demo didn’t quite work out as planned, but it did get me thinking: to really learn how to do cloud stuff, a good idea would be to spend time actually playing with bits of physical hardware. This way you can understand the relationships between logical and physical architectures.  The challenge, of course, is finding the time to get the kit together, and to do the learning.

Using Swift in Entertaining Ways

This presentation was made by a number of people from Sohonet a company that offers cloud services to the film and TV industry.  An interesting application of cloud computing is film and video post-production, the part of production where when recordings are digitally edited and manipulated. An interesting challenge is that when it comes to video post-production we’re talking about huge quantities of data, and data that needs to be kept safe and secure.

Sohonet operates two clusters that are geographically separate.  Data needs to be held over different timescales, i.e. short, medium and long-term, depending upon the needs of a particular project.

A number of interesting products and companies were mentioned during this talk.  These include Expandrive where an OpenStack Swift component can become a network drive.  Panzura was mentioned in terms of Swift as a network appliance.  Zmanda and Cloudberrylab were all about backup and recovery.  Interesting stuff; again, a lot to take in.

Bridges and Tunnels – a drive through OpenStack networking

Mark McClain from the OpenStack foundation, talked about the networking side of things, specifically, the OpenStack networking component that is called Neutron.  Even though I didn’t understand all of it, I really enjoyed this presentation.  On a technical level, it was very dense; it contained a lot of detail.

Mark spoke about some of the challenges of using the cloud.  These included a high density of servers, the difficulties of scaling and the need for on-demand services.  A way to tackle some of these challenges is to use network virtualisation and something called overlay tunnelling (but I’m not quite sure what that means!)

Not only can virtual machines talk to virtual drives (such as the block storage service, Cinder), but they can also talk to a virtual network.  The design goals of the network component were to have a small core, and to have a pluggable open architecture which is configurable and extensible.  You can have DHCP configuration agents and can specify network traffic rules.  Neutron is also (apparently) backed by a database and a message queue.  (I also heard that there is a REST interface, if I’ve understood it correctly and my notes haven’t been mangled in the rush to write everything down).

A lot of network hardware can now be encoded within software (which links back nicely to the point about abstraction that I mentioned in the first block).  One example is something called Openvswitch (project website).  I’ve also noted down that you can have a load balancer as a service, a VPN as a service and a firewall as a service (as highlighted by the earlier vArmour talk).

Hybrid cloud workloads

The final talk of the day was by Monty Taylor who is also from the OpenStack foundation.  A hybrid cloud is a cloud that is a combination of public and private clouds (which could, arguably be termed an ‘ecosystem of clouds’).  Since it was the end of the day, my brain was full, and I was unable to take a lot more on board.

Reflections

All this was pretty interesting and overwhelming stuff.  I remember one delegate saying, ‘this is all very good, but it’s all those stupid names that confuse me’.  I certainly understand where he was coming from, but when it comes to talking about technical stuff, the names are pretty important: they allow developers to share understandings.  I’m thankful for those names, although each name does take quite a bit of remembering.

One of the first things I did after the conference was to go look on YouTube.  I thought, ‘there’s got to be some videos that helps me to get a bit more of an understanding of everything’, and I wasn’t to be disappointed – there are loads.  Moving forward, I need to find some time to look through some of these.

One of the things that I’ll be looking for (and something that I would have liked to see in the conference) was a little bit more detail about case studies that explicitly show how parts of the architecture work.  We were told that virtual machines can spin up in situations where we need to attend to more demand, but perhaps the detail of the case studies or explanations passed me by.

This is a really important point.  Some aspects of software development are changing.  I’ve always held the view that good software developers need to have an appreciation of system administration (or the ‘operations’ side of things).  When I had a job in industry there was always a separation between the systems administrators and the developers.  When the developers are done, they throw software over the wall to the admins who deploy the software.

This conference introduced me to a new term: a devop – part developer, part programmer.  Devops need to know systems stuff and programming stuff.  This is a reflection of software being used at new levels of abstraction: we now have concepts such as network as a service, and software defined security.  Cloud developers (and those who are responsible for keeping clouds running) are system software developers, but they can also be (and have to understand) application development too. 

A devop needs a very wide skill set: they need to know about networks, hardware, operating systems, and different types of data store.  They might also need to know about a range of different scripting languages, and other languages such as Python.  All these skills take time (and effort) to acquire.  A devop is a tough and challenging job, not only due to the inherent complexity of different components, but also due to the speed that technologies change and evolve.

When I arrived at the conference, I knew next to nothing about what OpenStack was all about, and who was using it.  By the end of the conference I ended up knowing the names of some of its really important components; mists of confusion had started to lift.  There is, however, a huge amount of detail to get my head around, and one of the things that I’m also going to do is to look at some user stories (OpenStack foundation).  This, I think, will help to consolidate some of my learning.

Permalink Add your comment
Share post
Christopher Douce

OpenStack conference, June 2014 (part 1 of 2)

Visible to anyone in the world
Edited by Christopher Douce, Friday, 6 June 2014, 17:43

On 4 June, I went to an event that was all about something called OpenStack.  OpenStack is an open source software framework that is used to create cloud computing systems.  The main purpose of this blog is to share my notes with some of my colleagues, but also to some of the people who I met during the conference.  Plus, it might well be of interest to others too.

Cloud computing is, as far as I understand it, a broad terms that relates to the consumption and use of computing resources over a network.  There are a couple of different types of cloud: there are public clouds (which are run by large companies such as Amazon and Google), private clouds (which are run by a single organisation), and hybrid clouds (which is a combination of public and private clouds).  There’s also the concept of a community cloud - this is where different organisations come together and share a cloud, or resources that are delivered through a cloud.

This is all very well, but what kind of computing resources are we talking about?  As far as I know, there are a couple.  There’s software as a service (or SaaS).  There’s PaaS, meaning, Platform as a Service, and there’s IaaS, which is Infrastructure as a Service.  Software as a Service is where you offer software through a web page, and you don’t ever touch the application code underneath.  Infrastructure as a Service is where you might be able to manage a series of ‘computers’ or servers remotely though the cloud.  More often than not, these computers are running in something called virtual machines.

These concepts were pretty much prerequisites for understanding what on earth everyone was talking about during the day.  I also picked up on a whole bunch of new terms that were new to me, and I’ll mention these as I go.

Opening Keynote : The OpenStack Foundation

Mark Collier opened the conference.  Mark works for the OpenStack Foundation (OpenStack website). During his keynote he introduced us some of the parts that make up OpenStack (a storage part, a compute part and a networking part), and said that there is a new software release every six months.  To date there are in the order of approximately 1.2k developers.  The community was said to comprise of approximately 350 companies (such as RedHat, IBM, HP, RackSpace) and 16k individual members.

Mark asked the question: ‘what are we trying to solve?’  He then went onto quote Mark Andreessen who said, ‘software is eating the world’.  Software, Mark said, is said to be transforming the economy and disrupting industries. 

One of the most important tools in computer science is abstraction.  OpenStack represents a way to create a software defined data centre (a whole new level of abstraction), which allows you to engineer flexibility to enable organisations to move faster and software systems to scale more quickly.

Mark mentioned a range of different companies who are using OpenStack.  These could be considered to be superusers (and there’s a corresponding superuser page on the OpenStack website which presents a range of different case studies).  Superusers include organisations such as Sony, Disney and Bloomberg, for example.

I remember that Mark said that OpenStack is a combination of open source software and cloud computing.  Another link that I noted down was to something called the OpenStack marketplace (OpenStack website).  Looking on this website shows a whole range of different Cloud distributions (many of which come from companies that offer Linux distributions).

Keynote: Canonical, Ubuntu and OpenStack

Mark Shuttleworth from Canonical (Canonical website) offered an industry perspective.  Canonical develops and supports Ubuntu which is a widely used Linux distribution.  (It is used, as far as I can remember in the TM129 Technologies in Practice module).  As well as running on the desktop, Ubuntu is widely used on the server side, running within data centres.  A statistic I’ve noted down is that Ubuntu accounts for ‘70% of guest workloads’.  What this means is that we’re talking about instances of the Linux operating system that have been configured and packaged by Ubuntu (that are running on a server within a datacentre, somewhere).

A competitor to Ubuntu is another Linux distribution called CentOS.  There is, of course, also Microsoft Windows Server.  When you use public cloud networks, such as those provided by Amazon, I understand that you’re offered a choice of the operating system that you want to ‘host’ or run.

An interesting quote is, ‘building your cloud is a bit like building your own mainframe – users will always want it to be working’.  We also heard of something called OpenStack Interoperability Laboratory.  Clouds can be built hundreds of times a day, we were told – with different combinations of technology from different vendors.  ‘Iteration is the only way to understand the optimal architecture for your use case’.

A really important aspect of cloud computing is the way that a configuration can dynamically adapt to changing circumstances (and user demands).  The term for how this is achieved (in the cloud computing world) seems to be ‘orchestration’.  In OpenStack, there is a tool called JuJu (Wikipedia).  JuJu enables (through a dashboard interface) different combinations of components to be defined.  There is a concept of a ‘charm’ (which was described as scripts which contain some operational coding).  If you would like to look at what it is all about, there’s a website called JuJu Charms that I’ve yet to spend time exploring.

I’ve also noted down something called a Service Orchestration Framework, which lets you place services where you want, and on what services.  There are some reference installations for certain types of cloud installations (which reminds me of the idea of ‘design patterns’ in software).

Mark referred to a range of different technologies during his talk, some of which I had only very briefly heard of.  One technology that was referred to time and time again was the concept of the hypervisor (Wikipedia).  I understand this to be a container (either hardware or software) that runs one or more virtual machines.  Other terms that he mentioned or introduced include KVM (Kernel-based virtual machine), Ceph (a way to offer shared storage), and MaaS, or Metal as a Service (Ubuntu), which ‘brings the language of the cloud to physical servers’.

A further bunch of mind boggling technical terms that were mentioned include ‘lightweight hyppervisors’ such as LXC (LinuX Containers), Hadoop, which is a data storage framework, and TOSCA (Wikipedia), which is an abbreviation for Topology and Orchestration Specification for Cloud Applications.  In terms of databases, some new (and NoSQL) technologies that were mentioned included MongoDB and Cassandra.

At this point, it struck me how much technologies have changed in such an incredibly short time, reminding me that we live in interesting times.

Keynote: Agile infrastructure built in OpenStack

The second keynote of the day was by John Griffith, Project Technical Lead, SolidFire.  John’s presentation had the compelling subtitle: ‘building the next generation data centre with OpenStack’.

A lot of people started using Amazon, who I understand to be the most successful public cloud provider, to use IT resources more efficiently.  There are, of course, other providers such as Google compute engine (Google), Windows Azure (Microsoft), and SoftLayer (which appears to be an IBM company).

A number of years ago, at an OU postgrad event, I overheard a discussion between two IT professionals that began with the question, ‘so, what are the latest developments in servers?’  The reply was something about server consolidation: putting multiple services on a single machine, so you can use that one machine (a physical computer or server) more efficiently.  This could be achieved by using virtual machines, but you can only do so much with virtual machines.  What happens if you run out of processing power?  You need to either get a faster machine, or move one of your virtual machines to another machine that might be under-utilised.

The next generation data centre will be multi-tenant (which means multiple customers or organisations using the same hardware), have mixed workloads (I don't really know what this means), and have shared infrastructure.  A key aspect is that an infrastructure can become software defined, as opposed to hardware defined, and the capacity of a cloud configuration or setup can change depending upon local demand.

There were a number of attributes of cloud systems.  I think there were: agility, predictability, scalability and automation.

In the cloud world applications can span many virtual machines, and data can be stored in scalable databases that are structured in many tiers.  The components (that make up a cloud installation) can be configured and managed through sets of predefined interfaces (or APIs).  I also made a note of a mobile app that can be used to manage certain OpenStack clouds.  One example of this is the Cloud mobile app from Rackspace.

Another interesting quote was, ‘[the] datacentre is one big computer and OpenStack is the operating system’.  Combining servers together has potential benefits in terms of power consumption, cooling and the server footprint.

One thing that developers need to bear in mind is how to create applications.  Another point was: consider scalability and plan for failure.  A big challenge lies with uncovering and deciphering what all the options are.  Should you use, for example, block storage services, or object storage?  What are the relative advantages and disadvantages of each?

Parts of this presentation started to demystify some of the terms that have baffled me from the start.  Cinder was, for example, is OpenStack’s block storage.  Looking outwards from the operating system, a block storage device could be a hard disk, or a USB drive.  Cinder, in effect, mimics what a hard drive looks at, and you can store stuff to a Cinder service as if it was a disk drive.  Swift is an object database where you can store object.  So, you might think of it in terms of sets of directories, the contents of which are replicated over different hard drives to ensure resilience and redundancy.

There is a difference between a service that is an abstraction to store and work with data, and how physical data is actually stored.  To make these components work with actual devices, there are a range of different plug-ins.

Presentation: vArmour

I have to admit that I found this presentation thoroughly baffling.  I had no idea what was being presented until I finally picked up on the word ‘firewall’, and the penny dropped: if a system architecture is defined in software, the notion of a firewall as a physical device suddenly becomes very old fashioned, if not a little bit quaint.

In the cloud world, it’s possible to have something a ‘software firewall’.  A term that I noted down was ‘software defined security’.  Through SDS, you can define what traffic is permissible between nodes and what isn’t, but in the ‘real world’ of physical servers, I’m assuming that physical ‘top layer’ firewalls are important too.

I also came across two new terms (or metaphors) that seem to make a bit of sense in the ‘cloud world’.  Data could, for example, move in a north-south direction, meaning it goes up and down through various layers.  If you’ve got east-west movement of data, it means you’re dealing with a situation where you might have a number of different virtual machines (that might have been created to respond to end user demand), which may share data between each other.  The question is: how do you maintain security when the nature of a configuration might dynamically change? 

Another dimension to security which crossed my mind was the need for auditability and disaster recovery, and both were subjects that were touched upon by other presenters.

In essence, I understood vArmour to be a commercial software defined security product that works akin to a firewall that can be used within a cloud system.

Presentation: The search for the cloud’s ‘God Particle’

Chris Jackson, who works for Rackspace (a company which has the tagline ‘the open cloud company’), gave the final presentation before we all broke for lunch.  Chris confessed to being a physicist (as well as a geek) and referred to research at CERN to find ‘the God particle’.  I also seem to remember him mentioning that CloudStack was used by CERN; there’s an interesting superuser case study (OpenStack website), for those who might be interested.

Here’s the question: if there is a theory that can describe the nature of matter, is there a theory that might explain why a cloud solution might not be adopted?  (He admitted that this was a bit of fun!)  He presented three different theories and asked us to vote on which were, perhaps, the most significant.

The first was: application.  Some applications can be rather fragile, and might need a lot of cosseting, whereas other forms of application might be very robust; they’re all different.  Cloud applications, it is argued, embrace chaos and build failure into applications.  Perhaps the precise character of certain applications might not lend it to being a cloud application?

Theory two: integration.  There could be the challenge of integration and connection with existing systems, which might themselves have different characteristics. 

The third theory is all about operations.  This is more about the culture of an organisation.

So, which theory is the reason why organisations don’t adopt a cloud solution?  The answer is: quite possibly all of them.

Permalink Add your comment
Share post

This blog might contain posts that are only visible to logged-in users, or where only logged-in users can comment. If you have an account on the system, please log in for full access.

Total visits to this blog: 2308323