Automation is the Cloud
Posted by Brandon Burton on October 6th, 2009, filed in technology
This post was written for and originally appears on my employer’s blog.
Over the last year there has been a lot of buzz and press coverage, along with many product offerings, that all have something about “The Cloud” in them.
I think it is important that as technologists and sysadmins, we do what we can to bring clarity to what “The Cloud” is and how it affects and benefits you.
This is my take on what “The Cloud” is and my attempt to bring some clarity to the discussion.
Automation is the underlying foundation of the revolution (some say disruption) in Information Technology that is currently happening and has been branded as cloud computing.
Automation is defined as 1:
the technique, method, or system of operating or controlling a process by highly automatic means, as by electronic devices, reducing human intervention to a minimum.
Within IT specifically, this is applied as the development of in-house tools or the adoption of commercial tools that automate processes which were traditionally done manually.
This automation, coupled with recent developments in technologies such as virtualization, has allowed infrastructure to be abstracted from physical hardware, storage, and network ties and to be managed in a highly efficient and scalable manner.
The application of automation takes place at all levels in the infrastructure stack.
Let’s take a look at how automation is realized at some of the levels within the infrastructure stack and highlight some specific tools used for automation at each level.
Automation of…Virtualization
Virtualization has been a cornerstone of the cloud computing revolution. Its introduction and maturation have allowed server resource creation and management to be abstracted from ties to physical hardware, storage, and network resources. The maturation of products like VMware’s ESX Server and the greater vSphere ecosystem, Citrix XenServer, and Microsoft’s Hyper-V has introduced both features and toolsets that allow you to quickly and easily distribute your servers across a pool of resources, re-distribute them based on overall resource utilization, enable automatic failover of servers to new physical resources in the event of hardware failure, and consolidate workloads for greater efficiency and use of physical resources.
Without the abstraction that virtualization provides and the tools that automate its advanced features and capabilities, such as intelligent resource pooling and automated failover, these changes in infrastructure practices would not have come about so easily.
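To make the idea of pooling and automated failover a little more concrete, here is a minimal sketch in Python. It is not how vSphere, XenServer, or Hyper-V actually schedule workloads; the `Host` class, the `place` and `fail_over` functions, and the "most free memory wins" policy are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A physical host in the pool (hypothetical model, memory only)."""
    name: str
    capacity_gb: int
    vms: dict = field(default_factory=dict)  # VM name -> memory in GB

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.vms.values())


def place(vm: str, mem_gb: int, hosts: list) -> Host:
    """Put a VM on the host with the most free memory (a deliberately naive policy)."""
    candidates = [h for h in hosts if h.free_gb >= mem_gb]
    if not candidates:
        raise RuntimeError(f"no host has {mem_gb} GB free for {vm}")
    target = max(candidates, key=lambda h: h.free_gb)
    target.vms[vm] = mem_gb
    return target


def fail_over(failed: Host, hosts: list) -> dict:
    """Restart every VM from a failed host on the surviving hosts."""
    survivors = [h for h in hosts if h is not failed]
    moved = {}
    # Largest VMs first, so they are placed while there is still room.
    for vm, mem_gb in sorted(failed.vms.items(), key=lambda kv: -kv[1]):
        moved[vm] = place(vm, mem_gb, survivors).name
    failed.vms.clear()
    return moved


if __name__ == "__main__":
    pool = [Host("host-a", 32), Host("host-b", 32), Host("host-c", 16)]
    for name, mem in [("web1", 8), ("web2", 8), ("db1", 16)]:
        print(name, "->", place(name, mem, pool).name)
    print("failover:", fail_over(pool[0], pool))
```

The point is only that once servers are decoupled from hardware, placement and recovery become decisions a program can make, rather than a person with a rack key.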
Automation of…Networking
The realm of networking is one of the last to become fully virtualized and automated. The major virtualization platforms all have virtualized networking within the hypervisor and within the pools or clusters you can build from them. But this has always been fragile at best.
Your switch fabric, VLANs, routers, and such have all needed to be configured to meet every possibility that may exist as you move around, grow, or shrink existing workloads and create new ones. This area has been partially automated at best, and such automation is usually “duct tape and baling twine.”
Cisco’s new Unified Computing System and specifically their new Nexus line of switches have changed this, as they built deep integration into the switch and routing fabric, though this integration is limited to VMware vSphere solutions at this point in time. This deep integration enables easier automation at higher levels within the stack.
Automation of…Operating System Deployments
So we have a virtualized pool of resources and network resources which are adapting to this idea of cloud computing. But to utilize this we need to have operating systems on which we can run our workloads. If we can’t quickly provision new operating systems or refresh existing ones, and do it in a highly automated fashion, then we can’t achieve the elasticity in our computing that cloud computing promises.
A number of existing deployment tool chains are being adapted to enable automation within the cloud at this layer. They range from tools like SystemImager (for Unix-like operating systems) and Windows Deployment Services (for operating systems developed in Redmond), which focus on deployment from a catalog of pre-configured and templatized “golden images”, to solutions like Cobbler (which manages Kickstart, PXE, DHCP, and DNS for Red Hat derived Linux distributions), FAI (for Debian derived Linux distributions), or Kickseed (which bridges the Kickstart and Ubuntu worlds), which focus on a “built from source” approach where the minimal OS install is handled in real time (every time) and additional configuration is done through “post install” scripts or handed off to a configuration management solution. These tools all have API/automation hooks to one degree or another or are highly scriptable, and so are a crucial piece in the automation stack.
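As a rough illustration of the “built from source” approach, the sketch below renders a per-machine install profile from a template and hands off to configuration management in a post-install step. The profile is Kickstart-style but is neither complete nor validated, and the `render_profile` function and the bootstrap script path are hypothetical names invented for this example, not part of Cobbler or any other tool mentioned above.

```python
from string import Template

# Kickstart-style profile; illustrative only, not a complete or validated file.
PROFILE = Template("""\
lang en_US.UTF-8
timezone ${timezone}
network --bootproto=dhcp --hostname=${hostname}

%packages
${packages}
%end

%post
${post_command}
%end
""")


def render_profile(hostname, packages, post_command, timezone="Etc/UTC"):
    """Fill in the per-machine values; everything else comes from the template."""
    return PROFILE.substitute(
        hostname=hostname,
        timezone=timezone,
        packages="\n".join(sorted(packages)),
        post_command=post_command,
    )


if __name__ == "__main__":
    # The post-install command is a made-up path standing in for whatever
    # hands the machine over to your configuration management tool.
    print(render_profile(
        hostname="web1.example.com",
        packages=["openssh-server", "curl"],
        post_command="/usr/local/sbin/bootstrap-config-management",
    ))
```

Because the profile is generated rather than hand-edited, provisioning the hundredth server is the same function call as provisioning the first.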
Automation of…Configuration Management
Once we have a working base operating system, we need to populate it with the environment our workload will run in. This area, known as (server) configuration management, is a “hot topic” these days. A number of startups are based solely around open source config management solutions, and I think most cutting edge sysadmins would agree that this is where the buzz is.
Configuration management tools provide you with the ability to “declare” the desired state of your server environment in a terse and logical manner, and they automate away the particulars of how this is achieved, so you don’t have to worry about the mechanics of adding a user on CentOS vs. Solaris or installing your LAMP or Rails stack on various Linux distributions.
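The “declare the desired state and let the tool work out the rest” idea can be sketched in a few lines of Python. This is not Puppet or Chef syntax, and it only touches an in-memory dictionary rather than a real system; the `User` class and the `plan` and `converge` functions are assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Desired state of one account (hypothetical, deliberately tiny)."""
    name: str
    shell: str = "/bin/bash"


def plan(desired, current):
    """Compare desired state with current state and return the actions needed."""
    actions = []
    for name, user in sorted(desired.items()):
        if name not in current:
            actions.append(("create", user))
        elif current[name] != user:
            actions.append(("modify", user))
    return actions


def converge(desired, current):
    """Apply the plan to an in-memory 'system' and return what was done."""
    actions = plan(desired, current)
    for _verb, user in actions:
        current[user.name] = user
    return actions


if __name__ == "__main__":
    desired = {"deploy": User("deploy"), "app": User("app", "/bin/sh")}
    system = {"app": User("app")}
    print(converge(desired, system))  # first run makes changes
    print(converge(desired, system))  # second run has nothing left to do
```

Running `converge` twice is the interesting part: the second run does nothing, which is the property that lets you apply the same declaration to every server, as often as you like.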
The most talked about tools in this arena are Puppet and Chef, with Bcfg2 and Cfengine being very widely used, but considered “old hat” by many. There are other less well known tools, such as the very Rails focused Sprinkle, which, amazingly, has an experimental Windows port being developed.
Unfortunately for you Windows admins, pretty much all of these tools are built for Linux/*nix like operating systems and that is where their development appears to be heavily focused for the foreseeable future.
As you may have noticed, the popular tools in this area tend to be written in Ruby.
All in all, this layer in the stack is very hot and is easily the most vibrant and highly automated at this time.
Automation of…Application Deployment
A little less “hot”, but nearly as heavily automated, is application deployment. This is the tool in your toolset that handles deploying your application to the environment you’ve previously built and configured. Typically this tool will do things such as:
- Check out your application from your revision control system
- Do any setup/data population to bootstrap your application
- Perform the necessary config changes to allow your application to run
- Perform some series of automated tests to confirm your application is ready to be put into production
Tools to accomplish this include Capistrano (Ruby), Vlad the Deployer (Ruby), Fabric (Python), and ControlTier (Java). With the exception of the last one, these tools run on Linux/*nix and at best have experimental support for Windows via Cygwin. ControlTier appears to be a first class citizen in both worlds and has my interest because of this.
This is, again, a very highly automated layer and is at the forefront of the automation that is driving cloud computing.
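The four deployment steps listed above can be sketched as an ordered pipeline that stops at the first failure. This is not how Capistrano, Fabric, or ControlTier are implemented; the `deploy` and `build_steps` functions are hypothetical, and the steps only append to a log where a real tool would shell out or talk to servers.

```python
def deploy(steps):
    """Run (name, callable) steps in order and stop at the first failure."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            return completed, f"{name} failed: {exc}"
        completed.append(name)
    return completed, None


def build_steps(log, tests_pass=True):
    """Placeholder steps; a real tool would do actual work in each one."""
    def run_tests():
        if not tests_pass:
            raise RuntimeError("smoke test returned an error")
        log.append("tests passed")

    return [
        ("checkout", lambda: log.append("checked out application")),
        ("bootstrap", lambda: log.append("loaded seed data")),
        ("configure", lambda: log.append("wrote config")),
        ("test", run_tests),
    ]


if __name__ == "__main__":
    print(deploy(build_steps([])))
    print(deploy(build_steps([], tests_pass=False)))
```

Keeping the automated tests as the last step means a failed check leaves you with a clear record of which steps completed, and nothing gets declared ready for production by accident.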
Automation of…Everything Else/The Whole Thing/Scripting
Given that no part of the stack is completely automated, that some tools bleed over into multiple layers, and that, of course, not all the parts of your infrastructure will have tools developed for them, there exists a need to do your own scripting/automation.
You can, of course, build your own tools using scripting languages such as Ruby, Python, Perl, and PowerShell.
There are also libraries that try to wrap and automate parts of the stack, often with a focus on the various public cloud computing offerings. These tools include scalr, poolparty, libcloud, and deltacloud. Most of these are done in Ruby or Python. Starting to see a trend?
This is an area ripe for growth, and no doubt you’ll find yourself using one of these libraries to build automation tools or implementing your own from scratch in one of the languages these existing tools use.
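To give a feel for what such a wrapper library does, here is a minimal sketch of a provider-neutral interface with a fake in-memory backend. It is explicitly not the API of scalr, poolparty, libcloud, or deltacloud; `ComputeDriver`, `InMemoryDriver`, and `scale_to` are names invented for this example.

```python
import itertools
from abc import ABC, abstractmethod


class ComputeDriver(ABC):
    """What our own tools code against (hypothetical interface)."""

    @abstractmethod
    def create_node(self, name: str, size: str) -> str: ...

    @abstractmethod
    def destroy_node(self, node_id: str) -> None: ...

    @abstractmethod
    def list_nodes(self) -> dict: ...


class InMemoryDriver(ComputeDriver):
    """Stand-in for a real provider; it only tracks nodes in a dict."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._ids = itertools.count(1)
        self._nodes = {}

    def create_node(self, name, size):
        node_id = f"{self._prefix}-{next(self._ids)}"
        self._nodes[node_id] = {"name": name, "size": size}
        return node_id

    def destroy_node(self, node_id):
        del self._nodes[node_id]

    def list_nodes(self):
        return dict(self._nodes)  # copy, so callers cannot mutate our state


def scale_to(driver, role, count, size="small"):
    """Grow or shrink the nodes for `role` until exactly `count` exist."""
    mine = [node_id for node_id, node in driver.list_nodes().items()
            if node["name"].startswith(role + "-")]
    for i in range(len(mine), count):
        mine.append(driver.create_node(f"{role}-{i}", size))
    for node_id in mine[count:]:
        driver.destroy_node(node_id)
    return mine[:count]


if __name__ == "__main__":
    cloud = InMemoryDriver("fake")
    print(scale_to(cloud, "web", 3))
    print(scale_to(cloud, "web", 1))
```

Because `scale_to` only knows about the interface, swapping the fake driver for one that talks to a real provider would not change the automation built on top of it.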
And finally…API!
Ultimately, to truly be cloud computing, you need to be able to consume APIs. That is what all this chatter about automation needs to result in: good automated tools that provide APIs for the end user (be it a sysadmin or developer or even a tool user) to consume. This is why Amazon’s Web Services are hailed as the canonical example of cloud computing: to use them, you consume APIs or utilize toolsets which in turn use those APIs.
If you are working on helping to mold a private cloud infrastructure or are working on building a public cloud computing offering, open source or commercial, remember the mantra of “Automation! API!”
I hope this has given you some insight into what lies behind the cloud and some directions in which to learn more about how you can tap into or build your own cloud computing.
October 6th, 2009 at 11:19 am
You should check out Nolio’s Automation solution for the application and “everything else”.
October 6th, 2009 at 12:12 pm
Over on the lopsa-discuss mailing list, a fellow sysadmin pointed out a solution for deploying Solaris, http://www.sun.com/bigadmin/content/jet/
October 6th, 2009 at 5:43 pm
Mate, we are doing something similar with continuous integration over at Atlassian; where we have coupled virtualisation with automated deployments for the product teams so that their software can be tested on all the combinations of supported software and operating systems.
I like the writeup as it gives people a good overview of what to expect and what tools are available to get the job done.
I have to concur with the lack of configuration management for Win32 environments. This is compounded by most software not having a well documented method of performing silent/unattended installations.
Good stuff.
October 6th, 2009 at 6:25 pm
Thanks for the comment. You guys make excellent tools. A number of our customers, including the guys behind the Dojo Toolkit, use them to great success.
Cheers.
October 6th, 2009 at 6:28 pm
Also, my friend and fellow LOPSA.org member, Luke Crawford, posted a great response over at http://wiki.xen.prgmr.com/xenophobia/2009/10/automation-is-the-cloud.html
October 12th, 2009 at 11:25 pm
Hi,
Nice write up. When talking about development automation, there are also the concept and tools of continuous integration (http://en.wikipedia.org/wiki/Continuous_integration).
October 29th, 2009 at 4:54 am
Great article! I’m currently writing a tutorial series on Puppet for those who want to get started with Linux automation but aren’t sure how:
http://bitfieldconsulting.com/puppet-tutorial
Your post is a great summary of reasons why us sysadmins need to learn this stuff.