Dr. Dobb's Digest October 2009
Dana Moore is a Division Scientist at BBN Technologies
Cloud computing is coming of age. It provides virtually unlimited computing resources in a wide variety of offerings, and it enables new business models for leveraging, packaging, and accessing computing resources. An actual example illustrates some of the cloud's potential. Franz Inc., maker of advanced Semantic Web tools, wanted to test its software product across dozens of machines and databases. The problem was that Franz had neither the machines nor the software, and obtaining them was cost prohibitive -- in excess of $100,000 -- not to mention the delays in acquisition and configuration. Yet the test was critical: it would demonstrate the product's value and provide credible differentiation from the competition. What to do? Franz opted for a cloud solution, staging its service on Amazon's S3. The company ran its complex tests in a few days and gained a wealth of insights and performance data to evaluate -- at a cost of less than $200. For Franz, cloud computing changed the competitive analysis and the understanding of performance in the large, at a fraction of the cost and time.
Cloud computing is nothing more, and nothing less, than network-accessible, on-demand computing resources. These resources can range from low-level resources such as memory and disk space to high-level resources such as access to an Oracle database, an entire CRM system, or an offloaded enterprise payroll service. Gartner sees this as two major divisions: the low end of cloud computing focuses on resources with virtualized systems, while the high end focuses on large-scale services, or Software-as-a-Service (SaaS). They are distinct, but both are part of cloud computing. The resources, selected based on your needs, are available on demand, with charges based on actual usage. You pay for what you use.
Cloud computing is very much the opposite of the typical enterprise computing expense model, which pays for a set of computing resources based on peak needs, with significant latency -- and bureaucracy -- involved in increasing those resources. Many enterprise resources stand idle waiting for the new product release or some other peak interaction period. And if your demand estimates are wrong, which they often are, you have unhappy users or unhappy accountants (or both!). Cloud computing dramatically changes the cost profile from peak cost to average cost. It also changes the allocation cycle from weeks and months to minutes and seconds. There are some dark clouds, too. There are currently no cloud standards or, for that matter, long-established cloud offerings. Confounding the dilemma, there are dozens of cloud offerings from almost every major computer vendor. Each vendor's offerings are different, some dramatically so. This reflects the nascent nature of cloud computing -- vendors are experimenting with service offerings, pricing, and the like. You can also be assured that not all will survive. And if that isn't enough, you can even establish your own internal cloud to more efficiently use your existing computing resources. No wonder there is some discomfort with this revolutionary change. But with most discomfort comes opportunity.
Put simply, cloud computing aims to better leverage computing resources, from the mundane but critical needs of memory and processing to the huge potential of massively networked applications. So where to start? First, let's acknowledge the comment most frequently made by seasoned IT managers: "This cloud computing stuff is too ephemeral, too difficult to manage and control for me. I feel that my resources are too much 'out of sight' for me and my staff." This concern is perfectly comprehensible -- if you can no longer walk behind the machine and simply pull the plug, if you can't unplug the machine from the LAN, do you still control it? But consider a crashed disk drive on your desktop PC. The data are in a sense "tangible" -- everything on your drive is sitting right there in front of you, tantalizing you, taunting you, and yet totally inaccessible. Our comfort with physicality must wane, for in many ways it has been eclipsed by the levels of abstraction and virtualization found in the modern enterprise; the only thing truly novel about cloud computing is how far it takes them. We may smile at the naivete of the example, but the dichotomy between "visible and tangible" and "remote but accessible" is perfectly human; it's in our hunter/gatherer genes to want resources at arm's length. The notion is actually rather ironic, though, given the distributed nature of our enterprise data farms and the offsite backup sites we purposely locate far from our primary data centers to assure survivability.
IT managers considering integration with or movement to cloud services divide rather neatly into two intellectual camps: one that normally thinks in terms of software as a service, and another that thinks in terms of creating platforms as a service. Gartner, Wikipedia, and others discuss these two points of view, which we will call "cloud-ists" and "compute-ists".
If you are a "cloud-ist", then you look at the phenomenon through the lens of Software-as-a-Service (SaaS): anything your IT shop can offer -- from emulating a complete operating system, to payroll services, to CRM -- resides on systems "out there" on the net, provisioned and maintained by "someone else." One quip often heard about this point of view is that cloud computing is tantamount to using OPM -- in this case, other people's machines. This camp tends not to think below the level of a specific application, or of a composite application that might be composed (the term "mashed-up" is often heard in this context) using the exposed APIs of specific applications.
If you are a "compute-ist", then you see things more or less in the mould of Platform-as-a-Service (PaaS), where PaaS is quite broad in scope. Wikipedia lists a host of capabilities under the sobriquet, including application design, development, testing, deployment, and hosting. Database facilities and security are also directly exposed in a PaaS implementation, whereas in the SaaS view they are most frequently not. You can look at Google Application Engine (GAE) as exemplifying the SaaS viewpoint and Amazon Web Services (AWS) as representative of PaaS.
Both SaaS and PaaS expect to interact with the user via lightweight clients (commonly the client is delivered entirely through the browser), but PaaS can exist either inside or external to your own data center, and not necessarily off premise. The PaaS point of view may be more common among government and academic IT managers, while the SaaS view is probably more common among smaller and newer companies, researchers, and independent software developers.
If you're a "cloud-ist", you may well have heard some impressive examples of using OPM. It is well known that the ability to accommodate sudden growth in the demand curve is a critical success factor. At this writing, one of the continuing and acknowledged problems confounding Twitter's market growth is the frequency of service outages caused by demand spikes. Balancing the need for reserve compute capacity against the need to preserve capital has confounded more than one promising startup. A well-known problem for an exciting new offering is the phenomenon of being "dugg" (mentioned on Digg's website) or "slashdotted" (mentioned on slashdot.org). This creates a wellspring of excitement (generally a good thing) and drives an immediate and inordinate amount of traffic to a site (possibly disastrous). Being able to understand the effect of flash traffic -- and, even better, to mitigate it -- is a significant differentiator between two similar services, and one which only a few years ago would have absorbed significant working capital. The Franz example above shows this at work: it illustrates the importance of understanding how your offering will respond to peaks in user demand, and of getting the data nearly free.
Additionally, IT managers from the CIO on down do not lightly move away from established approaches, even imperfect ones, when budgets would be affected, when a new approach is hard to define quantitatively, or when a concept's impact on revenue or capital preservation is not easily seen. Further, given the degree to which the term "cloud computing" has been overloaded and repurposed by various vendors, trying to sort through the arguments pro and con may prove difficult. In the following sections, we offer eight considerations that may help provide a well-rounded view.
Understanding how to put networked, on-demand computing to use -- how to stand up your service in the cloud -- is, we believe, becoming a critical success factor, a next step beyond the last great revolution in application development: the development of SaaS technologies via Agile frameworks (for example, Ruby on Rails, Python TurboGears, and the Google Web Toolkit). We suggest that, as important as using the right frameworks for software development has become in the past five years, understanding and leveraging cloud computing is even more important to market success.
What are some possible cloud scenarios? Where can you get started with maximum benefit and minimum risk? First of all, it is good simply to get started. Any solid experience could open up enterprise possibilities, and ignorance of such a critical technology will only help your competitors. Just reading articles and books does not give your team the hands-on experience needed to truly evaluate the best uses. Here I discuss three possible scenarios to get you started; each increases the risk but also provides greater value.
The easiest place to start is testing. Cloud computing offers an almost unlimited selection of configurations, distributions, and services that allow effective, exhaustive testing. Dynamic allocation can change the way you view testing and enable many test cases formerly not possible. Testing is also a good way to "get your feet wet in the clouds" without incurring any real risk.
The next area that can be quite helpful is surge management. Despite the best predictions of computing use, there are always periods of extremely high demand. If these are especially rare, you can simply wait them out. Handling some processing in the cloud lets you take advantage of dynamic resources without total dependence on them. You can adjust your surge kick-in threshold to match your comfort with the technology. This is an especially effective approach for volatile usage patterns. Ancillary to this is using the surge approach as a backup in case of failure, or as assistance to improve overall quality of service.
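The surge kick-in just described amounts to a simple routing rule: serve requests locally until a load threshold is crossed, then spill the overflow to cloud-hosted capacity. A minimal sketch follows; the capacity figure and function names are invented for illustration, not any vendor's API.

```python
# Sketch of a surge "kick-in" rule: load is served in-house until a
# threshold is crossed, and only the overflow is routed to the cloud.
# LOCAL_CAPACITY is an illustrative figure, not a real measurement.

LOCAL_CAPACITY = 100  # requests per second the in-house servers can absorb

def route(requests_per_second):
    """Split incoming load between local servers and cloud surge capacity."""
    local = min(requests_per_second, LOCAL_CAPACITY)
    cloud = max(0, requests_per_second - LOCAL_CAPACITY)
    return {"local": local, "cloud": cloud}
```

Raising `LOCAL_CAPACITY` keeps more work in-house (more comfort, less cloud dependence); lowering it shifts more of the volatile load onto pay-per-use resources.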
The final case is rapid prototyping and deployment. The resources and services are so easily assembled that you are free to experiment. This could launch a new, lucrative service that was formerly cost or time prohibitive. Interestingly enough, if success occurs but you feel the risks are too high, you can always fold some of the processing back into your own infrastructure.
One final note: a cloud does not have to be "out there." Several open source implementations allow your enterprise to form its own cloud. Technically this isn't much different, except that you hold all the data and you must manage the cloud yourself. It is worth considering if your data and processing are highly confidential. Of course, you can also blend an internal cloud seamlessly with an external one.
A cloud is a tough thing to nail down; the various cloud implementations offer quite a bit of variety. Considerations include resource types, access methods for both the cloud occupants (your applications) and the cloud user, privacy and security, quality of service, cost, and vendor lock-in.
First, let's see what the cloud has to offer. As stated above, clouds offer quite a range.
Consider using cloud-provided infrastructure in contrast to the way you currently work. Traditionally, the enterprise has had to build everything from the ground up. Every new service you roll out has forced consideration of the logistics of providing for growth in data storage capacity, number of end users, power consumption, and support personnel. The IT powerplant supporting any enterprise service offering has always had to provide for "just in case" -- for usage spikes, for variance from the average. The potential now exists to change those parameters. Unlike the traditional setup, where you are forced to estimate service provisioning for the high side of the extreme case (forcing you to acquire, maintain, and lifecycle for the peak, which is often very far from the normative use case), you are essentially costing for the average, and experiencing recurring costs reflective of that norm. In this sense, cloud architectures are very similar to "just in time" supply chains.
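The peak-versus-average economics can be made concrete with a little arithmetic. All figures below are invented for illustration: a service that must survive a 50-server peak but averages only 8 servers of actual use over the year.

```python
# Illustrative arithmetic only: provisioning in-house for the peak versus
# paying a cloud provider for average (actual) usage. Every figure here
# is a made-up example, not a quoted price.

PEAK_SERVERS = 50             # capacity needed on the worst day
AVG_SERVERS = 8               # capacity actually used, averaged over the year
COST_PER_SERVER_HOUR = 0.10   # hypothetical hourly rate, in dollars
HOURS_PER_YEAR = 24 * 365

# In-house, peak capacity must be bought (and powered) year-round.
in_house = PEAK_SERVERS * COST_PER_SERVER_HOUR * HOURS_PER_YEAR

# In the cloud, the recurring bill tracks the average.
cloud = AVG_SERVERS * COST_PER_SERVER_HOUR * HOURS_PER_YEAR
```

Under these assumed numbers the peak-provisioned plant costs $43,800 a year while the usage-based bill is $7,008 -- the same "pay for the average, not the peak" shift the text describes.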
This shift in design and deployment perspective will not only preserve working capital but also allow scaling well beyond what a peak-sized budget would permit (you can be very successful without crossing your fingers to see if you can meet the demand generated by a spate of favorable Tweets). Because of the cloud, you can change the traditional mindset almost completely -- you no longer need to worry about peak load, or to shape your business model around it with, say, a gradual rollout that lets customers trickle in; you can meet all your customer demands at once. You can also set an upper limit if you are concerned about unbounded cost (such as might occur in a Denial-of-Service attack).
Above and beyond this simple assurance of infrastructure, cloud computing offers even more: it is fast becoming a new type of Internet operating system. Operating systems control a range of computing resources so that applications can work above the fray of disk access and screen updates. Cloud computing offers computing resources that allow solutions to work above the need to allocate machines and control their communications. Amazon's cloud service offers simple raw computing power, scaled to reflect various sizes of real machines, while also offering supporting services such as distributed messaging and data persistence.
Additionally, cloud middlemen can take these raw resources and assemble them into larger offerings. Google, for example, offers a set of simple APIs that abstract away so much of the infrastructure that creating an application using its Google Application Engine facilities feels like creating a simple single-machine, local application. Consider how quickly your staff could create applications if they were able to focus 100% of their development energy on the business rules of an application, rather than spending most of the development budget accommodating the details of what happens when you actually attract a significant user base. This is the promise of GAE, which leverages an Agile language (Python) and a well-known web application development framework (Django). Fifteen minutes of exploration will begin to convince you that application development can become a matter of fast, iterative, incremental coding.
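To convey that "single-machine" feel, here is a minimal WSGI-style request handler. GAE's own webapp framework (and Django) sit atop WSGI; this generic sketch is not the GAE API itself, just an illustration of how little of the code concerns scaling -- provisioning and load all live outside the application.

```python
# A minimal WSGI application sketch. Note that nothing here mentions
# machines, load, or replication -- in a GAE-style platform, those
# concerns are the provider's, and the developer writes only the
# business logic below. Generic WSGI, not the actual GAE webapp API.

def application(environ, start_response):
    # The query string stands in for "business input"; default politely.
    name = environ.get("QUERY_STRING") or "world"
    body = f"Hello, {name}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

Any WSGI server (or the platform itself) can host this handler unchanged, which is precisely the abstraction the paragraph describes.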
Amazon, through its Elastic Compute Cloud, offers an exquisitely tweakable set of capabilities: virtual machines containing processors, memory, and storage. A cloud application can dynamically allocate and deallocate virtual machines sized from small 32-bit processors to 64-bit multicore processors, all with corresponding memory and storage. If you are just looking for storage, Amazon's Simple Storage Service accommodates that too. The storage is direct and simple: an application developer declares buckets and places data objects into them. Each bucket and object maintains a security profile that controls access.
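The bucket/object model is simple enough to sketch in a few lines. The in-memory class below mimics the concepts only -- named buckets holding named objects, each with its own access policy. It is not the S3 API (real code would go through an SDK over HTTP), and the bucket and key names are invented for the example.

```python
# An in-memory sketch of the S3 storage model described above: buckets
# hold data objects, and each object carries its own access policy.
# Concept illustration only -- not the actual Amazon S3 API.

class Bucket:
    def __init__(self, name, acl="private"):
        self.name = name
        self.acl = acl          # bucket-level access policy
        self.objects = {}       # key -> (data, per-object acl)

    def put(self, key, data, acl="private"):
        """Store an object under `key` with its own access policy."""
        self.objects[key] = (data, acl)

    def get(self, key, requester="anonymous"):
        """Return the object only if the requester may read it."""
        data, acl = self.objects[key]
        if acl == "public-read" or requester == "owner":
            return data
        raise PermissionError(f"{requester} may not read {key}")

# Hypothetical usage: one public object, one private one.
bucket = Bucket("test-results")
bucket.put("run-001.log", b"all tests passed", acl="public-read")
bucket.put("license-keys.txt", b"secret", acl="private")
```

The per-object `acl` is the "security profile" the text mentions: the public log is readable by anyone, while the private object is served only to its owner.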
So your first decision is whether these resources are useful for you. Are they the right granularity and composition to serve your special needs?
After identifying a match to your resource needs, several other concerns arise. Is the information safe? Is it protected? Again, the cloud resources above offer various levels of protection as part of the resource. Many offer protection services analogous, and even superior, to those of a well-protected enterprise computing center. After all, the providers have a strong interest in preventing intentional or accidental damage; in fact, they are probably more focused and more expert than many IT staffs. Amazon, for example, allows the encryption of almost everything, and its computing resources offer firewalls and other types of protection. Additionally, the distributed nature of the cloud allows a new type of protection, often the opposite of enterprise computing's. Traditional computing places all the computing resources in one or two highly guarded locations; cloud computing allows you to place your applications all over the world. Yes, one site may go down, but it is unlikely that the entire globe will -- and if that happens, your applications should be the least of your worries. This dispersion can also serve your users better, from where they are rather than where your servers are.
Now that you have the right resources and are satisfied with their protection, you must interface with those resources -- and there is the current rub. Each vendor's interface is unique to its offering. Tools and standards are evolving to help insulate your solutions from the actual implementations, but they are not there yet. The Open Cloud Consortium is a group of universities and businesses working on cloud standards, interoperability standards, and benchmarks. For now, you must march to nonstandard or proprietary interface methods. This need not stop you: successful cloud offerings may become de facto standards, or at least spawn a market for tools that align the standard to the offering. But of course, that depends on you picking the winners, which history has shown is not always easy.
Cost and Quality of Service (QoS) vary greatly across the many cloud implementations. Some offerings barely detail these issues; in fact, some are currently free. The importance of cost and QoS depends greatly on your particular cloud use, and on how coupled your business is to a particular cloud offering. Testing, for example, does not create the same dependence as actually running your business on the cloud.
Lastly, consider how you are taking advantage of the cloud's capabilities. Do you leverage the dynamics of allocating resources and moving them to where they are needed? Have you modularized your solution so as to make these movements efficient? Have you designed to minimize your costs given the various use charges? Will the quality of service meet your customers' expectations? Will the cloud implementation continue to advance and offer support?
Although the shock of the new often makes managers uncomfortable, the momentum behind cloud computing is unquestionable, and its impact on the enterprise seems inevitable, if as yet not well quantified. Consider what you already do know, and how to position yourself going forward:
Cloud computing offers a rich, evolving variety of on-demand resources. Already, the same forces that drove open source operating systems are at work in the cloud.
Clouds change the cost model from peak to average use, and the time model from months to minutes. Cloud computing is in its infancy: standards are lacking, and vendors may drop or change their offerings. But do not let that keep you on the sidelines. There are many scenarios that can help your team get familiar with the cloud's potential while addressing significant problems in testing and demand management. Despite initial concern with the rapid pace of the cloud revolution, the cloud holds the key to better use of your data, processing, and resources. Embracing this change, and harnessing it faster and better than your competition, will put you out in front of the bow wave of adopters and establish you as an industry innovator.