Navigating the System Virtualization Maze -- Part 2
Peter Baer Galvin
The virtualization maze is not getting any easier for systems managers, as there are options aplenty and plenty of marketing to go along with them. Last month, in part 1 of this column, I provided the first half of my map through the virtualization maze. It included discussion of the benefits of virtualization, the various types of system virtualization, and the data you need to make an informed virtualization technology decision.
As a brief review, the questions that I suggested should be answered before any virtualization decisions can be made fall into several broad categories. Here I expand the questions with information on why they are important to your overall virtualization strategy.
- What problem are you trying to solve with system virtualization? As with all projects, make sure you know the problem and that the project solves the problem. (This is obvious, except when it is forgotten.) -- Virtualization does not solve "all" system problems, and sometimes creates problems. Be sure you end up solving more problems than you create.
- What hardware needs to be virtualized (specific CPU architectures, other components)? -- Hardware requirements can eliminate some virtualization solutions.
- What operating systems need to be virtualized? -- Again, these requirements can limit the solutions available.
- What applications will run within the virtualized environments? -- Application support is one of the biggest areas of risk in system virtualization.
- What are the performance criteria of the current systems and applications? -- Virtualization decreases available resources on the target platform.
- What are the expected future performance requirements (i.e., resource use growth)? -- Future growth could mean today's perfect fit is tomorrow's can't fit.
- Do you have "good" systems administration methods in place today (e.g., patching, backups, security)? -- All forms of system virtualization end up creating more instances of operating systems or applications to administer. This could multiply current administration issues.
- How many systems, operating systems, and applications would be virtualized? -- The smaller the environment, the less worthwhile the time, effort, and money needed to virtualize it.
- What is the infrastructure around the virtualization (networking and storage)? -- Concentrating resource use into fewer systems can increase the use of external resources (like networking and storage). Is the infrastructure ready to handle increased load?
- What conversions would be needed between physical and virtualized systems (known as "P to V" conversions)? -- These answers can limit the solution choices.
- Will movement of a virtualized entity back to a physical entity (known as "V to P" conversion) be needed? -- For example, what if you want to stop using a virtualization technology?
- What are the business requirements of the resulting environment (uptime, DR, resource flexibility, supportability, manageability)? -- Without business drivers, funding is difficult to come by. If a project doesn't end up solving business needs, future projects will be more difficult to get funded.
- What are the requirements around managing resources and limiting their use? -- Some virtualization technologies allow very fine-grained resource management and limits, and some don't.
In part 2 of this column, I'll start from these data points and discuss how to analyze that information to determine the best virtualization solution for your environment. I'll also look at the darker aspects of the technologies and their deployment. By the end of this column, you should have a complete picture of the virtualization options, the choices to be made, and all of the gory details that can get you into or out of the virtualization maze.
Questions and Answers
The questions asked above are all about the problem you are trying to solve via virtualization. The answers to these questions can drive both the types of virtualization that could be brought to bear on your problem and the evaluation criteria for selecting the best virtualization solution. For example, consider this scenario:
"We have 40 older USPARC III-based Sun systems, mostly running home-grown applications and using high amounts of CPU but low amounts of disk and network I/O. We want to reduce the number of systems in our datacenter (and thus reduce power, cooling, and data center space use) and decrease system administration time. Resource growth is 10% per year, we patch systems manually, and if we virtualize these systems we shouldn't need to go back to physical systems. We have no strict uptime or DR requirements."
In this scenario, only one hardware platform (SPARC) and one operating system (Solaris) need to be virtualized. This should be a good fit for using Solaris containers on multiple, fast, modern Sun systems. But further consideration should be given to application support (do the applications run well in containers?) and how the applications would be moved to the new systems. The new systems would have to be sized to handle the current load plus 10% per year for the expected life of the facility.
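The sizing arithmetic for this scenario can be sketched in a few lines of Python. This is only an illustration: the per-system CPU figure and the five-year planning horizon are invented for the example, and it assumes the 10% growth compounds annually, which the scenario does not state.

```python
def required_capacity(current_load, annual_growth, years):
    """Return the load expected after compounding growth for `years` years."""
    return current_load * (1 + annual_growth) ** years


# Hypothetical inputs: only the system count and growth rate come from the scenario.
systems = 40
cpu_units_per_system = 1.5   # assumed average CPU demand per old system
growth = 0.10                # 10% per year, from the scenario
lifetime_years = 5           # assumed planning horizon

today = systems * cpu_units_per_system
future = required_capacity(today, growth, lifetime_years)

print(f"CPU units needed today: {today:.1f}")
print(f"CPU units needed after {lifetime_years} years: {future:.1f}")
```

Sizing to the end-of-life figure rather than today's load is what keeps today's perfect fit from becoming tomorrow's can't fit.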
Another consideration with any virtualization is downtime. If N systems are virtualized onto one, then all N are affected when the system has to be taken down. Individual operating systems or virtual environments can be rebooted independently, but a systemic failure or scheduled downtime has a much larger impact after virtualization.
Of course, there are many other scenarios, and many possible solutions. But never lose track of the option to not virtualize. As powerful a tool as virtualization is, it is not a tool that solves all problems. Throughout this column are discussions of the pros and cons of virtualization and the issues that can steer your virtualization decisions.
Other Metrics to Consider
Beyond the needs of your site, there are other metrics to weigh when considering a virtualization product. As well as the "usual suspects" of pricing, support, and licensing terms, there are some virtualization-specific aspects.
- The time and effort it takes the technology to provision a new virtual instance (operating system, application).
- Likewise, the time and effort to provision an application within that instance.
- The time and effort it takes to move an application from its physical machine to its virtual environment ("P to V").
- If applicable, the time and effort it takes to move a virtual entity between systems.
- The time it takes to boot, shut down, copy, or replicate a virtual entity.
- The non-monetary costs of virtualization, including training, deployment effort, maintenance effort, monitoring, and management effort not only for the virtualization technology but for the virtualized entities as well.
- The hardware (servers, storage, networking) supported by the virtualization technologies is usually a subset of the hardware supported by the individual applications and operating systems you want to virtualize. Compare what you have to what is supported to minimize hardware costs. The supported resources may also be limited compared to native hardware, such as the number of CPUs a virtual machine can see, or the amount of memory a virtual machine can use.
- The impact on operating system and application licensing for use within the virtualization technology. This aspect could have a major effect on the cost of the virtualization solution. Will you have to pay more for your operating systems and applications than you did when they were on separate systems? For example, an application that is licensed per-CPU could end up requiring all of the CPUs in the system to be licensed, rather than just those dedicated to the application's virtual environment.
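To make the per-CPU licensing concern in the last item concrete, here is a minimal Python sketch. The price and CPU counts are hypothetical, and actual terms depend entirely on the application vendor.

```python
def per_cpu_license_cost(cpus_licensed, price_per_cpu):
    """Cost of a per-CPU license covering `cpus_licensed` CPUs."""
    return cpus_licensed * price_per_cpu


# All figures below are invented for illustration.
price_per_cpu = 10_000   # hypothetical list price per CPU
app_cpus = 2             # CPUs the application actually uses
host_cpus = 16           # CPUs in the consolidated virtualization host

standalone = per_cpu_license_cost(app_cpus, price_per_cpu)
whole_host = per_cpu_license_cost(host_cpus, price_per_cpu)

print(f"Licensed for its own CPUs only: ${standalone:,}")
print(f"Licensed for every CPU in the host: ${whole_host:,}")
print(f"Cost multiplier: {whole_host / standalone:.0f}x")
```

If the vendor licenses by the CPUs in the physical host, consolidation onto a larger machine can raise the license bill even though the application uses no more CPU than before.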
The Short List
Once your needs and priorities have been cataloged, you can start considering virtualization technologies to evaluate. Here are some of these technologies, and some broad categories of use:
- VMware -- The leader in x86-based system virtualization.
- Xen -- Open source, multi-vendor initiative that will result in native virtualization being included in the target operating systems (including some Linux releases, and Solaris).
- Solaris Containers -- Solaris-only application environment virtualization.
- Solaris LDoms -- Logical domains allowing UltraSPARC T1-chip based systems to be split into multiple hardware domains, each running its own operating system.
Beyond the short list there are many other options worth considering, including KVM (an open source solution being added to Linux kernels), Virtual Iron, SWsoft Virtuozzo 3.0, Scalent Virtual Operating Environment 2.0, Altiris Software Virtualization Solution 2.0, and Surgient Virtual QA/Test Management System 5.0.
Virtualization Planning and Deployment Best Practices
Now that you've evaluated all of the requirements, analyzed the metrics, and chosen one or more virtualization technologies, consider these best-practice recommendations when planning and implementing your deployment.
Contact the vendor of each application being considered for virtualization and determine their stance on licensing and support. Both of these are potentially huge issues. Licensing costs could escalate, but support issues are more likely. For example, what if your site encounters a problem with an application running in a virtualized environment? Will the application vendor provide support for the problem? There are two typical scenarios:
1. Ignore the issue -- Many sites, when calling a vendor with a problem, will not mention the virtualization of the application. They hope the issue is not virtualization related, which is usually the case. But if it is virtualization related, what is to be done then? Contacting the virtualization vendor for support is the logical next step.
2. Be prepared for V to P -- Convert the application back from a virtual environment to a physical one and then recreate the problem. For this, your site will need spare hardware available that can be dedicated to applications being debugged. If this is your chosen method, then the V to P features of the virtualization solution are very important.
If virtualization is part of your disaster recovery/business continuance planning, then your network architecture will play a large role in ease of DR deployment and testing. Consider the pain of getting your users talking to your applications, and your applications talking to each other, if you have different VLANs at your production and DR sites. Much better for DR deployment is to have a trunked VLAN network that allows the same VLANs to be at both sites. Then, at a time of DR testing or failover, all of your systems and applications can keep their current IP addresses rather than having to be renumbered.
Sooner or later you are likely to want to move virtual entities between physical systems. The easiest way to enable that movement is to have the virtual entities on shared storage. So, consider deploying the virtualization using a SAN, NFS, or iSCSI (depending on what the virtualization technology supports and what your infrastructure has) rather than on local disks.
Knowing that virtualization is not "free" in terms of resource use, analyze your current systems and add some overhead for the virtualization when sizing the systems to use as virtualization hosts. Similarly, know the types of resources that the applications being virtualized use. The resource use of virtualization varies greatly not just depending on the technology used but also on the resource in question. I/O virtualization is usually more "expensive" than CPU virtualization, while memory virtualization overhead is moderate. Generally, prime candidates for virtualization are low resource users, followed by CPU, then memory, and finally I/O users. For example, it is common for a company to virtualize its low resource use utility servers and much rarer to virtualize big, busy databases. Generally, the more resources a system or application currently uses, the worse a candidate for virtualization it is.
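The candidate ordering described above (low resource users first, then CPU-heavy, then memory-heavy, then I/O-heavy) can be expressed as a small ranking sketch. The 20% "low use" threshold, the host names, and the utilization figures are all assumptions made for illustration.

```python
# Preference order taken from the text: low users first, I/O-heavy systems last.
PREFERENCE = {"low": 0, "cpu": 1, "memory": 2, "io": 3}


def dominant_resource(util, low_threshold=0.2):
    """Classify a system by its most heavily used resource.

    `util` maps "cpu", "memory", and "io" to average utilization (0.0 to 1.0).
    A system whose busiest resource is below `low_threshold` counts as "low".
    """
    resource, level = max(util.items(), key=lambda item: item[1])
    return "low" if level < low_threshold else resource


def rank_candidates(systems):
    """Return system names ordered from best to worst virtualization candidate."""
    return sorted(systems, key=lambda name: PREFERENCE[dominant_resource(systems[name])])


# Hypothetical measurements for four servers.
systems = {
    "db01": {"cpu": 0.50, "memory": 0.60, "io": 0.90},
    "dns01": {"cpu": 0.05, "memory": 0.10, "io": 0.02},
    "calc01": {"cpu": 0.80, "memory": 0.30, "io": 0.10},
    "cache01": {"cpu": 0.20, "memory": 0.85, "io": 0.15},
}

for name in rank_candidates(systems):
    print(name, dominant_resource(systems[name]))
```

A real assessment would weigh more than one dominant resource, but even a coarse ranking like this helps decide which systems to virtualize first.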
Learning the theory and implementation details behind the virtualization technologies can be a great help in staying away from problem situations. For example, if the virtualization technology needs to intercept I/O calls from the operating system, re-execute the I/O call to the hardware, and shepherd the I/O results back to the virtualized entity, then the I/O performance will be well below that of native hardware.
When considering virtualizing multiple systems onto one, don't just look at the long-term resource use of the source systems. Those systems could run at low resource use except for brief periods when they use all of the system resources. If those peak resource uses happen all at the same time, putting those systems together onto one could be a resource use disaster. So, measure and analyze not just long-term capacity use and growth but short-term resource use as well.
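The difference between long-term averages and coincident peaks is easy to check once you have time-aligned utilization samples. The Python sketch below uses invented hourly CPU percentages for three hypothetical systems and assumes every series covers the same sample times.

```python
def sum_of_averages(samples_by_system):
    """Add up each system's long-term average utilization."""
    return sum(sum(v) / len(v) for v in samples_by_system.values())


def combined_peak(samples_by_system):
    """Highest total utilization in any single sample period.

    Assumes every system's list covers the same, time-aligned sample periods.
    """
    return max(sum(values) for values in zip(*samples_by_system.values()))


# Hypothetical CPU utilization (percent of one host) at four sample times.
samples = {
    "app01": [10, 10, 90, 10],
    "app02": [10, 10, 80, 10],
    "app03": [70, 10, 10, 10],
}

print(f"Sum of averages: {sum_of_averages(samples):.1f}%")
print(f"Combined peak:   {combined_peak(samples)}%")
```

In this made-up data the averages suggest all three systems fit comfortably on one host, while the third sample period shows two of them peaking together and demanding far more than one host can supply.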
Consider that many virtualization technologies cause the virtual entities to suffer from "clock drift" -- the time in a virtual environment can vary much more than the time on a non-virtualized machine. If your systems or applications need very accurate time, consider using clock synchronization technologies such as NTP.
Compare the limitations imposed by the virtualization technology to your site's requirements. Does the virtualization technology support sufficient virtual entity resources (such as CPU, memory, disk, and networking) for your current and future needs?
When deciding whether to virtualize, and how to virtualize, consider not just the day-to-day operations but also the edge cases in your environment. Do you have periodic patch cycles? How will those play out when multiple virtual machines are on one physical machine? How will backups, and the even more important restores, work?
Finally, if your site is concerned with security, consider the security implications of virtualization. Monitoring and firewalling inter-system communication is relatively easy, but what if those "systems" are all on one system and the traffic between those systems no longer moves across the network? How could you firewall systems that in the future would be virtual machines on the same physical machine? Once your environment is virtualized, you not only need to worry about the security of the operating systems, applications, storage, backups, and the like but also about the security of the virtualization technology itself. Consider the ramifications of an intruder breaking into your virtualization management platform.
The Future of Virtualization
In the future, application deployment could move from its current state of time-consuming complexity to a simple plug-and-play model. Some steps in this direction have already been taken. Consider the VMware player and the "virtual appliances" that can run in it. These appliances are pre-built images of installed applications (typically built by the company that wrote the application).
Deploying one of these appliances (and thus an application in an appliance) is a simple matter of downloading an image and loading it into VMware. If all application vendors moved to this model, and if there were a universal virtual appliance format to allow appliances to run in the myriad virtualization environments, the world of systems administration would be a much happier place. Applications would be hardware independent and trivial to move, upgrade, patch, and so on. The need for hardware/software appliances to ease deployment and management would greatly diminish as software appliances become the mainstream solution. Granted, these are big "ifs", but there are signs that such a future is possible.
Another future direction that at this point seems certain is the integration of virtualization into mainstream operating systems. The Xen community and its many branches have a good start on this, and the project appears to be headed toward being a de facto virtualization standard on Unix/Linux systems. Red Hat, SUSE, and Solaris are all in the midst of integrating Xen into their kernels. SUSE has already released SUSE Linux Enterprise 10, which includes integrated Xen, but overall Xen integration into kernels is in its early stages.
More hardware support for virtualization is a certainty as CPU vendors vie for the virtualization business. Sun has been using partitioning for a long time but recently added LDoms on the T1 CPU. Both Intel and AMD have announced future improvements to their CPUs to accelerate virtualization. The next step in this process appears to be "nested page tables", which will allow the system to maintain a per-virtual-machine page table view. This should greatly speed memory management. Beyond that step, it appears that support for accelerated I/O virtualization will be added.
If you are currently using Solaris 10, you should consider adding at least one zone to each Solaris 10 system you are deploying, then deploy all applications and build all user accounts inside that zone. Be sure that the applications are supported within a zone, as not all are. The result of this approach is that security and manageability are increased on that system, and because of the forthcoming mobility of zones, you'll be able to detach that zone from one system and attach it to another for upgrades and many other purposes. Of course, having the zone on shared storage will empower these activities.
Peter Baer Galvin (http://www.galvin.info) is the Chief Technologist for Corporate Technologies (http://www.cptech.com), a premier systems integrator and VAR. Before that, Peter was the systems manager for Brown University's Computer Science Department. He has written articles for Byte and other magazines, and previously wrote Pete's Wicked World, the security column, and Pete's Super Systems, the systems management column for Unix Insider (http://www.unixinsider.com). Peter is coauthor of the Operating Systems Concepts and Applied Operating Systems Concepts textbooks. As a consultant and trainer, Peter has taught tutorials and given talks on security and systems administration worldwide. Peter now maintains a blog at: http://pbgalvin.wordpress.com.