Dr. Dobb's Journal August 2006

Linux for Corporations

by Ed Nisley

Ed is an EE and author in Poughkeepssie, NY. Contact him at ed.nisley@ieee.org with "Dr Dobb's" in the subject to avoid spam filters.

The clearest indication that LinuxWorld Expo, held last April in Boston, isn't a gathering for programmers was the decaf soda offered during the breaks. Even the caffeinated soda floating in the tubs was ordinary Pepsi rather than full-metal-jacket coding-session fuel.

Perhaps that's just the difference between corporate development practices and the Other Side?

As with last summer's Linux Symposium, the overall message was that the Linux kernel and its Free/Open-Source Software entourage has gone mainstream. Unlike the early, heady days, strong corporate support and participation is now pushing the code into large data centers and office deployments. This need not be A Bad Thing, but it does invalidate some widespread assumptions about what's important to the coders.

Even though blade servers may seem disconnected from the world of embedded programming, there's an interesting link that hasn't gotten a lot of attention. Here's how some of it played out in Boston.

The Blade of Economics

The notion that an operating system should mediate hardware access among myriad application programs has recently collided with Moore's Law. Standard data-center practice now runs a single application program, perhaps a web or database server, on a single CPU, with the OS relegated to providing memory management and I/O control. Although the system may have an assortment of daemons running in the background, the hardware is dedicated to a single application that can use the entire CPU without contention.

While Moore's Law makes those CPUs cheap and readily available, the real justification comes from reduced system administration effort. Configuring, verifying, troubleshooting, and maintaining a nontrivial server application requires so much effort that the complication of running multiple servers on a single system just isn't worth it, particularly when the underlying hardware occupies a single blade in a server rack.

Most servers have an average load far smaller than their peak load, with numbers at the conference suggesting 10 to 20 percent as typical. Because the hardware literally has nothing else to do, the OS spends much of its time doing nothing at a few billion instructions per second. A corporate data center seems not the place for idle-task apps that fold proteins, crack prime numbers, or search for signs of intelligent life.

That much idle hardware may not have been desirable, but it became indefensible when large data centers collided with another economic reality—the power company. It turns out that electrical power is now the largest expense for many data centers, outstripping even the amortized cost of the server hardware. The total electrical load for large-scale centers is in the multimegawatt range, with essentially all of that power becoming heat that must be removed by chilled-air handlers. In fact, access to power and cooling may limit the number of racks a data center can reload with next-hardware generation.

Here's a useful number: With electricity priced at $0.12 per kilowatt-hour, an always-on device dissipating 1 W costs $1 per year. That wall-wart cell-phone charger that you leave plugged in under your desk costs five bucks a year and your fancy LCD panel burns eight bucks a year when it's turned off. Run your own numbers and see, but you're spending closer to a buck a watt a year than you might think.

On the high end, a 24/7 data center with a megawatt of servers and chillers uses a megabuck of electrical power every year. There may be a bulk discount, but any bean counter worth her salt will object to having only a fifth of that tab produce billable results.

So, if you can't run more than one OS per blade and more than one application per OS, can't get more than 20 percent hardware utilization, and can't afford to waste megabucks on idle megawatts, what can you do?

The answer seems to be virtualization—time-sharing entire operating systems on a single hardware CPU. Sounds crazy, but it's basically the only way out right now.

Virtualization

Operating systems present a more-or-less standardized interface to programs that run on them, so that those programs need not worry about grisly hardware details. The ideal situation allows an unchanged program to run on different hardware mediated through the OS. In practice, it never works quite that well, but the notion of an abstract hardware interface is well ingrained these days.

Virtualization takes this to the limit by emulating the underlying hardware, with the result that a program written for one hardware architecture can run on an entirely different system. Aficionados of old video games can run the original, unchanged code on emulators that simulate the CPU and its peripheral devices with sufficient speed that the programs actually run faster than on the original game box.

Performance gains like that aren't generally the case, particularly when emulating hardware that's roughly as complex as the actual CPU running the emulator. For example, an 8-bit Z80 microcontroller emulator runs at blistering speed on any current x86 CPU, but trying to emulate, say, a 32-bit Blackfin DSP just isn't a productive use of anyone's time.

The Intel 80286 CPU introduced a "Virtual 8086" mode that could run unmodified 8086 programs at essentially full speed. The trick was to run nearly all instructions on the actual hardware, with a supervisor program gaining control on hardware interrupts and system calls. Because the existing 8086 instruction set was a proper subset of the new 80286 architecture, that approach worked surprisingly well.

Several talks described the Xen hypervisor, which emulates an x86 CPU using a technique called "paravirtualization" to create a virtual machine (VM) that looks precisely like the underlying hardware. The operating system within that VM must be aware of (and cooperate with) the hypervisor, but user programs run at essentially full speed because those instructions execute without software intervention. The net performance hit seems to be a few percent at worst, far better than a pure software emulator that must grind out the results of every instruction.

The hypervisor gains control on timer ticks, hardware interrupts, and some system calls, handles the situation, and returns control back to the VM. If this sounds like a rudimentary operating system, that's precisely what it is, except that it's maintaining the state of a simulated hardware environment rather than controlling user programs.

Of course, after you get one virtual machine running, the hypervisor can juggle several VMs on the same CPU hardware by time-sharing the hardware, handling global memory allocation, and enforcing isolation. I/O passes through the hypervisor, which must present the dual illusions that the VMs are in control of the I/O and that the I/O goes directly to the VMs.

Suppose you start with four blades running single applications at 20 percent utilization. Set up a hypervisor on one blade, create four VMs that timeshare the underlying hardware, and suddenly you have one blade running at 80+ percent utilization and three idle blades.

Shazam! Dispose of 75 percent of your hardware or find something remunerative with all that spare capacity.

VMs do not come up smelling entirely of roses. Funneling sufficient I/O through a single hardware server, providing enough physical memory, and sysadmin duties on the OS image within each VM can be a challenge. However, the economics seem to be compelling enough to attract plenty of high-level attention.

Hold the thought of an entire OS running within a VM for a moment.

Digital Rights Management

A talk by Jeff Ayers of RealNetworks posed the no-doubt rhetorical question of "Linux DRM: Possible or Oxymoron?" and enumerated some of the technical points required to make DRM workable in an Open Source environment.

He pointed out that movie companies take the long view: Gone With The Wind has commercial value 67 after it first lit the silver screen. At the same time, current DRM can only prevent casual piracy and fair use by the likes of you and me, because the factories that churn out bootleg DVDs have sufficient resources to crack any encryption in short order.

The central issue, as I see it, is that the rights for a work have come to be strictly those of the originator, not the purchaser or user. As a result, the whole notion of DRM boils down to preventing access to the protected bits, regardless of the purchaser's rights, while simultaneously allowing access to the bits for specific uses. This isn't going to work, as those factories show, but that hasn't stopped a whole bunch of smart folks from trying.

The overall DRM process seems straightforward enough: Encrypt a set of bits so that only an authorized program can decrypt them. The devil, as always, is in the details, as the Sony rootkit debacle demonstrated.

As Ayers explained, ensuring that a program (or music or book or whatever) remains protected requires running the decryption program on an authorized platform. That platform, both hardware and software, must have a known configuration, which requires some sort of central certification.

Unfortunately, that means typical Free/Open-Source Software simply cannot be allowed to run, because it could readily circumvent the DRM by meddling with the decryption routines. If, however, you regard Open Source as just a cooperative development methodology rather than a component of Free Software philosophy, then you can certainly develop a system that will prevent users from recompiling any code without DRM routines. Asserting that community developers (that is, those who aren't getting paid to produce DRM routines) would participate in such a project seems, at best, a dubious proposition.

This view of Open Source DRM is TiVo written large: You can recompile the source, but your modified version won't run on your hardware without the manufacturer's authorization, which you're certainly not going to get. You may be able to do other things with the hardware, but not run any code that decrypts DRM-protected bits.

The alert reader will notice the resemblance between this view of DRM and the whole Trusted Computing initiative I described in January: The hardware verifies the integrity of the software before booting, then each software stage verifies the next before yielding control to it. Only approved software may use the hardware, so all the bits remains solidly locked down.

Hold that thought, too.

Game Boots

The world of game consoles is rife with copy protection that starts with firmware buried in the system board's chipset. Defeating that DRM has become something of a cottage industry, primarily in places where the DMCA lacks teeth, and the bottom line seems to be that each iteration of copy protection lasts for, at most, a few months.

A press release folder from Gamix, the Open Game Platform, appeared in the pressroom. They didn't seem to be exhibiting or presenting anything at the Expo and an e-mail didn't elicit a reply, but the general notion seems to be that they want to become the certifying authority for a set of specs that define a game console based on more-or-less standard, albeit low-end, PC hardware. If you want to manufacture hardware or software to that spec, you can buy the rights to use their compatibility logo so purchasers will know everything will work together.

The games arrive on a bootable CD or DVD that runs on the predefined hardware, with complete control in the hands of the game programmer. The demo CD boots Mandriva's LiveCD and runs dedicated versions of SuperTux and NeverBall. That demonstrates the concept, even if they're not really state-of-the-art games.

I'm unsure of the viability of Gamix's business plan, not least because the success of any particular game depends less on its platform than the marketing hoopla (and money) behind it. In any event, you can release a self-booting game right now and be reasonably sure it will run on any current PC. What value the Gamix marque brings to that process remains to be seen.

What value Linux contributes is obvious, though. You can distribute not only your own application, but also the entire operating system underlying it. You may charge for your application or give it away, so long as you observe the proprieties of the licenses, including the GPL, for all the other code on the CD.

Got that thought, too? Good. Now, here's how it all ties together.

Free-Range DRM

What do you get when you combine the notions of a hypervisor running a single-application OS in a virtual machine, the desire for high-strength (yet user-friendly) DRM, and a CD that Just Runs on any commodity machine? Possibly the future of DRM on consumer-grade home entertainment systems, which may or may not be something we recognize as a personal computer but that certainly qualifies as an embedded system.

Current laptops contain the Trusted Platform Module that can verify the boot process and pass control only to trusted (pronounced "approved") programs; that same technology can easily appear in an entertainment center. If the hypervisor on a self-booting CD or DVD were approved to run, it could start up a VM containing an OS and media player that decrypt DRM-protected bits. Because the VM is TPM-protected from power-on, software from those pesky customers won't have an opportunity to intervene and, because it runs all I/O through the hypervisor, there's no user configuration.

The bootable disc need not contain the protected bits because the OS within the VM can set up an Internet connection to siphon the movie (and perhaps even a specialized player) from the appropriate server. The precious bits vanish into the VM and reappear on protected digital and audio links to the video display and speakers. No unapproved software gets an opportunity to touch any unencrypted bits.

But, should you want to do something else with your entertainment center, perhaps even boot a Linux distro, you're free to do so. You can even write, run, and debug your own programs and those of others, GPL and all. The one thing you can't do is run a program that inhales protected bits and exhales movies or music.

Even this won't protect DRM-encrypted bits for the many decades the owners seem to want, at least given the industrial-level cracking applied to current DVDs. But it just might deflect enough criticism to make DRM palatable to enough people to get it into media-playing boxes throughout the world.

I'm sure this isn't an original connect-the-dots exercise, but I haven't seen any work along these lines. If you know more, drop me a note, eh?

Last Tab

You can see a piece (and buy the rest) of a 1971 paper on the EDVAC operating system at www.priorartdatabase.com/IPCOM/000129981. Get more OS history at www.netnam.vn/unescocourse/os/12.htm.

More on the Multiple Arcade Machine Emulator project at www.mame.net, with an X port at x.mame.net/faq.html. Blackfins live at www.analog.com/processors/processors/blackfin/ index.html.

Searching for "dream drm" at www.realnetworks.com isn't productive, but Google is your friend.

Find an updated version of the classic "Xen and the Art of Virtualization" paper at page 65 (aka 73) of the 2005 Linux Symposium proceedings at www.linuxsymposium.org/2005/linuxsymposium _procv2.pdf. The official Xen nexus, with whitepapers and suchlike, is at www.xensource.com.

Other virtualization folks use different techniques to achieve much the same goal. VMWare's Virtual Appliances project at www.vmware.com/vmtn/appliances shows what can be done with a bottled VM on stock hardware.

Gamix seems to be at www.gamix.org, but the level of detail leaves something to be desired. It evidently has no relation to gamix, the Gnome-ish Audio Mixer.