Dr. Dobb's Journal September 2006

Is It Real-Time Yet?

by Ed Nisley

Ed is an EE, an inactive PE, and author in Poughkeepsie, NY. Contact him at ed.nisley@ieee.org with "Dr Dobbs" in the subject to avoid spam filters.

I recently installed the latest version of the Enhanced Machine Controller (EMC) software for my Sherline milling machine. Unlike previous versions, the EMC code now runs happily in standard Ubuntu Linux with RTAI generating hard real-time stepper motor control signals. You can even stream Internet music and gnaw metal without missing a beat or crashing a cutter.

The Mobile and Embedded Linux track of the April LinuxWorldExpo presented the current state of the real-time Linux kernel effort and, en passant, showed why kernel patches may trump a separate real-time layer for most applications. After some poking around, I think I now understand the implications of the two approaches and, as usual, the technical aspects have less to do with the outcome than you might expect.

But the tech makes a nice starting point, so I'll begin there.

The Many Faces of Real-Time

In June 2005, Paul McKenny posted a note to the Linux Kernel Mailing List summarizing seven different ways of getting real-time performance from the Linux kernel. Although he expected a flame fest, the responses indicate that he'd accomplished that most difficult feat—explaining a contentious topic without offending anyone.

The methods range from simply using the stock kernel, for applications with very relaxed timing requirements, to a single OS image distributed across multiple real or virtual CPUs, with one or more dedicated to real-time threads. It seems most techniques have strong partisan support, coupled with a nearly complete lack of comparable measurements to support decisions.

While describing the real-time kernel patches, Sven-Thorsten Dietrich showed results at LWE indicating that the stock 2.6 kernel can respond to an external interrupt within a few milliseconds, Montavista's fully patched 2.6 kernel on an 800-MHz Celeron can provide 100- to 200-microsecond latency, and a 3.2-GHz Pentium pushes that down to 65 µs.

The lowest latency can be just a few microseconds if the interrupt occurs at precisely the instant when the kernel's state allows an immediate response. The jitter between the fastest and slowest response thus nearly equals the worst-case latency, which can rubbleize an algorithm that depends on regular timing in the real world.

For example, the stock kernel can sample an external voltage every 500 ms with no trouble at all: The jitter would be 1 percent of the sample interval. Cranking the sampling interval to 50 ms, however, raises that same jitter to 10 percent of the nominal period. The digital representation of an analog signal sampled with that amount of jitter would bristle with spurious frequency components.

The next step in McKenny's taxonomy describes a nested OS, with a real-time OS running a nonreal-time Linux kernel as a separate task. That's the general idea behind RTLinux, RTAI, and Adeos, although each applies a different mindset.

A separate hard-RTOS kernel provides latency in the 10-µs range for moderately peppy x86-class CPUs, with correspondingly reduced jitter. Sampling a signal every 1 ms with 1-percent jitter becomes entirely feasible, although communication between tasks on different OSes may be difficult, depending on how they're lashed together. As always, stability and ease of communication vary inversely.

The remainder of McKenny's list includes the dual-OS approach, which runs an RTOS on one CPU and Linux on another, a parallel-OS mechanism that moves tasks between real-time and nonreal-time kernels running on the same CPU, and the bizarre (to me, anyway) single-image/multiple-CPU kernel that hasn't been implemented yet.

In short, the answer to the question of whether Linux provides real-time performance depends on what, exactly, you mean by real-time. For any latency requirements over a few tens of microseconds, you can find some combination that will work for you.

Performance Measures

The old rule of thumb that real-time performance increases with each new CPU generation has become invalid, because CPU clock speed no longer doubles every few years. Although multicore CPUs have higher overall throughput, each core has performance that's roughly comparable with current single-CPU chips. The number of instructions and I/O cycles required to handle an interrupt won't decrease and the clock frequency won't increase, so the best-possible latency will remain around 10 µs for the foreseeable future.

Interrupt latency and jitter get plenty of attention, mostly because they're easy to measure and summarize on nice graphs. However, the path length from a hardware interrupt to the first instruction in your handler represents just one part of real-time performance; other components probably have a greater effect on overall performance.

If your code performs only a trivial function, such as reading or writing a port using data stored in a FIFO buffer, then OS code determines the total path length from the interrupt to the final operation. Suppose, for example, that the total time required to handle a single interrupt has 10-µs latency plus 40 µs of hocus-pocus after your code, for a total of 50 µs. The maximum repetitive interrupt rate could hit 20 KHz as long as the system does nothing other than handle interrupts.

That might be acceptable for a trivial real-time application such as an arbitrary waveform generator that can precompute a table of output values. Most applications have a husky nonreal-time component handling the user interface and converting input samples into output values, so the real-time component must not completely hog the CPU.

That tradeoff highlights the essential difference between a hard real-time OS and all others: A complete lack of fairness. Desktop and server OS design assumes that all tasks should receive their more-or-less fair fraction of the CPU's total processing power. Priorities and scheduling tweaks may favor one task or task category over another, but everybody has an opportunity to play.

An RTOS, in contrast, ensures that real-time tasks start when they must and run to completion as they should. Unless and until a real-time task yields control, it's in charge. There are various nuances and tweaks that may reduce such absolute control, but that's the starting point.

So, although it's tempting to use the only hard numbers you'll find, things like interrupt latency and task-switch speed might actually be the least of your worries. You must, instead, know numbers that trendy software development methodologies can't predict: The path length through your real-time code and the actual CPU usage of your nonreal-time code.

Good luck with all that.

Real-Time Backstory

Last month I mentioned how virtualization has simplified data center operations by running multiple OS guests on a single physical CPU. In essence, a small, relatively simple (in OS terms, anyway) chunk of code virtualizes the underlying hardware so that each guest OS thinks it's in control.

This resembles McKenny's nested OS technique, with a real-time OS controlling the hardware normally handled by the nonreal-time OS. While the real-time OS does not completely virtualize the underlying hardware, the general notion dates back to at least the 1970s, so it's fairly well established.

Despite what seems like prior art, Victor Yodaiken was granted patent 5,995,745 in 1997 for "Adding real-time support to general purpose operating systems," which describes how "A general purpose computer operating system is run using a real-time operating system." He used the technique as part of RTLinux and formed FSMLabs to provide a commercial version.

FSMLabs grants a free license to use the patent in GPL projects and for proprietary projects under specific circumstances. Other projects, including those under the LGPL, must negotiate for and pay a licensing fee to use the patented technique or simply buy and use the commercial version of RTLinux from FSMLabs.

One condition for incorporating RTLinux fee-free in commercial projects is that you can't modify it. That means users can't feed changes back to the source, which means updates must come from FSMLabs. Perhaps not unexpectedly, the most recent download file was produced in May 2005.

John Gilmore observed that "The Net interprets censorship as damage and routes around it." It seems Free Software treats patents in a similar manner by programming around them. That's not always possible, but in this case, it seems to be playing out that way.

The RTAI project, which was originally modeled on RTLinux, rewrote much of its code to avoid the '745 patent and released much of that under the GPL, rather than the previous LGPL. The most recent version dates to January 2006, so it's being updated and used by GPL projects, some of which might also meet FSMLab's patent license provisions.

The Adeos project implements an IRQ-passing technique that's supposed to be entirely exempt from the '745 patent's claims, using a "nanokernel" to provide real-time support for OS domains. The acronym stands for "Adaptive Domain Environment for Operating Systems," but given the project's starting date and overall context, it looks like a deliberately misspelled Spanish pun. Its most recent files appeared in May 2006.

The Xenomai project began as a way to ease the transition to Linux from proprietary RTOSes by putting an abstract RTOS-service layer atop the underlying real-time code. It merged with the RTAI project to become RTAI/Fusion, then popped out as Xenomai some years later. The most recent downloads are from May 2006, so it's also progressing nicely.

Because all of those projects rely on a separate lump of code taking charge of the hardware below the Linux kernel, litigation represents the only way to verify that they're not infringing the '745 patent. As nearly as I can tell, FSMLabs has never initiated a suit against anyone, but the mere prospect seems to have encouraged development of kernel patches that cannot possibly be mistaken for separate real-time OS functions.

The Linux kernel's real-time preemption patches appear on a more-or-less weekly basis, closely tracking the current kernel code. While they do not convert the kernel into a hard real-time OS, the result may well be close enough for most projects, with the added advantage of integrating the real-time environment into the mainline code. Basically, if your biz can use Linux, the real-time code tags under the same provisions and without any further legal hocus-pocus.

The usual caveats apply: I'm not a lawyer and I have nothing to do with any of the projects. If you're in the market for a real-time OS, plan on some serious research into the tech, legal, and financial aspects of the deal.

Reality Checks

In the May column I asked if boot-time software diagnostics had ever flushed out a hardware problem. I felt that failing hardware would also wipe out the tests, leaving an inert lump.

Paul Bristow reports that he "wrote the code for a real-time system that also executed the power-on test as its idle task to keep the processor doing something useful for...98 percent of its time." The test performed a CPU check, a nondestructive RAM test, and verified the ROM checksum, over and over and over again. Months passed with no errors.

"But then one day, on one system," he says, "I got an error message—checksum wrong on the ROM. Two weeks later, [a] similar message. And over the next few weeks, messages steadily got more frequent—but the equipment still worked perfectly, controlling the hardware and talking to the operators, and doing the test.

"Examination of the board revealed the cause of the problem—it was an erasable ROM containing the running code—and there was no sticker to stop the light starting to erase it. So the stored ROM code was becoming iffy.

"So the ROM was intermittently producing a wrong byte or two, but never with high enough frequency to show up in the running code!"

If I'd never occasionally forgotten to put a sticker on my EPROMs, I'd be in a position to throw stones, but, well, BTDT.

Chris Sawran, while working on an 8051 USB Mass Storage gadget, discovered "that at seemingly 'random'...times, especially during a data file DMA transfer, the DMA would stop."

He "eventually found that under some obscure situations, the GPIO direction and/or data registers were getting cleared..." and "After many tests, and many more explanations, I finally managed to capture the bug with a small test case.

"When the CPU was pushing arguments onto the stack, if the argument being pushed to a data-space address...happened to have the same SFR address [as] a specific set of MCU registers, then a glitch in the hardware clock trees assigned the value on the data lines to both the RAM address and the SFR register simultaneously."

After the hardware guys took a look at the design, they discovered, much to their horror, that "the defect was shown to also exist in the previous generation chip, which was selling and now 'out in the field'."

Pop Quiz: What would you do in a situation like that? Analyze the code to determine whether the error could occur in real life? Recall the product? Issue a service notice? Bury your findings and hope nobody notices?

Extra credit: Estimate the expected liability cost of each course of action. Assume all documents you produce as part of the estimation process will be used as evidence against you.

Chris observes "it's tough to really trust any of your code again after that..."

Truer words were never spoken!

Last Tab

The EMC project is at www.linuxcnc.org, Sherline is at www.sherline.com, and my original description is in the May 2004 DDJ. Stream and buy music from www.magnatune.com to avoid DRM-protected bits.

Paul McKenny's summary of Linux kernel RT methods is at lkml.org/lkml/2005/6/7/256. Another overview of Linux real-time approaches is at www.isd.mel.nist.gov/projects/rtlinux/intro-rtl.pdf. A version of Sven-Thorsten Dietrich's Linux Real-Time Technology presentation is at www.enseirb.fr/~kadionik/embedded/SRO853_DIETRICK.pdf. Montavista presents some benchmark results at www.mvista.com/products/realtime_benchmarks.html.

More background on virtualization at en.wikipedia.org/wiki/Virtualization. Find Yodaiken's patent 5,995,745 at patft1.uspto .gov/netahtml/PTO/srchnum.htm. More on the patent's licensing at www.linuxdevices .com/articles/AT2094189920.html. Quotes about the Internet are at cyber.law.harvard .edu/people/reagle/inet-quotations-19990709 .html.

Ubuntu is at www.ubuntu.com and if you favor KDE over GNOME, try www.kubuntu.org. The RTAI project is at www.aero.polimi.it/ ~rtai/index.html, Adeos is at http://home.gna .org/adeos, and useful examples of both are at www.captain.at. FreeRTOS is at www.freertos.org. RTLinux is at www.fsmlabs.com and www.rtlinuxfree.com. Ingo Molnar maintains the latest real-time kernel patches at people .redhat.com/~mingo/realtime-preempt.

Thanks to the SUNWAH LINUX folks (www.sw-linux.com) at LWE for giving me their very last overstuffed penguin mascot. They provide various Chinese-language Linux distros, embedded programs, and Asian-character-set translation routines, none of which I'm competent to evaluate.

DDJ