Generating Linux Device Drivers with CodeSketch

John F. Hubbard

Automate device driver creation with this handy Linux tool.

Introduction

The goal of this article is to enable you to write and test Linux device drivers, quickly and efficiently, using a code generator and an ioctl-centered test program that I supply (available for download at <www.cuj.com/code>). This is of immediate practical use, not only for the veteran device-driver writer who wants to avoid copy-and-paste reuse of working device drivers, but also for the high-level C++ designer who simply wants to understand more about the foundations upon which he builds his object-oriented systems. A related, minor goal of the article is to demonstrate other uses of the code generator, as it can also generate other types of files.

Recently, I have been writing device drivers for Linux on Intel/x86 and LynxOS on PowerPC. Despite some years of paying my dues in both C and C++, I am still something of a beginner in the field of device drivers and kernel development. I have spent too much time working at the C++ and UML level, and too little time at the assembler and CPU-register level, to be considered one of the true experts. The positive side of this is that the "obvious" facts of life in this field still fill me with wonder, and I have a beginner's enthusiasm for trying to fix some of the problems that pop up [1]. Here are two observations from the new guy:

1. Device drivers are perhaps excessively tricky to write, install, build, and debug. (Of course, what is tricky to me is not necessarily so for a specialist.) I realize that this is most definitely not news to anyone, but it does seem odd that there isn't more interest in improving the situation. Like learning how to pronounce the street names in Honolulu (all of which sound about the same, until you've been there a few years), I suspect that navigating the steps of device-driver creation is considered something of a rite of passage. Many of the difficulties in device-driver development arise from "accidental complexities," rather than from subtleties of synthesizing the driver logic itself. These accidental complexities include poorly documented hardware, differences between kernel-mode and user-mode programming, the fact that you are changing the very operating system itself every time you add kernel code, a harsher debugging environment (less capable debugging tools combined with heavier consequences for inserting bugs), and the usual multithreading pitfalls, amplified by the weak debugging capabilities.

2. While device drivers are best written in the same language as that of the kernel -- this is invariably the C language -- the tools for testing a device driver can and should be written in a more powerful language. C++ is the natural choice here, because, as you will see, you can share header files between the device driver and the testing tools so that there is only the slightest friction between the two languages. In this case, the only cost of shifting between C and C++ is the use of the __cplusplus preprocessor macro. Implicit in this line of thinking is the assumption that built-in testing is absolutely required of any truly trustworthy device driver. Citing built-in testing as a primary design goal implies that you do not balk at inserting extra code to support running internal validation and inspection tests; this is code that would otherwise be unnecessary.
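As a minimal sketch of the header sharing described in observation #2, the following shows one common way to arrange it. The file name, the command codes, and the status structure are all hypothetical; they are not taken from CodeSketch's generated output.

```c
/* pressure_sensor_shared.h (hypothetical name): one header included by
 * both the C device driver and the C++ test tool. */
#ifndef PRESSURE_SENSOR_SHARED_H
#define PRESSURE_SENSOR_SHARED_H

#ifdef __cplusplus
extern "C" {   /* C++ test code sees these declarations with C linkage */
#endif

/* Hypothetical command codes understood by the driver's ioctl entry point. */
enum ps_command {
    PS_CMD_GET_STATUS    = 1,
    PS_CMD_RUN_SELF_TEST = 2
};

/* Hypothetical status record passed between the driver and the test tool. */
struct ps_status {
    unsigned int buffer_size;   /* bytes in the simulated device buffer */
    unsigned int open_count;    /* number of opens seen so far */
    int          last_error;    /* 0, or a negative errno-style value */
};

/* Plain C helper that either side can call. */
static inline int ps_status_ok(const struct ps_status *s)
{
    return s != 0 && s->last_error == 0;
}

#ifdef __cplusplus
}   /* end of extern "C" */
#endif

#endif /* PRESSURE_SENSOR_SHARED_H */
```

The C compiler never sees the extern "C" lines, while the C++ compiler treats everything between them as C declarations, so both sides agree on the layout of ps_status and on the numeric values of the commands.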

I realized that I could avoid some of the problems listed in observation #1, while gaining the testing benefits described in observation #2, were I in possession of a code generator that could create the right sort of fully working, but mostly empty, device drivers. Accordingly, I created CodeSketch, which does precisely that. It comes with a second program, called IoControl. CodeSketch generates the code, while IoControl allows you to exercise the built-in testing functionality that you cannot get to via simpler tools.

Device Drivers for Linux

Device drivers in Unix systems are object files that are linked into the kernel. The design is always the same: there are certain routines, called entry points, that the object file provides and the kernel calls. Building and testing a device driver, then, consists of assembling your entry points in a file, writing routines to connect these entry points to device-specific implementation code, and then concocting some method of testing it.
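To make the idea of entry points concrete, here is a user-space sketch of a table of function pointers that a driver provides and a caller invokes. In Linux the real table is struct file_operations; the simplified entry_points structure, the demo_ names, and the simulate_echo() helper below are illustrative assumptions, not kernel or CodeSketch code.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a kernel entry-point table. */
struct entry_points {
    int  (*open)(void *dev);
    long (*read)(void *dev, char *buf, size_t len);
    long (*write)(void *dev, const char *buf, size_t len);
    int  (*release)(void *dev);
};

/* Hypothetical device state: counts how often each entry point ran. */
struct demo_dev {
    int opens, reads, writes, releases;
};

static int demo_open(void *dev)
{
    ((struct demo_dev *)dev)->opens++;
    return 0;
}

static long demo_read(void *dev, char *buf, size_t len)
{
    ((struct demo_dev *)dev)->reads++;
    memset(buf, 0, len);            /* pretend the device returned zeros */
    return (long)len;
}

static long demo_write(void *dev, const char *buf, size_t len)
{
    (void)buf;                      /* data is discarded in this sketch */
    ((struct demo_dev *)dev)->writes++;
    return (long)len;
}

static int demo_release(void *dev)
{
    ((struct demo_dev *)dev)->releases++;
    return 0;
}

/* The driver provides this table; the caller only knows the table. */
static const struct entry_points demo_entry_points = {
    .open    = demo_open,
    .read    = demo_read,
    .write   = demo_write,
    .release = demo_release
};

/* Mimics a shell redirection into a device: open, write, release. */
static long simulate_echo(const struct entry_points *ep, void *dev,
                          const char *text)
{
    long written;

    if (ep->open(dev) != 0)
        return -1;
    written = ep->write(dev, text, strlen(text));
    ep->release(dev);
    return written;
}
```

Calling simulate_echo(&demo_entry_points, &dev, "Hello there\n") on a zeroed demo_dev leaves one open, one write, and one release recorded, and no reads. The device-specific work belongs inside the demo_ routines.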

Figure 1 shows some of the entry points found on a typical Linux character device driver; these entry points coincide with the entry points supplied by CodeSketch. There are other entry points, but the device-driver writer may or may not choose to implement them, depending upon the driver's intended use. CodeSketch generates the necessary code to implement each entry point. The programmer must then associate the entry point with device-specific routines that communicate with whatever device she is actually supporting.

The entry points shown in Figure 1 include:

For example, this command yields an open, a read, and a release operation:

$ cat /dev/SomeDevice
whereas this command yields an open, a write, and a release operation:

$ echo "Hello there" > /dev/SomeDevice
After running either command, type:

$ cat /var/log/messages
in order to see the messages generated by the various printk() statements in the device driver.

These entry points correspond to those required of a character device driver, which is the only type that CodeSketch yet supports. Block device drivers are used to support file systems, and network device drivers support NICs (network interface cards). CodeSketch will generate these types of drivers in future versions.

Figure 1 also shows how to perform rudimentary testing of any CodeSketch-generated device driver, using a combination of echo, cat, shell file redirection, and my IoControl program. Advanced testing is best done via the general-purpose ioctl entry point, which is accessible from the IoControl program's command switch statement.
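The following user-space sketch illustrates the kind of command switch that an ioctl entry point typically contains when it is used for built-in testing. The command codes, the sim_dev structure, and the self-test itself are assumptions made for illustration; a real Linux driver would normally build its command numbers with the _IO, _IOR, and _IOW macros and would not touch user pointers directly.

```c
#include <errno.h>
#include <string.h>

#define SIM_BUF_SIZE 64   /* hypothetical size of the simulated buffer */

/* Hypothetical command codes for the sketch. */
enum sim_cmd {
    SIM_CMD_GET_SIZE  = 1,   /* arg: unsigned int *, receives buffer size */
    SIM_CMD_CLEAR     = 2,   /* arg: unused, zero-fills the buffer */
    SIM_CMD_SELF_TEST = 3    /* arg: unused, runs an internal pattern test */
};

struct sim_dev {
    unsigned char buf[SIM_BUF_SIZE];
};

/* Built-in test: write a pattern, verify it, then restore the contents. */
static int sim_self_test(struct sim_dev *dev)
{
    unsigned char saved[SIM_BUF_SIZE];
    int i, ok = 1;

    memcpy(saved, dev->buf, SIM_BUF_SIZE);
    for (i = 0; i < SIM_BUF_SIZE; i++)
        dev->buf[i] = (unsigned char)(i ^ 0xA5);
    for (i = 0; i < SIM_BUF_SIZE; i++)
        if (dev->buf[i] != (unsigned char)(i ^ 0xA5))
            ok = 0;
    memcpy(dev->buf, saved, SIM_BUF_SIZE);
    return ok ? 0 : -EIO;
}

/* Stand-in for the ioctl entry point: one switch over the command code. */
static int sim_ioctl(struct sim_dev *dev, unsigned int cmd, void *arg)
{
    switch (cmd) {
    case SIM_CMD_GET_SIZE:
        if (arg == NULL)
            return -EINVAL;
        *(unsigned int *)arg = SIM_BUF_SIZE;
        return 0;
    case SIM_CMD_CLEAR:
        memset(dev->buf, 0, SIM_BUF_SIZE);
        return 0;
    case SIM_CMD_SELF_TEST:
        return sim_self_test(dev);
    default:
        return -ENOTTY;   /* conventional reply to an unknown ioctl */
    }
}
```

Each new internal test becomes one more case in the switch, which is why a single general-purpose entry point is enough to reach functionality that cat and echo cannot exercise.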

CodeSketch: A Programmer's Tool

CodeSketch is a tool that generates -- among other things [2] -- fully working device drivers for Unix systems. By this, I mean that you can install and use the resulting device driver. CodeSketch v1.5 is not yet clever enough to write code to manage your custom hardware, but most people prefer to see something working as early as possible in the development cycle, especially when working with code generators. With that in mind, CodeSketch generates sample code that uses a small chunk of kernel-allocated memory to simulate a simple RAM-based device. You can read, write, and seek within this small block of memory. If you were lucky enough to implement a device driver for hardware that did nothing more than provide some RAM-based storage, you could replace the CodeSketch-generated kernel memory allocation statement with your own routine to retrieve memory from your custom hardware. At that point you would be very nearly done with the job. The downside of this approach is that, for other types of device drivers, you do have to manually delete the sample implementation code; this seems a small price to pay in order to achieve a "works out of the box" system.

So far, CodeSketch generates drivers for Linux on Intel/x86 and LynxOS 3.1 on PowerPC [3]. The CodeSketch program itself, however, is written in highly portable C++, and it runs on essentially any platform that can compile C++. For a list of operating systems and compilers that I've tested so far, please see [4]. As for the generated device drivers, these have been tested against various releases of Linux kernel 2.4.

Figure 2 shows a session with CodeSketch. First, I generate a device driver using the full-blown CodeSketch command set:

$ CodeSketch -copyright "Copyright (C) 2002, BigCorporation" \
    -open_source -author "Alfred B. Constantine" -initials ABC \
    -class PressureSensor -char_driver -os linux
Next, Figure 2 shows a more sustainable approach of declaring a bash alias. This approach allows subsequent commands to be much shorter:

$ alias create_code='CodeSketch -copyright "Copyright (C) 2002, BigCorporation" -open_source -author "Alfred B. Constantine" -initials ABC -class '

$ create_code PressureSensor -char -os linux
You will find this sort of alias indispensable when using CodeSketch, as the latter is intended for use within project teams and therefore tends to be "flexible and powerful" rather than "trivial to use." You can instead change CodeSketch, but after considerable experience with project teams and ClassCreator [5], I would recommend that you leave this behavior essentially intact; it scales better.

The last part of Figure 2 shows the generation of three C++ classes; this demonstrates another use of CodeSketch, other than strictly for generating device drivers. The command:

$ create_code "DeflectionTest LatencyTest AccuracyTest"
uses the create_code alias and is therefore identical to typing this command:

$ CodeSketch -copyright "Copyright (C) 2002, BigCorporation" -open_source \
    -author "Alfred B. Constantine" -initials ABC \
    -class "DeflectionTest LatencyTest AccuracyTest"
This will generate a set of files (DeflectionTest.cpp, DeflectionTest.h, DeflectionTest.icc, makefile, and DeflectionTest_UnitTest.cpp) to implement the classes DeflectionTest, LatencyTest, and AccuracyTest. As you can deduce, the first class in the list is used as the basis for the names of the generated files. You can override that behavior by using CodeSketch's -base_filename option.

Here is what each command-line option above means:

Examining the Generated Code

After running the commands in Figure 2, you will find yourself in possession of five new files, totaling 712 lines of code, comments, and build scripts. Of these, about 550 are actual code and script; the rest are comments and white space. Here is what the generated files provide:

Building the Device Driver

Figure 3 shows how to build and install the device driver.

You will note that CommonBuildRoutines.bash is a set of bash routines that perform compilation and relocatable linking on device-driver files. I could have used the make utility to implement this, but chose not to, based on the following reasoning: the make utility provides dependency checking and incremental builds, at the expense of providing an incomplete programming language. This is exactly the opposite of what is required here.

Device-driver developers never want incremental builds, because the cost-to-benefit ratio doesn't support this approach at all: C compilers are exceedingly fast, even when processing large source files -- and device drivers are not normally very large. As you can see from the commands in Figure 3, my experimental Linux machine, at 500 MHz, is far from state of the art, yet it builds the entire device driver in only 1.5 seconds!

Furthermore, with device-driver code, no one wants even a slight chance of anything less than an absolutely perfect build; remember, problems that are merely annoying in normal development are considerably more problematic when dealing with kernel code. After creating and using any number of build systems over the years, I finally concluded that in my world, every device-driver build must be a clean build. Accordingly, the makefile in this generated code merely calls the bash build script with the correct parameters; the bash script obliterates the output directory, recreates it, and then proceeds with a seriously clean build.

Exploring the Device Driver

Figure 4 shows a brief session with cat, echo, IoControl, and our new device driver. As you can infer from PressureSensor's responses, the driver simply allocates a kernel buffer, retains it across all system calls, and allows you to read, write, and seek within the buffer.

Listing 1 shows the implementation of the read, write, and llseek entry points. As previously described, these all depend upon the presence of the small kernel buffer, in order to simulate a fully working device driver.
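For readers without the listing at hand, the following is a user-space sketch of the same idea: a small fixed buffer with read, write, and seek operations that share a file position. It is not the CodeSketch-generated code. The buffer size, the names, and the error choices are assumptions, and a real driver would move data with copy_to_user() and copy_from_user() rather than memcpy().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define RAM_DEV_SIZE 32   /* hypothetical size; the article does not state one */

struct ram_dev {
    char   data[RAM_DEV_SIZE];
    size_t pos;               /* current file position, 0..RAM_DEV_SIZE */
};

/* Copy up to len bytes from the current position; returns 0 at the end. */
static long ram_read(struct ram_dev *dev, char *out, size_t len)
{
    size_t avail = RAM_DEV_SIZE - dev->pos;

    if (len > avail)
        len = avail;
    memcpy(out, dev->data + dev->pos, len);
    dev->pos += len;
    return (long)len;
}

/* Store up to len bytes at the current position; -ENOSPC when full. */
static long ram_write(struct ram_dev *dev, const char *in, size_t len)
{
    size_t avail = RAM_DEV_SIZE - dev->pos;

    if (len > 0 && avail == 0)
        return -ENOSPC;
    if (len > avail)
        len = avail;
    memcpy(dev->data + dev->pos, in, len);
    dev->pos += len;
    return (long)len;
}

/* whence: 0 = from start, 1 = from current position, 2 = from end. */
static long ram_llseek(struct ram_dev *dev, long offset, int whence)
{
    long base, target;

    switch (whence) {
    case 0:  base = 0;                break;
    case 1:  base = (long)dev->pos;   break;
    case 2:  base = RAM_DEV_SIZE;     break;
    default: return -EINVAL;
    }
    target = base + offset;
    if (target < 0 || target > RAM_DEV_SIZE)
        return -EINVAL;
    dev->pos = (size_t)target;
    return target;
}
```

Every operation clamps against the buffer bounds before touching memory. That check matters far more in kernel code than it does in this simulation, because an overrun there corrupts the operating system itself.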

Summary

After realizing that CodeSketch will be evolving for years, I have decided to set up a website that allows people to get the latest ideas and perhaps (time allowing) even post bug fixes and new features. Assuming that all goes well, you may find this at <www.hubbard-software.com>. The purpose of this site is the dissemination of freely available, open-source software tools, primarily those of my own creation.

Notes and References

[1] I am indebted to Alessandro Rubini and Jonathan Corbet, the esteemed authors of Linux Device Drivers (O'Reilly & Associates, 2001), for writing a book that helped me become much less of a beginner in this field, in a reasonably short period of time. If you are working with Linux device drivers, in any role other than that of kernel.org developer, then you probably already have this book.

[2] CodeSketch can also generate C++ classes and C routines. Each generation run includes makefiles and unit tests (unless you specify otherwise) so that you always can type make (to build everything) or make local_test (to build and run) immediately after generating any type of code.

[3] LynxOS is a Linux-like, POSIX-compliant RTOS, created by LynuxWorks (<www.lynuxworks.com>). Recent versions of LynxOS even offer ABI (Application Binary Interface) compatibility with Linux for reasons perhaps best understood only by God, LynuxWorks, and whoever they hired to do a market analysis. That aside, LynxOS is one of the more accessible and user-friendly RTOSes that you'll encounter.

[4] CodeSketch has been built and tested on the following compiler/operating system/hardware triplets: gcc 3.2 on Linux (Red Hat 8.0) kernel 2.4, Intel/x86 hardware; gcc 2.95.3 on Linux (Red Hat 7.3) kernel 2.4, Intel/x86 hardware; gcc 2.95.3 on Solaris 8, SPARC hardware; gcc 2.95.3 on Solaris 7, SPARC hardware; and Microsoft Visual C++ 6.0, Service Pack 5, on Windows 2000, Intel/x86 hardware. Astute observers will notice that LynxOS 3.x is not listed here; sadly, its version of gcc is too old to compile modern C++ unassisted. However, the current version of LynxOS is now 4.x, and that ships with gcc 2.95.3, which will work just fine.

[5] John F. Hubbard. "Building a Professional Software Toolkit," C/C++ Users Journal, May 2001. This introduced the CommandLine class and the ClassCreator utility for generating C++ classes. CodeSketch reuses CommandLine; everything else is a ground-up new design. CommandLine has endured three years of heavy use with essentially no changes; borrowing from Brian Foote's lively article "The Selfish Class" (<www.laputan.org/selfish/selfish.html>), I attribute this to a high surface-to-volume ratio, combined with moderately intricate implementation code that no one really wants to touch.

About the Author

John F. Hubbard spent eight years as a nuclear submarine line officer, mostly out of Pearl Harbor, Hawaii, before venturing into the foreign waters of civilian software development. He currently works as a senior software architect, at a company that specializes in real-time programming, embedded systems, and factory automation. His own specialties include compiler design, networking and interprocess communications, operating systems internals, software tools, and Unix system administration. Mr. Hubbard holds a BS in Electrical Engineering from Utah State University. He may be reached at hubbardjohn@earthlink.net.