February 1995/Embedding on a Budget

Real-Time/Embedded Systems

Embedding on a Budget

Jeff D. Pipkins

Jeff D. Pipkins is a senior systems engineer at Compaq, where he writes firmware for intelligent option boards. He has been programming in C since 1985, and various assembly languages since 1979. His current interests include systems software tools, embedded OS's, and building small projects with embedded microcontrollers. He can be reached via Internet mail at pipkins@bangate.compaq.com.

Introduction
Let's say you want to burn some bits. You want to etch your code into the silicon, but you want to avoid massive collateral damage to your wallet. Fortunately, it is possible to embed C code into ROM without purchasing expensive, specialized tools. This small embedded project will get you started.
To keep the cost down on the software tools, I used the Microsoft C compiler and assembler, simply because that's what I happened to have on disk at the time. You could use other compilers instead with a little modification to the code presented. You'll also need a device programmer to burn those bits into your PROM, EPROM, Flash ROM, or whatever you choose. I've seen them for as little as $130.
I could have chosen an exotic target platform, but to conserve cash, I chose some inexpensive hardware that I had lying around the house. It's an old IBM XT, sporting a classic 8088 processor at 4.77 MHz. There actually are some good reasons for choosing such a machine, beyond simple economics. Several processors aimed at the embedded market are basically composed of an 8088 core augmented with on-chip devices. The 80188 has become well-established as an embedded processor, and it's available in a seemingly endless variety of incarnations: the 80C188XL, 80C188EA, 80C188EB, and 80C188EC, as well as low-voltage and 16-bit versions of these, some so new that this year's databook on them is marked "preliminary!" So if you want to get in on today's embedded world, just grab yesterday's computer dinosaur at a ridiculously low price and go for it!

A Simple Project
The purpose of this article is to show how to embed C code without expensive tools. I present a small program to be embedded (Listing 1), along with the other pieces needed to make it work. The program is a slight variation on the time-honored "Hello World," modified to work in an embedded environment.
The built ROM image works without BIOS, MS-DOS, or anything else. It's completely self-sufficient, and takes control immediately when the machine is powered on.
The make file (Listing 2) handles a large part of the embedding task in this project; it invokes the linker, exe2bin, and even debug to build a ROM image capable of sending a greeting out a serial port. I explain this process in detail later.
Another piece of code is required to get an image running once it's embedded in ROM. This custom assembly-language startup code (Listing 3) sets up the run-time environment needed for hello.c to run. Rather than give a line-by-line account, I provide here a general explanation of what this code must do. Finally, I've written a really stripped-down version of stdio.c, which supports output to a serial port. My stdio.c is not listed here, but is available on this month's code disk.

Building the Program
For many programmers, the build process is just a tedious detail. However, if you want to build a ROM image, you need to be just as skilled at building code as you are at writing it. After all, the compiler is just one tool in the chain. To build the ROM image, I also use the linker, exe2bin, and debug. Chances are you've used all of these before in a cookbook fashion, but now you'll need to take a closer look at what these tools actually do behind the scenes so you can use them for the atypical task of creating a bootable ROM image.

Poor Man's Locator
When you're writing MS-DOS programs, you usually don't have to worry about where your code will be loaded, or even whether it's relocatable. Relocatable code can be executed at any address, because it has no references to absolute addresses. COM files are good examples of relocatable code. You can just copy COM files anywhere, set up the segment registers, and jump in. The code in EXE files is not always relocatable; it may contain absolute references to code and data. So how does MS-DOS load EXE files wherever it wants? The EXE file contains a relocation table (sometimes called a "fix-up" table). This table shows all of the absolute references in the image. After the loader copies the image into RAM at a particular location, it adds the segment address of that location to every absolute reference in the image. This process has been called "locating," "relocating," "performing fix-ups," or "address binding." (See Figure 1.)
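The fix-up step described above can be sketched in a few lines of C. This is only an illustration, not one of the article's listings: the names reloc_entry and apply_fixups are hypothetical, and the relocation entry is simplified to a plain byte offset into a flat image (a real EXE header stores each entry as a segment:offset pair).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified relocation entry: the byte offset within the
   image of a 16-bit segment value that needs fixing up. A real EXE
   header stores each entry as a segment:offset pair instead. */
typedef struct {
    size_t offset;
} reloc_entry;

/* Read and write little-endian 16-bit words, as the 8088 stores them. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void write_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Add base_segment to every segment word named in the relocation table.
   This mirrors what the loader does after copying the image into RAM,
   and what later versions of exe2bin do when given a base address. */
void apply_fixups(uint8_t *image, size_t image_size,
                  const reloc_entry *table, size_t count,
                  uint16_t base_segment)
{
    size_t i;

    for (i = 0; i < count; i++) {
        size_t off = table[i].offset;

        assert(off + 2 <= image_size);  /* entry must lie inside the image */
        write_le16(image + off,
                   (uint16_t)(read_le16(image + off) + base_segment));
    }
}
```

Bytes that the table does not mention are left alone, which is why an image with an empty relocation table can be copied anywhere unchanged.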
If an EXE file's relocation table is empty, then there are no absolute references, and the code is relocatable. Such a file can be converted into a COM file (which is just a straight, relocatable image) using the exe2bin program. (exe2bin is readily available; various versions have shipped with MS-DOS, the MS-DOS technical reference manual, MSC, and MASM.) exe2bin's original purpose was just to convert relocatable EXEs into COM files, and if the EXE file had any entries in its relocation table, exe2bin would just complain and give up. Later versions of exe2bin are more useful. If it finds entries in the relocation table, it prompts you for a base address, and then uses it to perform the fix-ups!
So exe2bin now does basically the same thing that the MS-DOS loader does, except that it writes the resulting bound image to a file instead of executing it. exe2bin will serve as a "poor man's locator." A more fanciful locator would allow us to specify a different base address for each different segment, so that the image would not have to be contiguous. Since segment information is not contained in the EXE file, such a product generally has to do the linking as well just to keep that information. That's why they're sometimes called "Linker/Locators." An alternate approach might be to parse this information from a MAP file that's generated by the linker.

Linking
According to a tradition older than the 8088, the C compiler divides a program into segments. The compiler usually creates a code (or "text") segment, a data segment for initialized data, a "bss" segment for uninitialized data, and a stack segment. MSC adds several other segments, such as a "null" segment, to which null pointers point, a "const" segment for constants, and various segments for relatively new inventions, such as the "far heap."
The linker combines like segments from many object modules. (See Figure 2.) Even though each object module may have its own code segment, the linker combines them all into a single contiguous segment. In addition, the linker allows object modules to refer to code or data in other modules by linking external references together, which of course is where the linker gets its name.
The linker also determines the order of the segments in the final image. In this project, the startup code must be the first object module for input to the linker (as shown in the make file, line 52). The linker maintains the same segment order that first appeared in the first file. I make use of this convention to control where the segments are loaded in relation to each other. Controlling this order is very important, since this project requires the segments to be in a different order than usual.

Segment Order for Embedded Code
Moving your code into ROM places additional constraints on where things need to be, and it's the job of the startup code (Listing 3) to see that everything is in its place.
Before the C code executes, its data segments must be copied into RAM. The executable code must remain in ROM so that it won't consume RAM space. For this reason, all data must initially occupy a block located at the beginning (lowest) location in the ROM image, so that all data offsets will be correct when DS is set to the beginning of the RAM data area. When exe2bin asks for a fix-up base, the make file supplies the segment address of the RAM data area where the data will be after the startup code copies it. This will make far data pointers correct. Since the first 1,024 bytes of RAM (256 vectors of four bytes each) are reserved for the interrupt vector table, I chose to put the RAM data area at 0x40:0, which is just above that.
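The choice of 0x40:0 follows from 8088 address arithmetic: a physical address is the segment value times 16 plus the offset, and the interrupt vector table holds 256 four-byte vectors. A minimal sketch of that arithmetic follows; the function names are hypothetical and are not taken from the article's listings.

```c
#include <assert.h>
#include <stdint.h>

/* 8088 real-mode addressing: physical = segment * 16 + offset.
   The result wraps at 1 MB because the 8088 has 20 address lines. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFUL;
}

/* Lowest segment value whose offset 0 lies at or above the given
   physical address (segments start on 16-byte paragraph boundaries). */
uint16_t first_segment_at_or_above(uint32_t physical)
{
    assert(physical <= 0xFFFF0UL);  /* result must fit in 16 bits */
    return (uint16_t)((physical + 15) >> 4);
}
```

With these definitions, first_segment_at_or_above(256UL * 4) yields 0x40, the first segment clear of the vector table, and physical_address(0x40, 0) yields 0x400.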
Since the code stays in ROM and the data is copied to RAM, the image becomes discontiguous. This creates a new problem in that now, instead of just one fix-up base, we really need two — one for the code and one for the data. Unfortunately, exe2bin allows only one fix-up base to be specified because it assumes that the image will be contiguous when it executes. If we specify the RAM data area segment address as the fix-up base, then far code pointers will be incorrect, and if we specify the ROM as the fix-up base, then far data pointers will be incorrect. I deal with this dilemma by using the compact memory model, which allows up to 1 MB of data but only 64 KB of code. Thus, all generated code is relative to a single CS register value, which will work as long as the code contains no far calls or jumps (no far functions, and no far function pointers).

More on Segment Order
As already mentioned, a program's data segments must be copied to RAM before the program executes. The order in which these segments end up in RAM is also important. Most of the segments will belong to what is known as a group, which is a bunch of segments that will all be referenced using the same segment register at run time. The compiler assumes there will be a group called DGROUP, which is referenced by the DS register. The NULL segment is first so that DS:0 will point to it, and the BSS segment and stack are at the end, so that the near heap (if there is one) and stack can contend for space (this is called a "parasitic" heap, since it feeds on stack space). However, there is no near heap in this example. A heap exists only as supported by the library, and the program can't use the malloc that comes with the compiler because it depends on MS-DOS calls.
If you need a heap, you can write your own malloc and free. Should you decide to do this, I'd recommend inserting another segment between BSS and stack to make it easier to reference the end of the BSS segment. For this example, the stack is part of DGROUP and SS == DS. If you prefer, you can throw a compiler switch that removes the SS == DS assumption, and then remove the stack from DGROUP. Removing the stack from DGROUP would give you more room for near variables, but you'd have to make sure you set up SS properly. Finally, the far data belongs after the stack, possibly out of reach of the DS register.
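For readers who do want a heap, here is one possible shape for a home-grown allocator: a minimal first-fit sketch, assuming a fixed arena. Everything in it is hypothetical (the names tiny_malloc and tiny_free, the arena size, and the use of a static array as a stand-in for a dedicated segment between BSS and stack); it is not the library's malloc and not code from this project.

```c
#include <assert.h>
#include <stddef.h>

/* Each block starts with a header; its payload is a whole number of
   header-sized units, so every header stays properly aligned. */
typedef struct block {
    size_t units;   /* payload size, in units of sizeof(block) */
    int used;       /* nonzero while the block is allocated */
} block;

#define ARENA_UNITS 64  /* hypothetical arena size, in header units */

/* Stand-in for a heap segment placed between BSS and the stack. */
static block arena[ARENA_UNITS];
static int arena_ready = 0;

void *tiny_malloc(size_t nbytes)
{
    size_t need, i, next;

    if (!arena_ready) {             /* start with one big free block */
        arena[0].units = ARENA_UNITS - 1;
        arena[0].used = 0;
        arena_ready = 1;
    }
    if (nbytes == 0 || nbytes > (ARENA_UNITS - 1) * sizeof(block))
        return NULL;
    need = (nbytes + sizeof(block) - 1) / sizeof(block);

    /* First fit: walk the headers from the start of the arena. */
    for (i = 0; i < ARENA_UNITS; i += 1 + arena[i].units) {
        block *b = &arena[i];

        if (b->used)
            continue;
        /* Merge any free blocks that directly follow this one. */
        for (next = i + 1 + b->units;
             next < ARENA_UNITS && !arena[next].used;
             next = i + 1 + b->units)
            b->units += 1 + arena[next].units;
        if (b->units < need)
            continue;
        if (b->units > need + 1) {  /* split off the unused tail */
            block *rest = &arena[i + 1 + need];

            rest->units = b->units - need - 1;
            rest->used = 0;
            b->units = need;
        }
        b->used = 1;
        return b + 1;               /* payload follows the header */
    }
    return NULL;                    /* no block is large enough */
}

void tiny_free(void *p)
{
    if (p != NULL)
        ((block *)p - 1)->used = 0; /* merging happens in tiny_malloc */
}
```

Deferring the merge of free blocks to allocation time keeps tiny_free trivial, at the cost of a slightly longer walk in tiny_malloc. On a real target the arena bounds would come from the segment layout rather than a static array.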

Execution — Eventually
To me, part of the excitement of using an embedded system is being on your own. No power-on self test, BIOS, or OS will get control before your code. There's no one else to lean on. The processor comes to life, and you're in charge. This is all great fun, but it imposes still more constraints on how the program's constructed and loaded.
When the processor powers up or resets, it begins execution at F000:FFF0, which is the last 16-byte paragraph of address space (see sidebar for differences in 286 or later processors). The ROM image must have code strategically located in this spot. Since the program has only 16 bytes to work with, it will have to jump elsewhere, and since elsewhere may be more than 64 KB away, the program must do a far jump. Recall, however, that far jumps won't work because they won't get fixed up correctly. So this time I just cheat by using debug to patch in a jump instruction to an absolute location. The jump requires no fix-up because the address is absolute, and exe2bin doesn't complain because I use debug after exe2bin (see Listing 2, line 74).
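To make the patch concrete, the sketch below writes the five bytes of a direct far jump (opcode EA, then the offset and the segment, each as a little-endian word) into the last paragraph of an image buffer. It is an illustration only: the make file does this job with a debug script, the function name patch_reset_jump is hypothetical, and the code assumes the ROM is mapped so that its final byte sits at the top of the 1 MB address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    RESET_PARAGRAPH = 16,   /* reset entry is 16 bytes below the top */
    FAR_JMP_OPCODE = 0xEA   /* 8086 direct intersegment JMP */
};

/* Write "JMP FAR target_segment:target_offset" at the start of the
   image's last 16-byte paragraph. Returns the image offset that was
   patched. Assumes the image ends at the top of the address space. */
size_t patch_reset_jump(uint8_t *image, size_t image_size,
                        uint16_t target_segment, uint16_t target_offset)
{
    size_t at;

    assert(image_size >= RESET_PARAGRAPH);
    at = image_size - RESET_PARAGRAPH;

    image[at + 0] = FAR_JMP_OPCODE;
    image[at + 1] = (uint8_t)(target_offset & 0xFF);  /* offset, low  */
    image[at + 2] = (uint8_t)(target_offset >> 8);    /* offset, high */
    image[at + 3] = (uint8_t)(target_segment & 0xFF); /* segment, low  */
    image[at + 4] = (uint8_t)(target_segment >> 8);   /* segment, high */
    return at;
}
```

For a hypothetical 32 KB ROM occupying F800:0000 through F800:7FFF, calling patch_reset_jump(image, 0x8000, 0xF800, 0x0000) would place the jump at image offset 0x7FF0 and send the processor to the first byte of the ROM.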
Now the problem is, how can I supply an absolute code location for the far jump, since its location depends on where the data ends? I solve this problem by putting a small amount of code at the very beginning of the ROM image. Since the beginning of the ROM image is a constant, it's easy to insert a jump to that location. The data is also located at the beginning of the ROM image, and the NULL segment must be first. By default, when the MSC startup code and linker create the NULL segment they initialize the first 16 bytes with zeros. You could suppress these zeros (via a linker switch), but that's probably not a good idea. Having zeros at the beginning of the NULL segment makes a program a bit safer — dereferencing a null pointer will access the NULL segment instead of the data segment, as long as the access is no more than 16 bytes.
I've just given compelling reasons why both the NULL segment and the startup code should occupy the beginning of the image. Will this work? Luckily, yes. Unlike many processors, the 80x86 does not use a zero byte for a NOP instruction. Instead, two zero bytes form the "add [bx+si], al" instruction, so if the NULL segment is executed as code, it will appear to start with eight of these instructions. Executing these instructions causes no harm, since memory will be initialized afterwards anyway. I can place "real" code right after the eight adds. This code, which I call jumpstart code, is actually located in the NULL segment (I specify its location by enclosing it in a segment . . . ends block — see Listing 3, lines 49 through 100) so that the linker can't possibly separate it from the 16 zeros.
The jumpstart code just needs to accomplish one thing, a far jump to the startup code proper (_TEXT:0). That doesn't sound difficult, but recall that exe2bin will add a base address (in this case, 0x40) to all absolute references! It doesn't know the difference between absolute code references and absolute data references, it just adds! The reference will be wrong at execution time, but I know the base (0x40), so I can subtract it to undo the effect of the incorrect fix-up. This locates the reference with respect to zero, so I must also do the correct fix-up at run time to locate the reference with respect to the beginning of the ROM image. Since self-modifying code is out of the question (morals aside, we are executing in ROM), I use the first doubleword at DS:0 temporarily for a jump vector, and compute the correct address there. Since I'm using the compact memory model, I know that the compiler won't generate any far jumps, and the trouble ends here. From this point on, it's smooth sailing into the startup code, and eventually, into your program.
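The arithmetic behind that correction is small enough to model in C. The sketch below is an assumption-laden illustration, not a translation of Listing 3: the function names and the example ROM base segment are hypothetical, and the real jumpstart code does this in assembly language using the doubleword at DS:0.

```c
#include <assert.h>
#include <stdint.h>

#define FIXUP_BASE 0x0040u  /* base given to exe2bin: the RAM data segment */

/* exe2bin added FIXUP_BASE to the segment part of the code reference.
   Subtracting it gives a segment relative to the start of the image;
   adding the ROM's base segment gives the value CS needs at run time. */
uint16_t runtime_code_segment(uint16_t stored_segment,
                              uint16_t rom_base_segment)
{
    assert(stored_segment >= FIXUP_BASE);
    return (uint16_t)(stored_segment - FIXUP_BASE + rom_base_segment);
}

/* Build the 4-byte far pointer (offset word, then segment word, both
   little-endian) that an indirect far jump reads from memory. */
void build_jump_vector(uint8_t vector[4], uint16_t stored_segment,
                       uint16_t rom_base_segment)
{
    uint16_t cs = runtime_code_segment(stored_segment, rom_base_segment);

    vector[0] = 0x00;                 /* offset 0, low byte (_TEXT:0) */
    vector[1] = 0x00;                 /* offset 0, high byte */
    vector[2] = (uint8_t)(cs & 0xFF); /* segment, low byte */
    vector[3] = (uint8_t)(cs >> 8);   /* segment, high byte */
}
```

As an example with made-up numbers: if the linker placed _TEXT 0x123 paragraphs into the image, exe2bin stores 0x163, and with a ROM based at segment 0xF800 the corrected segment comes out as 0xF923.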

Have Fun
I've presented a dirt cheap way to develop programs on a PC and embed them in ROM. I hope you don't just read the code — do something! Even if you don't have an embedded project in mind, doing something simple is worth it just for the experience and the education you'll get on the way. So snag yourself some hardware, any hardware, and teach it to say hello to the embedded world.