June 1999/Debugging Embedded Systems

Testing and Debugging

Debugging Embedded Systems

Daniel J. Wisehart

Made it through debugging 101? Then you still need debugging 102 if you're new to embedded system programming.

Even a well-seasoned C/C++ developer who has considerable experience with debugging desktop applications can have trouble debugging embedded applications. Embedded application debugging is often done through a remote debugger and without a console window. Having no console window means that you need an alternate way to get feedback on the execution (or failure) of your programs. Remote debugging is difficult because you are dealing with multiple machines simultaneously, and if the embedded system hangs or reboots you are cut off. The embedded applications themselves are stored in ROM and often execute on low-MIPS processors. These conditions require specialized debugging techniques, a few of which I present in this article. These techniques do not require special hardware (though that can be helpful), but instead use software that can be compiled and run with freely available tools.

Understanding Embedded Memory

To debug an embedded system, you must know where your program and data reside. C/C++ compilers split a program and its data into TEXT, DATA, and BSS segments, and the linker then locates these segments in RAM or ROM as appropriate (see Figure 1). The TEXT segment contains executable code, and by the time your product ships, this code will be in ROM. The DATA segment contains initialized data, and it is placed in ROM or RAM depending on how it is declared. Constant data can be put into ROM because its value is set at compile time and should not change. (Be forewarned: if you try to modify a value stored in ROM, the embedded system may crash or reboot itself. Before casting away the constness of a static variable to enable changing its value, be sure that variable isn't stored in ROM.) Non-const initialized data must reside in RAM, so its initial values are stored in ROM and copied into RAM during startup. The BSS segment contains uninitialized data, which means that it must be placed in RAM so that your program can set its value. The memory a program allocates at run time on the stack or the heap also resides in RAM.

Program and data may be moved from their default segments by compiler optimization or linker directives, so these are not hard and fast rules. Wherever they end up, you can always locate program or data segments with your debugger or a tool that lists object files, such as gnm (the GNU object file listing program). gnm will list the symbols and their associated addresses from an object file or library, but you must ensure that the run-time linking process does not relocate these symbols and change their addresses. The debugger does not suffer this limitation because it returns the run-time location of a symbol, so it is my preferred tool when I have the option. Ask the debugger to print the address of a function or variable to find out where it is stored. You can then compare that address to a hardware memory map: the address will tell you the type of memory in which the symbol is stored.
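The comparison against a memory map can also be done in code. The following is a minimal sketch, assuming a made-up memory map: the address ranges, the region_kind type, and the region_of function are all hypothetical and must be replaced with the values from your own hardware manual.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { REGION_UNMAPPED, REGION_ROM, REGION_RAM, REGION_IO } region_kind;

typedef struct {
    uintptr_t   start;  /* first address in the region */
    uintptr_t   end;    /* last address in the region (inclusive) */
    region_kind kind;
} mem_region;

/* Hypothetical memory map: these ranges are assumptions for illustration. */
static const mem_region memory_map[] = {
    { 0x00000000u, 0x0007FFFFu, REGION_ROM },
    { 0x20000000u, 0x2001FFFFu, REGION_RAM },
    { 0x40000000u, 0x4000FFFFu, REGION_IO  },
};

/* Report which kind of memory an address falls in. */
region_kind region_of(uintptr_t addr)
{
    size_t i;
    for (i = 0; i < sizeof memory_map / sizeof memory_map[0]; ++i) {
        if (addr >= memory_map[i].start && addr <= memory_map[i].end)
            return memory_map[i].kind;
    }
    return REGION_UNMAPPED;
}
```

On the target, a call such as region_of((uintptr_t)&some_variable) would tell you whether that variable landed in ROM or RAM, provided the table matches the real hardware.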

Debugging ROM-Based Code

Embedded systems contain some code in ROM, which creates certain problems for the debugger.

On a desktop machine whose running program resides completely in RAM, the debugger is free to modify executable code in a convenient manner to support the breakpoints you set. When you set a breakpoint, the debugger replaces the code the compiler created from your source with code that transfers control to the debugger at that point. When your program encounters the breakpoint, it executes the code the debugger installed, effectively transferring control from your program to the debugger. When you continue from a breakpoint the debugger transfers control back to your code again, starting at the point where your code left off. Your code then executes again until it encounters the next breakpoint and the process repeats. In ROM-based code, however, the debugger has no way to insert breakpoints because it cannot modify that code. If you are going to set breakpoints in ROM-based code, you must give the debugger a handle to something it can modify, i.e., code that is in RAM. (Some processors can generate an interrupt when a specified address is on the address bus, but I assume in this article that you do not have such a processor.)

Suppose you want to breakpoint a ROM function because you suspect that a precondition of the function is not being met. (A task that exits because of an assert is not much help in an embedded system without a console window.) Alternatively, suppose you have an error handler in ROM that is being called though it never should be, and you want to find out what code is causing the error. Your options are as follows.

While the debugger cannot change your ROM-based code, it can breakpoint code that is in RAM. So if you call a debugging function in RAM from a function in ROM, and set a breakpoint in the RAM-based code, the debugger will be able to support the breakpoint (see Figure 2). The RAM-based code does not require much functionality: all you need is something the debugger can modify and therefore breakpoint. If you design your RAM-based code carefully, a single function can serve as a general-purpose debugging function. After you have made the necessary changes to the function under test stored in ROM, check your linker manual for ways to dynamically link the debugging function in RAM with code based in ROM.

Your other alternative is to compile your debugging function into the ROM image, but have the linker move the executable code into the DATA segment in RAM. (The GNU linker gld will do this, but do not try it on a Harvard Architecture machine.) During the C/C++ initialization on the embedded system, the debugging function will be copied from ROM into RAM as part of the DATA initialization process. Now that the debugging code is back in RAM, you can breakpoint it at will. (It is also possible to move your function under test into RAM and skip any debugging function, but RAM and ROM performance can be widely different, so this may be unworkable.) Once in place, this technique will save you the effort of manually loading and linking the RAM-based debugging code every time you power cycle the embedded system. The example code in Figure 2 carries very little overhead because numeric compares are fast, and you set the conditional to true only when you are interested in trapping a particular event. Once the conditional evaluates true and a breakpoint is set in the RAM-based function, the debugger will stop the program the next time through the ROM-based function.
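As a rough illustration of the ROM-to-RAM diversion described above, here is a minimal sketch. It is not the code from the figure: the names debug_trap_enabled, debug_trap_hits, debug_trap, and rom_scale are hypothetical, and the sketch assumes that your linker setup places debug_trap in RAM.

```c
/* Set nonzero (for example, from the debugger) to start trapping. */
volatile int debug_trap_enabled = 0;

/* Counts calls so that the trap has an observable effect. */
volatile unsigned long debug_trap_hits = 0;

/* Assumed to be linked or copied into RAM; set the breakpoint here. */
void debug_trap(int reason)
{
    (void)reason;           /* visible in the debugger as an argument */
    ++debug_trap_hits;
}

/* Stands in for the ROM-based function under test. */
int rom_scale(int value)
{
    /* One cheap compare when trapping is off; the second test is the
       suspected precondition violation. */
    if (debug_trap_enabled && value < 0)
        debug_trap(value);
    return value * 2;
}
```

With debug_trap_enabled left at zero, the function under test pays only for the flag compare. Setting the flag and a breakpoint on debug_trap stops the program the next time a negative value arrives.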

Although you have to take extra steps to get here, this is what you want from a breakpoint: the ability to stop task execution and inspect memory on demand under conditions that you specify. However, once your task is suspended, do not try to step from RAM back into your ROM-based code because the debugger cannot control the program's execution once the program counter points into ROM. To see intermediate points in the ROM function you will have to create additional RAM diversions.

Depending on your compiler and debugger, you may have to take one additional step to debug ROM-based code. The debugger may try to catch errors and exceptions by setting additional breakpoints in compiler-supplied error handlers. When these error handlers end up in ROM, the debugger will complain endlessly or crash the system when it tries to modify them. In my vendor-supplied library built with gcc (the GNU C/C++ compiler) and gld, that code is in an object file named longjmp.o. I extract the error handler from the library and link it in with my RAM-based modules. My RAM-based modules then reference the RAM-based error handler, which makes the debugger happy because it is able to set breakpoints in the handler.

Debugging Intermittent Failures

Debugging intermittent failures in desktop applications is difficult. Debugging them in embedded applications is even more difficult because you have less information about them and less control over their execution. To find and fix an intermittent bug you must begin with what you already know.

If a bug shows up in a particular function, you can use that function as a starting point even if it is an effect and not the cause of the bug. First, create a breakpoint and spend several debug cycles finding the particular conditions that produce the error. Once you can delimit the error-causing cases, change the conditional to trigger on only those conditions (see Figure 3). Continue refining the cases you trap until you end up with half a dozen functions that are consistently on the call stack just before an error is detected. These functions become the next points at which to insert breakpoints and work your way back to the real cause of the error. The failing function may not be on the call stack, but seeing the sequence of events just before the detection of the error will give you clues to what is causing the error.

The breakpoint flag is included in this example also. Before the breakpoint function is called, this flag is checked in an if statement. This enables debugging to be turned off while keeping the debugging code in place with a minimal performance hit. Except for the extra conditional filtering, this type of breakpoint works the same as the earlier example.
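A filtered breakpoint of this kind might look like the following minimal sketch. The trap_filter structure, its fields, and the process_message function are hypothetical stand-ins for illustration; the real conditions are whatever delimits your error-causing cases.

```c
typedef struct {
    int      enabled;     /* master switch: off costs a single compare */
    unsigned min_length;  /* trap only lengths in [min_length, max_length] */
    unsigned max_length;
    int      channel;     /* trap only this channel, or -1 for any channel */
} trap_filter;

/* Global so that the debugger can narrow the filter between runs. */
trap_filter   filter = { 0, 0u, 0u, -1 };
unsigned long intermittent_trap_hits = 0;

/* Assumed to reside in RAM; set the breakpoint here. */
void intermittent_trap(void)
{
    ++intermittent_trap_hits;
}

/* Stands in for the function in which the bug shows up. */
int process_message(int channel, unsigned length)
{
    if (filter.enabled &&
        length >= filter.min_length && length <= filter.max_length &&
        (filter.channel < 0 || filter.channel == channel))
        intermittent_trap();
    return (int)length;   /* placeholder for the real processing */
}
```

Because the filter lives in a global structure, each debug cycle can tighten the ranges from the debugger without rebuilding the image.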

Debugging without Breakpoints

For certain types of errors, the call stack leading up to error detection will indicate the source of the bug. But for other types of errors, the call stack may tell you nothing useful. In these cases, you need an alternate method: a way to check the progress of earlier processing and to report errors before they cause problems. This is often the case for multithreaded programs, which do not have a set calling order for the overall program.

One way to handle this situation is with a separate task dedicated to finding errors. When execution passes through a function of interest, the state of that function is recorded in RAM. The state is subsequently scanned with the concurrently executing error-finding task. This technique is especially useful when debugging code that cannot be stopped; e.g., communications drivers also used by the debugger, time-critical code, interrupts that disable interrupts and task switches, and most supervisory mode functions. Instead of breakpointing the function you store the input arguments, intermediate results, stack trace — any data an error detector might be able to check — so long as the storage requirements do not exceed the limited capabilities of your embedded system.


If I am debugging a troublesome Ethernet frame processing function, I set up a circular buffer to hold copies of the incoming Ethernet frames (see Figure 4). If I have plenty of time to process the frames (as I assume in this example), I store each entire frame. Storing everything is easy to code and it ensures that whatever I might want to monitor later is available. If, on the other hand, the frames are coming at a furious pace, I keep only limited information: just enough to determine when an error is occurring, and perhaps as little as a frame count. I put the buffer and the counters into global memory so I can access them from multiple tasks and from the debugger. As I narrow the type of frames relevant to my bug, I change the filter to catch fewer frames but with a higher rate of error detection. Not only does this help track and find the problem, I can also use the resulting filter to set breakpoints as previously discussed. What started then as a method of debugging without breakpoints created an opportunity to trap the failure closer to the source.
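A circular frame log along these lines could be sketched as follows. This is an illustration under stated assumptions rather than the code from the figure: the names frame_log, log_frame, and filter_ethertype are hypothetical, the slot count is arbitrary, and the filter shown (matching on the EtherType field at byte offsets 12 and 13) is only one possible choice.

```c
#include <stddef.h>
#include <string.h>

#define FRAME_MAX 1518u  /* largest untagged Ethernet frame, in bytes */
#define LOG_SLOTS 8u     /* assumption: sized to fit the target's RAM */

typedef struct {
    size_t        length;
    unsigned char data[FRAME_MAX];
} frame_copy;

/* Globals so that other tasks and the debugger can read them. */
frame_copy    frame_log[LOG_SLOTS];
unsigned long frames_seen   = 0;  /* every frame offered to the logger */
unsigned long frames_logged = 0;  /* frames that passed the filter */

/* Hypothetical filter: keep only frames with this EtherType (IPv4). */
unsigned filter_ethertype = 0x0800u;

void log_frame(const unsigned char *frame, size_t length)
{
    frame_copy *slot;

    ++frames_seen;
    if (length < 14u || length > FRAME_MAX)
        return;                        /* no complete header, or oversize */
    if ((((unsigned)frame[12] << 8) | frame[13]) != filter_ethertype)
        return;                        /* filtered out */

    slot = &frame_log[frames_logged % LOG_SLOTS];  /* oldest is overwritten */
    slot->length = length;
    memcpy(slot->data, frame, length);
    ++frames_logged;
}
```

Comparing frames_seen with frames_logged shows how selective the current filter is, which helps when deciding whether it is narrow enough to reuse as a breakpoint condition.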

As I mentioned, this technique of storing states in RAM is often useful in debugging multithreaded applications. In those situations, my stored state also includes the process or thread ID that made the call. This will show if the thread calling order is as I expected it. If I need to know the order across multiple functions, I add the time in ticks to my captured state. All sorts of system states can be kept in RAM; use whatever helps you understand the big picture in which a failure (especially an intermittent failure) is occurring.
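Recording the caller and the time might be sketched as shown below. The names trace_entry, trace_record, and trace_first_thread_at are hypothetical, and the thread ID and tick count are passed in as arguments because the calls that supply them depend on your operating system. The sketch also assumes that calls to trace_record are serialized; on a real multithreaded system the update of trace_next would need protection, for example by briefly disabling interrupts.

```c
#define TRACE_SLOTS 16u

typedef struct {
    unsigned long tick;       /* system time in ticks */
    int           thread_id;  /* process or thread that made the call */
    int           point;      /* which function or checkpoint was reached */
} trace_entry;

/* Globals so that the error-finding task and the debugger can read them. */
trace_entry   trace_log[TRACE_SLOTS];
unsigned long trace_next = 0;    /* total entries ever recorded */

void trace_record(unsigned long tick, int thread_id, int point)
{
    trace_entry *e = &trace_log[trace_next % TRACE_SLOTS];
    e->tick      = tick;
    e->thread_id = thread_id;
    e->point     = point;
    ++trace_next;
}

/* Return the thread that reached `point` earliest among the entries
   still held in the buffer, or -1 if none did. */
int trace_first_thread_at(int point)
{
    unsigned long i = trace_next > TRACE_SLOTS ? trace_next - TRACE_SLOTS : 0;
    for (; i < trace_next; ++i) {
        const trace_entry *e = &trace_log[i % TRACE_SLOTS];
        if (e->point == point)
            return e->thread_id;
    }
    return -1;
}
```

An error-finding task can call trace_first_thread_at periodically and publish a message when the answer is not the thread it expected.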

Debugging Interrupts

Embedded systems are often heavy users of interrupts. Interrupts are fast, lightweight functions that operate asynchronously in borrowed memory space, making them notoriously difficult to debug. Before you debug an interrupt, there are several differences between interrupt code and regular code that you must be aware of. Failure to account for these differences in your debugging will cause completely new failures in addition to the bug you are trying to locate.

When an interrupt is generated, the operating system usually places the stack frame for the interrupt on top of the stack frame of the interrupted task. If you have an operating system that gives interrupts their own stack, you probably have higher interrupt latency, but you can ignore the following warning. Because interrupt code operates in borrowed stack space, it must not use a lot of it (see Figure 5).

There are any number of system functions you cannot call from an interrupt. (sprintf is usually one of them.) The list of functions unusable from an interrupt will be in your OS manual. They are forbidden because they block or slow interrupt processing too much, or because they generate additional interrupts that cannot be processed, or they are not reentrant.

Calling functions that disable interrupts is always worrisome because other time-critical tasks such as processing network traffic, servicing the watchdog interrupt, and updating the outputs of the embedded system are held off until interrupt processing is resumed. You must tread lightly because it is easy to crash a system while debugging its interrupts. Hence, debugging without breakpoints is often my first choice because it does little except add execution time to the interrupt.

An example of debugging interrupts without breakpoints may be helpful (see Figure 6). One task services the interrupts and wakes the second task. The second task looks for errors in the saved state and publishes any errors it sees. Typically, more data is generated by an interrupt than your debug code can handle, so filter early and often as soon as you can delimit what you are looking for. In this example, I do not save the incoming characters. Not only would the second task be overwhelmed inspecting every character, the interrupt would have slowed excessively to make copies of everything.
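The division of labor between the interrupt and the checking task can be sketched roughly as follows. This is a simplified illustration, not the code from the figure: the status bits, counter names, and functions are hypothetical, and the wake-up of the second task is left as a comment because it depends on your operating system.

```c
#define STATUS_OVERRUN 0x01u  /* hypothetical receiver status bits */
#define STATUS_FRAMING 0x02u

/* Written by the interrupt, read by the checking task. */
volatile unsigned long rx_count          = 0;
volatile unsigned long rx_overruns       = 0;
volatile unsigned long rx_framing_errors = 0;

/* Interrupt side: count events, keep no copies of the characters. */
void uart_rx_isr(unsigned status)
{
    ++rx_count;
    if (status & STATUS_OVERRUN)
        ++rx_overruns;
    if (status & STATUS_FRAMING)
        ++rx_framing_errors;
    /* A real handler would wake the checking task here. */
}

/* Task side: return how many errors have appeared since the last call. */
unsigned long check_new_errors(void)
{
    static unsigned long reported = 0;
    unsigned long total = rx_overruns + rx_framing_errors;
    unsigned long fresh = total - reported;
    reported = total;
    return fresh;
}
```

The interrupt does nothing but increment counters, so the time it adds is small and predictable. All of the slower work, such as formatting and publishing a message, stays in the checking task.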

Finding Missing Interrupts

One of the more interesting problems to debug is the case of a missing interrupt. Here an asynchronous event occurs, but apparently no corresponding interrupt is generated. The first thing I do is set up code to publish a message whenever an interrupt is missed. I discover that one was missed by checking for too much time between interrupts (for clock-driven events) or for a jump in a counter that should increment smoothly with each event (see Figure 7).
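Both checks, the time gap and the counter jump, fit in one small routine. The following is a minimal sketch with hypothetical names (miss_detector, miss_check); it assumes the interrupt supplies a sequence counter and the current tick count, and it treats an overlong gap as one missed event because the gap alone cannot say how many were lost.

```c
typedef struct {
    unsigned long last_sequence;  /* counter value at the previous event */
    unsigned long last_tick;      /* time of the previous event, in ticks */
    unsigned long max_gap_ticks;  /* longest gap considered normal */
    int           primed;         /* 0 until the first event is seen */
    unsigned long missed;         /* running total of events judged missing */
} miss_detector;

/* Call once per serviced event; returns how many events were missed
   since the previous call. */
unsigned long miss_check(miss_detector *d,
                         unsigned long sequence, unsigned long tick)
{
    unsigned long lost = 0;

    if (d->primed) {
        /* Unsigned subtraction stays correct when the counters wrap. */
        unsigned long step = sequence - d->last_sequence;
        if (step > 1)
            lost = step - 1;                  /* the counter jumped */
        else if (tick - d->last_tick > d->max_gap_ticks)
            lost = 1;                         /* too long between events */
    }
    d->last_sequence = sequence;
    d->last_tick     = tick;
    d->primed        = 1;
    d->missed       += lost;
    return lost;
}
```

A nonzero return value is the cue to publish a message, and the missed field gives the debugger a running total to inspect.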

I use a busy-wait in the error-finding code to show a simple way to quickly process messages. A busy-wait often works fine for debugging often-occurring interrupts, but it is detrimental to system performance, so expect that it may create other errors. If a busy-wait is a problem, go back to waiting on a mutex.

In many cases when an interrupt appears to be missing, it was actually generated but was not processed soon enough. For example, if a second event occurs before the first interrupt is processed, the second event is lost. If you suspect this problem, try turning off or stripping down all the other interrupts. I might go so far as to replace my interrupt code with code that does nothing but store the state in a buffer and wake regular code that can process the event.

When All Else Fails

When you are lost beyond the point of meaningful progress, it can be highly advantageous to have a logic analyzer (or oscilloscope) to assist you in your embedded debugging. You will still have to write the code that makes the analyzer give you a meaningful result, but a little hardware can save you a lot of wasted time.

The general technique is very simple: use the analyzer as an output device, even if it is only a single bit of output information. The particulars of what you control depend on your embedded hardware, but I often use memory-mapped control lines wired to LEDs. There is rarely any harm or much load created by turning status LEDs off and on. While they will probably flicker too fast for your naked eye to follow, a suitable analyzer can show you every transition. How you toggle an LED depends on your particular hardware, but my example is written for 16 lines of GPIO (General Purpose I/O) that map into a single 16-bit memory location (see Figure 8). I am careful to change only the bit I am interested in, as simple assignment would set (and perhaps change) the state of every GPIO line. When I want to create an output event, I XOR in the bit value for the line I am controlling. Because I use XOR instead of assignment by AND and OR, I can add and delete edges wherever I like within the source and without having to change many other edges. In the example, I create back-to-back edges to indicate the beginning and the end of debugging. These transitions also give me a chance to measure how much time I am stealing by generating an edge. With carefully placed edge generating code, the logic analyzer becomes my window into system execution.
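The XOR technique can be illustrated with a short sketch. It is not the code from the figure: the bit position and the names gpio_reg, debug_edge, and debug_marker are hypothetical, and an ordinary variable stands in for the memory-mapped register so that the sketch runs anywhere. On real hardware gpio_reg would instead be a volatile access to a fixed address taken from your memory map.

```c
#include <stdint.h>

/* Stand-in for a 16-bit memory-mapped GPIO register. */
volatile uint16_t gpio_reg = 0;

/* Hypothetical choice: the debug LED is wired to GPIO line 5. */
#define DEBUG_LED_BIT ((uint16_t)(1u << 5))

/* Generate one edge on the debug line, leaving the other 15 lines alone. */
void debug_edge(void)
{
    gpio_reg ^= DEBUG_LED_BIT;
}

/* Two back-to-back edges: a narrow pulse that marks a point of interest
   and shows on the analyzer how long one edge takes to generate. */
void debug_marker(void)
{
    debug_edge();
    debug_edge();
}
```

Note that the XOR is a read-modify-write sequence. If an interrupt can change the same register between the read and the write, the update needs to be protected, for example by briefly disabling interrupts around it.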

Conclusion

While embedded debugging involves unique techniques, they are techniques that any C/C++ developer can learn and use effectively. Embedded debugging will never be as easy as desktop debugging, but it does not have to be as painful as its reputation suggests.

Daniel Wisehart is a contract programmer currently working for Hewlett-Packard Laboratories in Palo Alto, CA. Besides his computer interests, he is an analog and digital hardware designer whose past projects include people moving transportation systems. He can be reached at dwisehart@yankee.us.com.