Debugging Real-Time Production Software

C/C++ Users Journal June, 2004

Quick turnaround for debugging in unfriendly environments

By Bill Pyritz

Bill Pyritz has worked for Lucent Technologies (formerly AT&T Bell Labs) for over 17 years designing and developing real-time telecommunication systems. He can be contacted at pyritz@lucent.com.

Debugging real-time software is difficult. Many times, a problem occurs only under special circumstances or under a particular system load. Debugging these types of problems typically requires data to be examined under certain conditions. For instance, the code works 99 percent of the time, but I need to look at the value of x and, if it's greater than 54, dump the abc structure and take a look at it. Most importantly, you want to do this in a running production system without stopping or disrupting the application.

In this article, I present a solution to this problem. This approach is currently being used to add essential field debugging capabilities to a real-time 3G wireless application. In the process, I introduce a debug thread responsible for coordinating the setting, enabling, disabling, and removing of breakpoints in an application process. The breakpoints themselves are written in assembly language and assembled on the production machine. This implies that all the debugging can be done on the target machine, eliminating the need for a special library to be created on a development machine, shipped to the target machine, and installed. This facilitates quick turnaround times for debugging problems, especially in unfriendly environments. The breakpoints themselves cause minimal disruption to the running system and, therefore, can be enabled/disabled while the application is running under load. Although I use a Solaris SPARC target for the example implementation, the general principles can be applied to other architectures as well.

Background

There are many tools available for debugging. Many offer sophisticated GUIs and advanced capabilities for walking through code and setting breakpoints directly from source listings. These tools are wonderful and make debugging much easier, but most of them present problems when used with real-time applications.

A better solution is one that allows the setting of low-overhead, self-contained breakpoints on running target systems. Most debuggers require the process to be stopped, which is unacceptable for real-time applications. With the approach I present here, the process is not stopped and the overhead is simply whatever you need to do in the breakpoint. You can think of it as inserting code into your application dynamically. In short, latency is minimal. Other benefits include the ability to gather specific data on the target machine where and when the problem occurs without needing to ship special software. Additionally, temporary fixes could be installed using breakpoints in the interim between finding the fix and installing the official software update. What follows is a description of one possible means of realizing this type of debugging capability.

Debug Thread

The solution I propose calls for adding a debug thread to each application process. Figure 1 presents a high-level view of this. The debug thread receives commands from the user-debugging process (most likely a command-line program) and carries them out within the context of the application process. The following commands are supported:

When a breakpoint is enabled, the debug thread overwrites the instruction at the breakpoint location with a jump instruction to the breakpoint code. The breakpoint code is responsible for duplicating the overwritten instruction(s) if they are still needed, which leaves it to the user's discretion whether the original code path is resumed or bypassed when the breakpoint exits.
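The patching step can be sketched as follows, assuming the jump is a SPARC V8 call instruction (the walkthrough later in the article uses a call). The names encode_call, Patch, enable, and disable are hypothetical and do not come from the code package. A real implementation must also make the text page writable and flush the instruction cache, both of which are omitted here.

```cpp
#include <cassert>
#include <cstdint>

// SPARC V8 CALL: opcode bits 01 in the top two bits, followed by a 30-bit
// signed word displacement relative to the address of the call itself.
// Hypothetical helper, not taken from the article's code package.
std::uint32_t encode_call(std::uint32_t pc, std::uint32_t target)
{
    assert(pc % 4 == 0 && target % 4 == 0);   // instructions are word aligned
    std::uint32_t disp30 = ((target - pc) >> 2) & 0x3FFFFFFFu;
    return 0x40000000u | disp30;
}

// Remembers the overwritten word so the breakpoint can be disabled later.
struct Patch {
    std::uint32_t* site;    // instruction word being replaced
    std::uint32_t  saved;   // original instruction word
};

// Overwrites the word at site with a call to target; pc is the address
// that site has inside the running application.
Patch enable(std::uint32_t* site, std::uint32_t pc, std::uint32_t target)
{
    Patch p{site, *site};
    *site = encode_call(pc, target);
    return p;
}

// Restores the original instruction.
void disable(const Patch& p) { *p.site = p.saved; }
```

Because only one aligned 32-bit word changes, enabling and disabling amount to a single store each, which fits the goal of patching a process that keeps running.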

The debug thread lies dormant within the process until a command is issued. Commands are sent from users to the debug thread via UDP messages. A database (which could simply be shared memory, the registry, and so on) maintains a mapping from the application process ID to the UDP port. The user process can be as simple or as complex as necessary, ranging from a message-passing routine to a system-wide breakpoint administrator. In the administrative role, it may coordinate breakpoints within a complex of target systems.
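One way to picture a command datagram is a small text record, sketched below. The wire format, the Request type, and the encode/decode helpers are assumptions made for illustration. The command names are taken from the operations mentioned in this article, but the actual message layout in the code package may differ.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Commands named in the article; the numeric values are arbitrary.
enum class Command { Set, Enable, Disable, Remove, GetSymbolAddress };

// Hypothetical request record carried in one UDP datagram.
struct Request {
    Command       cmd;
    int           index;    // breakpoint index, e.g. 0
    std::string   symbol;   // function name, e.g. "main" (no spaces)
    std::uint32_t offset;   // byte offset within the function
};

// Serializes a request as space-separated text.
std::string encode(const Request& r)
{
    std::ostringstream out;
    out << static_cast<int>(r.cmd) << ' ' << r.index << ' '
        << r.symbol << ' ' << r.offset;
    return out.str();
}

// Parses a datagram; returns false if it is malformed.
bool decode(const std::string& text, Request& r)
{
    std::istringstream in(text);
    int cmd = 0;
    if (!(in >> cmd >> r.index >> r.symbol >> r.offset)) return false;
    if (cmd < 0 || cmd > static_cast<int>(Command::GetSymbolAddress)) return false;
    r.cmd = static_cast<Command>(cmd);
    return true;
}
```

A text format keeps the user process trivial to write and lets the debug thread reject malformed datagrams before touching the application.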

Breakpoints are defined in assembly language and assembled on target machines. This is clearly for expert users only. (Later, I'll discuss some ways to make these tools available to a wider group of users.) Users must disassemble the application code to determine exactly where to set the breakpoint. Most good compilers offer an option that displays the assembly code interleaved with the source code, making it easier to find and understand correct breakpoint locations. With a little practice, you will find it almost pleasurable to engage the program at this level of detail.

A Solaris SPARC Implementation

There are three main components to the Solaris SPARC implementation of the approach I've just described:

The entire prototype code for this implementation is available for download at http://www.cuj.com/code/.

Figure 2 shows the shared memory configuration. As you can see, it is organized into a set of process and breakpoint blocks. This separation allows the same breakpoint to be applied to multiple processes. Each debug thread is assigned a process block and a UDP port for communication. The breakpoint is defined by the function name and offset and has two states: enabled or disabled (see the SharedBreakPointDataInterface in the code package for details).
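The layout can be pictured with a pair of plain structures. This is only a sketch under stated assumptions: the field names, array sizes, and the helpers port_for and define_breakpoint are invented for illustration and do not reproduce SharedBreakPointDataInterface. Mapping the structure into real shared memory is also left out.

```cpp
#include <cstdint>
#include <cstring>

const int kMaxProcesses   = 16;   // sizes are arbitrary for this sketch
const int kMaxBreakpoints = 32;

struct ProcessBlock {
    std::int32_t  pid;        // 0 marks a free slot
    std::uint16_t udp_port;   // port the debug thread listens on
};

struct BreakpointBlock {
    char          function[64];  // function name, e.g. "main"
    std::uint32_t offset;        // byte offset into the function
    bool          enabled;       // the two states: enabled or disabled
    bool          in_use;
};

// Process and breakpoint blocks are kept separate, so one breakpoint
// definition can be applied to several processes.
struct SharedArea {
    ProcessBlock    processes[kMaxProcesses];
    BreakpointBlock breakpoints[kMaxBreakpoints];
};

// Returns the debug thread's port for pid, or 0 if pid is not registered.
std::uint16_t port_for(const SharedArea& area, std::int32_t pid)
{
    if (pid == 0) return 0;
    for (const ProcessBlock& p : area.processes)
        if (p.pid == pid) return p.udp_port;
    return 0;
}

// Fills breakpoint slot index; a newly defined breakpoint starts disabled.
bool define_breakpoint(SharedArea& area, int index,
                       const char* function, std::uint32_t offset)
{
    if (index < 0 || index >= kMaxBreakpoints) return false;
    BreakpointBlock& b = area.breakpoints[index];
    std::strncpy(b.function, function, sizeof b.function - 1);
    b.function[sizeof b.function - 1] = '\0';
    b.offset  = offset;
    b.enabled = false;
    b.in_use  = true;
    return true;
}
```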

Figure 3 is a high-level view of the UDP-based communication mechanism. The debug thread sits on a UDP port and waits for commands. The command-line process acquires the correct UDP port of the debug thread by examining the shared memory block. Commands are sent to the debug thread using the UDPClient::write() method. The client then switches roles and becomes a server waiting for a response from the debug thread. The debug thread processes the command and sends a reply to the user process. Using UDP has many advantages including its simplicity and the ability to construct a network of target systems that can be debugged from a central location without requiring nailed-up connections.
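The request/reply exchange can be sketched with POSIX datagram sockets. The functions below are hypothetical stand-ins for the UDPClient class and the debug thread's receive loop. They are restricted to the loopback address and echo the command back in the reply, purely to keep the example self-contained.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>
#include <string>

// Creates a UDP socket bound to 127.0.0.1:port (0 lets the OS choose) and
// stores the port actually bound in port. Returns the descriptor or -1.
int open_udp(std::uint16_t& port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    timeval tv = {2, 0};   // receive timeout so recvfrom cannot block forever
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    socklen_t len = sizeof addr;
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) < 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(fd);
        return -1;
    }
    port = ntohs(addr.sin_port);
    return fd;
}

// User side: send one command to the debug thread's port.
bool send_command(int fd, std::uint16_t port, const std::string& cmd)
{
    sockaddr_in to;
    std::memset(&to, 0, sizeof to);
    to.sin_family = AF_INET;
    to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    to.sin_port = htons(port);
    return sendto(fd, cmd.data(), cmd.size(), 0,
                  reinterpret_cast<sockaddr*>(&to), sizeof to)
           == static_cast<ssize_t>(cmd.size());
}

// Debug-thread side: receive one command and reply to whoever sent it.
bool serve_one(int fd)
{
    char buf[512];
    sockaddr_in from;
    socklen_t len = sizeof from;
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         reinterpret_cast<sockaddr*>(&from), &len);
    if (n < 0) return false;
    std::string reply = "OK " + std::string(buf, static_cast<std::size_t>(n));
    return sendto(fd, reply.data(), reply.size(), 0,
                  reinterpret_cast<sockaddr*>(&from), len) >= 0;
}

// User side again, now in the server role, waiting for the reply.
std::string read_reply(int fd)
{
    char buf[512];
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0, nullptr, nullptr);
    return n < 0 ? std::string() : std::string(buf, static_cast<std::size_t>(n));
}
```

Since no connection is set up, the same user process can talk to debug threads on many targets simply by changing the destination address and port.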

Consider this application:

void func1(int y)
{
    if( y != 3 ) 
        printf("y = %d\n", y);
}
int main(int argc, char** argv)
{
    while(1)
    {
        func1(3);
        sleep(2);
    }
}

Now take a look at the disassembly of function main:

main()
   save   %sp, -104, %sp
   st     %i1, [%fp + 72]
   st     %i0, [%fp + 68]
   mov    3, %l0
   call   func1
   or     %l0, %g0, %o0
   ...

Suppose you want to set a breakpoint in the main while loop that modifies the argument passed to func1 such that the printf() executes and outputs the value 75. The following steps are taken:

  1. Determine that you want the breakpoint placed at the call func1 instruction of main. From the disassembly, the function offset is determined to be 16 (that is, main + 16 bytes, where each instruction is 4 bytes). I use breakpoint index 0 since this is the first breakpoint I am setting on the target.
  2. To call func1 from within your breakpoint code, you need to determine its virtual address within the application by sending a Get Symbol Address command to the debug thread. Its address is 0x11700.
  3. Create the file brkpt0.s that contains the breakpoint assembly code:

        jmptobrkpt0:
            ! advance the register window
            save    %sp,-112,%sp
            ! calling func1() addr 0x11700
            sethi   %hi(0x11700), %o0
            or      %o0, 0x300, %o0
            jmpl    %o0, %o7
            ! delay slot: set the single argument to 75
            mov     75,%o0
            ret
            restore

     The %o (output) registers carry function call arguments. Further, the jmpl instruction always executes the instruction following it (its delay slot) before control reaches the jump target. You can take advantage of these details by placing a mov instruction after the jmpl that effectively changes the integer parameter passed to func1 to 75.
  4. Next, assemble the code on the target and move the resultant object file into the predefined breakpoint directory. In this case, the object file is named brkpt0.o.
  5. You are now ready to set the breakpoint by issuing the Set Breakpoint command to the debug thread. In addition to the process ID, I specify breakpoint index 0, function main, and offset 16. The debug thread loads the object file brkpt0.o into the application's address space, then calculates the address within main where the breakpoint is to be set and saves it in shared memory. You can update the assembly code at any time and reload it into the application. That is, once the breakpoint is set, it isn't locked in the application forever. You can instruct the debug thread to set the same breakpoint as many times as necessary. Each time, the debug thread closes the existing object file and loads the updated version.
  6. Finally, enable the breakpoint by issuing an Enable Breakpoint command to the debug thread. The debug thread overwrites the instruction at that address in main with a call instruction to the breakpoint code. From this point forward, when the code in main is executed, the breakpoint executes, the value of the argument to func1 is 75 rather than 3, and the printf() executes.
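The constants in brkpt0.s and the offset of 16 follow from simple arithmetic, which the sketch below spells out. The helper names are hypothetical. The split itself is how SPARC builds a 32-bit constant: sethi supplies the upper 22 bits and the or immediate supplies the lower 10.

```cpp
#include <cstdint>

// Value left in the register by "sethi %hi(addr), reg": the upper 22 bits
// of addr, with the low 10 bits cleared.
std::uint32_t hi22(std::uint32_t addr) { return addr & ~0x3FFu; }

// Immediate needed by the following "or" to supply the low 10 bits.
std::uint32_t lo10(std::uint32_t addr) { return addr & 0x3FFu; }

// Byte offset of the n-th instruction (counting from 0) in a function;
// every SPARC instruction is 4 bytes long.
std::uint32_t offset_of_instruction(std::uint32_t n) { return n * 4u; }
```

For func1 at 0x11700, hi22 yields 0x11400 and lo10 yields 0x300, which is why the or instruction in the listing uses 0x300. The call to func1 is the fifth instruction of main (index 4), giving the offset of 16.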

Admittedly, this example is simple and not very practical, but it demonstrates the potential of the mechanism. Debugging at the assembly-code level is powerful. When you couple that with a supporting infrastructure as described here, solving problems in real-time environments can be greatly simplified.

Navigating the Code Package

I wrote the code accompanying this article using a test-first approach. Therefore, the best way to understand the code is to start with the tests, which can be found in the file GenericUtilitiesTest.C. You will find acceptance and unit-level tests defined for all the major functionality. Start at the end of the file where main() is defined and work your way through the tests. Hopefully, this gives you a good basis for navigating through the package and understanding the implementation.

The code is organized to take advantage of some code-generation techniques. You will find that the major functional components are defined in <name>Impl.h files. A code generator is run against these files and produces a <name>.h and <name>.C file for each, respectively. These files contain the public interfaces only and provide a loosely coupled interface to the implementation code using redirection. I wrote the code generator (gen.rb in the package) using the Ruby language (http://www.rubycentral.com/).
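The redirection idea can be illustrated with a forwarding class. This is a guess at the general shape only: BreakPointImpl and BreakPoint are hypothetical names, and the output actually produced by gen.rb is not shown in this article and may look quite different.

```cpp
#include <memory>
#include <string>

// What a hand-written <name>Impl.h might contain (hypothetical example).
class BreakPointImpl {
public:
    BreakPointImpl(const std::string& function, unsigned offset)
        : function_(function), offset_(offset), enabled_(false) {}
    void enable()  { enabled_ = true; }
    void disable() { enabled_ = false; }
    bool enabled() const { return enabled_; }
    const std::string& function() const { return function_; }
    unsigned offset() const { return offset_; }
private:
    std::string function_;
    unsigned    offset_;
    bool        enabled_;
};

// What a generated <name>.h and <name>.C pair might boil down to: a public
// interface that forwards every call through a pointer to the implementation.
class BreakPoint {
public:
    BreakPoint(const std::string& function, unsigned offset)
        : impl_(new BreakPointImpl(function, offset)) {}
    void enable()        { impl_->enable(); }
    void disable()       { impl_->disable(); }
    bool enabled() const { return impl_->enabled(); }
    const std::string& function() const { return impl_->function(); }
    unsigned offset() const { return impl_->offset(); }
private:
    std::unique_ptr<BreakPointImpl> impl_;
};
```

In a real split, the forwarding bodies would live in the generated .C file, so clients that include only the public header need not be recompiled when the implementation changes.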

Future Considerations

What I have been describing is a raw interface that requires expert knowledge of the underlying target architecture. It would be desirable to build an additional layer of abstraction on the described infrastructure that enables a larger group of users to utilize the capabilities. One of the ways to achieve this would be to define a domain-specific language that abstracts the details of the underlying architecture. Users could then specify breakpoints using symbolic names rather than addresses and registers. The aforementioned example breakpoint might be described as follows:

whenenter main line 5
{
    call func1(75)
    goto line 6 // skip func1(3) call
}

This code would be compiled into assembly code similar to what I wrote in this example.

As a start, I have built a simple debugging environment that lets users specify breakpoints and send commands to the debug thread using a simple language that I created. I developed a Ruby script to provide the user interface and to perform all the parsing (Ruby has powerful parsing capabilities). Ruby provides a nice interface to external C/C++ code, which lets me directly access all the C++ interface classes that I developed for communication with the debug thread. The language I developed still requires the user to understand architecture details such as registers and addresses, but would automatically generate the assembly code and produce the object files. The goal is to eventually let users specify breakpoints using symbols and line numbers. You can take a look at the grammar in the code package (the file is named "grammar").
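Generating the assembly from a higher-level description is mostly text templating. The sketch below is in C++ rather than Ruby and is not the author's script. The function generate_breakpoint is a hypothetical helper that reproduces the shape of the brkpt0.s listing for a call to a function taking one small integer argument.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Emits SPARC assembly for a breakpoint that calls the function at addr
// with a single integer argument and then returns to the patched code.
// Hypothetical helper for illustration only.
std::string generate_breakpoint(int index, std::uint32_t addr, int arg)
{
    // "mov imm, reg" takes a 13-bit signed immediate.
    assert(arg >= -4096 && arg <= 4095);
    char text[512];
    std::snprintf(text, sizeof text,
        "jmptobrkpt%d:\n"
        "    save    %%sp,-112,%%sp\n"
        "    sethi   %%hi(0x%x), %%o0\n"
        "    or      %%o0, 0x%x, %%o0\n"
        "    jmpl    %%o0, %%o7\n"
        "    mov     %d,%%o0\n"
        "    ret\n"
        "    restore\n",
        index,
        static_cast<unsigned>(addr),
        static_cast<unsigned>(addr & 0x3FFu),   // low 10 bits for the or
        arg);
    return text;
}
```

Calling generate_breakpoint(0, 0x11700, 75) yields text equivalent to the hand-written listing (minus its comments), which could then be written to brkpt0.s and assembled on the target.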

Most of the ideas presented here are still evolving. Take a look at the implementation package at http://www.cuj.com/code and feel free to send me any comments, suggestions, or questions that you might have.