April 2001/A Handy Debugging Class

Testing and Debuggin

A Handy Debugging Class

Maurice Fox

Tracing, debugging, and timing — all in one package and dynamically controllable with simple, intuitive control strings.

From time to time, most of us need to look inside a running program to see what’s going on, either by tracing function calls or inspecting the values of variables. This article shows a way to do that and more using a flexible C++ Debug class as is or extending it to meet your own needs. Debug provides for tracing, debugging, interval timing, and output control. It provides optional function timing in connection with tracing. You can control the Debug object at run time using control strings, as I explain later in this article. You can use conditional compilation to adjust some of the Debug object’s characteristics or even disable it entirely.

I developed Debug using gcc on OS/2. A number of my team members have used it in AIX and Visual C++ environments. It should work on most platforms that support the Standard C++ library with little or no porting effort. The keyword and control string approach was inspired by a somewhat similar C package written by Fred Fish. This C++ implementation, however, is totally original.

The Interface

Listing 1 shows the Debug class’s public interface. To use the package, #include the Debug header file, and then create a Debug object inside each function that you are interested in. The constructor and destructor for that object take care of tracing entry and exit of the function, timing the function call if you like. You can use one of the overloaded member functions, Debug::Dbg, to display text and the value of a variable. These functions have been overloaded for the built-in types int, const char *, and float. (There is an additional member function Debug::Dbx to display an unsigned long in hexadecimal.) You can use a pair of static functions to time arbitrary intervals without depending on Debug objects.

By default, the Debug class produces no output. To get output, you need to call the static function Debug::specify to pass in a control string. You can call Debug::specify as often as you like for fine-grained control of the class.

Debugging Features

The debugging package provides three major functions: tracing, debugging, and interval timing. To use these features in a program, you first add code to relevant portions of your program to construct Debug objects and to call Debug member functions; you then enable or disable the features by passing keywords to the debugging package at run time, in the form of control strings. The following is a description of these major functions.

Tracing

In this article, tracing means printing a line of data upon entering and exiting a function. To get tracing functionality, construct a Debug object at or near entry to the function. Pass the name of the function as an argument string to the Debug object’s constructor. Tracing output will be printed if tracing is enabled and either of two conditions is true: a keyword that matches the function name has been supplied by the user, or the collection of keywords is empty.

Having tracing on causes tracing and debugging output to be indented by two spaces for each level of function call up to a fixed limit, currently 12. Upon execution of the Debug object constructor, a line of data goes to the output, which looks something like this:
=> fred
When the program returns from the function or otherwise exits the scope in which the Debug object was constructed, the object’s destructor runs and inserts a line of data into the output, which looks something like this:
<= fred
Thus, no special effort is needed for exit tracing.

Function timing is an optional capability included with tracing. You select function timing at compile time by choosing the correct constructor for creating the Debug object at the beginning of the function. If you use the constructor with only one parameter, you don’t get timing. If you use the one with two parameters, you get timing. The second parameter is a bool, which you can think of as “raw timing.” If its value is true, you get the time since the Epoch (or whatever your OS uses) upon entry and the elapsed time of the function on exit. This time is purportedly in seconds and milliseconds, but its precision depends upon the clock tick rate on your system. If the second parameter to the constructor is false, you get user-friendly time in the style of asctime, which is what the Debug class uses.

For example, if you write Debug D("fred"); you get:
=> fred

<= fred
If you write Debug D("fred", true); you get:
=> fred 901238947.234

<= fred         0.877
If you write Debug D("fred", false); you get:
=> fred Sat May 13 15:39:23 2000

<= fred Sat May 13 15:39:24 2000
Debugging

Debugging in this context simply means printing a line of data to the output. (As opposed to “tracing,” which happens on function entry and exit, debugging causes immediate output in response to a call to one of the Dbg or Dbx functions.) To get debugging output, use an overloaded Dbg member function or the Dbx member function of a previously constructed Debug object. See the function prototypes in Listing 1 for the available choices. In Dbx and each variant of Dbg, the first argument is treated as the keyword that controls debugging. The line that is printed contains indentation (if tracing is in effect); the keyword is followed by a colon and then the requested output. The line is printed if debugging is enabled and either of two conditions is true: a keyword that matches the keyword passed as an argument has been supplied by the user, or the collection of keywords is empty.

The package provides a small set of generally useful Dbg functions. If these don’t meet your needs, you can use the given functions as a pattern to write your own. Another alternative is to use a single output stream, as described below.

Interval Timing

Interval timing allows you to measure the elapsed time between any two points in the program. You provide for interval timing by placing a call to Debug::TimeStart at the start of the interval and a call to Debug::TimeEnd at the end of the interval. These are static functions, which means that you need not create a Debug object to invoke them. (You may invoke them through a Debug object if you like, however.) It also means that the start and end points of the time interval(s) being measured need not be in the same function. Interval timing is controlled by keywords in the same fashion as tracing or debugging. Each keyword is timed independently, so the intervals being timed can overlap.

You can use one TimeStart call to mark a single point and then a number of TimeEnd calls with the same keyword to get a series of elapsed times from that single start point. If you call TimeEnd without a preceding call to TimeStart with the same keyword, you will get the same result as if you had called TimeStart immediately followed by TimeEnd. If you call TimeStart for the same keyword several times without intervening calls to TimeEnd, the package uses the last one called as the start of the timing interval. This won’t break anything and might even be useful.

Control Strings

Control strings direct the actions of the package. A control string is a sequence of non-blank characters of the form [Anything]Option[Option*]. (Here, text enclosed in square braces represents an optional expression. An asterisk following an expression represents zero or more occurrences of the expression.) Option is a sequence of the form X:[keyword_list]. X represents a single non-blank character other than a comma or colon. keyword_list represents an optional sequence of “words” separated by commas. Each “word” is a sequence of non-blank characters other than a comma or colon. The first option can be preceded by Anything, a sequence of non-blank characters not containing a colon, which the package ignores. You will see the reason for allowing a string of Anything under the section, Using the Debug Class.

The following is a syntactically valid control string:
D:fred,wilma,logT:O:debug.out
So is this one:
DEBUGD:fred,wilma,logT:O:debug.out
In the first example, the options are D:, T:, and O:. Three keywords, fred, wilma, and log, follow the option D:; no keywords follow the option T:; and one keyword, debug.out, follows O:. In the second example, the arbitrary string DEBUG precedes the first option, which is again D:.

All options, except for O:, come in pairs. The capital letter turns something on and remembers any keywords supplied. The corresponding lowercase letter turns the same thing off and forgets any remembered keywords. The next few paragraphs explain the options that the package recognizes.

Control String Options

T: Enables tracing. The package saves the keywords that follow this option and matches them against the function name provided to a Debug object constructor. If the package finds a match, it traces the named function at entry and exit. (This assumes that you have already modified the named function to create such a Debug object within its body.) If no keywords are in effect, the package traces all instrumented functions at entry and exit. If you give this option with keywords several times, the keywords you specified are added to the list of keywords recognized. You can specify the same keyword several times; that doesn’t accomplish anything special, but it doesn’t hurt anything, either.

t: Disables tracing. This turns tracing off and clears the remembered keywords. Any keywords that follow this option are ignored, but cause no error. For example, you could type the control string T:main,f1 at one point in the program. Only the functions main and f1 would be traced. Later on, if you type the string t:T:, it will first have the effect of disabling tracing and clearing the keywords and then enabling tracing for all functions. All other options, except O:, work the same way.

D: Enables debugging. The program saves keywords that follow the option and matches them against the keyword provided to each call to the member function Dbg (or Dbx). If it finds a match, it prints a line to the output. If no keywords are in effect, all calls to Dbg (or Dbx) result in lines being printed. As with T:, entering this option several times with keywords causes the keywords to accumulate.

d: Disables debugging. This turns debugging off and clears the list of remembered keywords. The same logic described above for t: applies, only to debugging.

M: Enables interval timing. The program saves keywords that follow the option and matches them against the keyword provided to each call to the static functions TimeStart and TimeEnd. If the program finds a match, it prints a line to the output. The function TimeStart starts timing the interval associated with the keyword and prints the raw time. The function TimeEnd stops timing the interval associated with the keyword and prints the elapsed time since the last call to TimeStart for that keyword. If no keywords are in effect, all calls to TimeStart and TimeEnd result in lines being printed. As with T:, giving this option several times with keywords causes the keywords to accumulate. Note that TimeStart and TimeEnd are static functions. This means that they do not depend upon a Debug object, so you can use them to time arbitrary operations, even between parent and child processes in a Unix-like environment.

m: Disables interval timing. This turns interval timing off and clears the remembered keywords. The same logic described above for t: applies, only to interval timing.

O: Output control. Provide a keyword after the O: to select the output destination, as follows: - (single dash) selects standard output (cout); -- (double dash) selects buffered standard error (clog). Anything else becomes the filename for an open call. The package opens the file for appending, so you can switch from one file to another and back again without losing data. This capability also allows you to do such things as limit the size of log files. If you specify several keywords (useless but harmless), only the last one seen has any effect.

It is not necessary to supply an O: option to get output. By default, the package is completely inactive, but its output is preset to clog. If you just choose D:, T:, or M:, you will get output to clog.

Using the Debug Class

Making Debug work for you comes down to three things: organizing your view of your program, coding useful function calls, and passing in control strings to make output appear.

Organizing your view of the program consists of selecting the keywords you will use when invoking constructors and other member functions of class Debug. For a Debug constructor, I routinely use the name of the function it’s used in; you may find reason to do otherwise. For debugging, you could use the same keyword everywhere, but it’s better to use a number of keywords, one for each major topic of the program. That way, when you get one aspect of the program under control, you can use the keywords in the control string to see only the parts you still need. The package prints the keyword on each line of output — this makes it easy to use tools like grep to sift through the output for specific information. For interval timing, you can use multiple keywords, or possibly just a single keyword to establish a start time. You could then use a text string in the Debug::TimeEnd call to tell the different places apart.

The package gets its control strings through the static function Debug::specify, which takes as its argument a C-style string pointer. You can set up your test program to read the first control string from the command line when the program starts, by scanning the argv[] parameter to main. Listing 2 shows a couple of examples of this approach.

Considering the first example, while in the development phase, you could run the program by typing something like -DEBUGD:T: on the command line, which would trace and debug everything to clog. A line like -DEBUGT:D:A,B,CO:debug.out would trace everything; debug only those lines with keywords A, B, or C; and put the data into the file debug.out. If you don’t provide a control string on the command line, the program will produce no debugging output.

Another possibility is to use the package for logging. The second example shows a way to do that by using a default control string that will be in effect for production, but enables you to override it from the command line for development.

In a situation where getting data from the command line is not feasible, such as a GUI project, you can use an environment variable to pass in a control string.

Flushing Output

In the interest of performance, the package does not flush its output after each line. It uses “\n” at the end of each output statement instead of endl. The static function Sync allows you to flush the output stream at will. This can be useful when several processes are writing to the same file, as can happen when a process forks a child in a Unix environment. In a Unix environment, you may want to do something like this:
D.Sync();
pid = fork();
or
Debug::Sync();
pid = fork();
Compile-Time Options

Multiple Keywords

This option allows you to specify several keywords, separated by commas, (no spaces!) for a Dbg, Dbx, TimeStart, or TimeEnd function call. Then, if any one of the keywords is in effect (or no keywords, as usual), you get the requested effect. This option could have some performance impact if you use a lot of keywords, so you select it at compile time, not run time. Define the symbol DEBUG_MULT_KWDS when compiling Debug.cpp to get this behavior. If you use this capability you can write something like this:
D.Dbg("fork,cache,file","Value of x is",x);
and get debugging output when any of the keywords fork, cache, or file has been specified, or when no keyword has been specified for debugging.

If you write such a line when using the version that does not recognize multiple keywords, you won’t get any output if any keyword is specified; The code that parses control strings will not create a keyword with embedded commas, so that “keyword” will never be matched.

Using Windows/DOS Drive Names

If you want to use Debug in a DOS-like environment and need to include the drive name as part of the output file path, you have a problem because the colon used to identify options clashes with the colon in the drive name. To make a version that uses a semicolon to identify options instead of a colon, compile with the symbol DEBUG_DOS_FORMAT defined. All of the foregoing documentation still applies, except that where it reads colon, mentally substitute semicolon. Then you can use a control string like:
D;T;O;D:\test\debug.out
and get the output to the DOS file D:\test\debug.out. This would not work in a Unix environment, where semicolons separate commands.

Using a Single Output Stream

You can send all output through the same stream if you wish. To do this, compile with the symbol DEBUG_1STREAM defined. Internally, this option works by logically sending all output to the clog stream, changing the actual destination by using the rdbuf member function of the basic_ios class.

If you use this option, any clog calls made directly within your program will go wherever you have sent the output. Doing so allows you to put arbitrary data into the debugging output stream without writing your own Dbg calls. You can write output to clog with a line like this:
clog << "Checkpoint " << n << " reached, X is now " <<
     x << endl;
and it will go wherever you have selected using the O: option. This could be simpler than writing your own overloaded Dbg function.

Because output goes through clog, you can write something else to cerr, and it will go to unbuffered standard error, not to the debugging output. Of course, if you have chosen to have output go to clog, they will both go to the same destination. If you do not use the single output stream feature, whatever you write to clog or cerr will go to the same destination, intermingled with debugging output only if you have specified O:--.

Compiling the Package Away

Normally, the presence of the debug code will have negligible effect on the performance of the application. If you really need to eliminate debugging in production code, you can compile a new “no-op” version by defining the symbol DEBUG_OFF both for Debug.cpp and your own code. Usually you’re just kidding yourself when you do that, though.

Test/Demonstration Program

You can get a standalone test/demo program by compiling Debug.cpp (to download, see www.cuj.com/code/) all alone with the symbol TESTING defined. Feed it control strings on the command line and watch what happens.

Summary

This article has described the Debug class, a useful tool that you can use directly or extend as you need. It provides run-time control of its actions, to as fine-grained a level as you wish. Through conditional compilation, you can change several fundamental aspects of its behavior. As sample code, the Debug class demonstrates encapsulation, static class members, and several components of the Standard C++ library. I hope you find it as useful as my teammates and I have.

Maurice Fox is a software development manager for a large system integrator. He has been programming for longer than he cares to disclose, using C since 1984, and using C++ for several years. He can be reached at mauricef@ix.netcom.com.