Debugging


Debugging with Exceptions

Alessandro Vesely


A. Vesely has a laurea in General Mathematics and has been programming for the last 12 years. He produces software on commission, does data processing, and consults in Information Systems. He may be reached via e-mail at mc6192@mclink.it or via surface mail at via Anelli 13, 20122 Milano, Italy.

Introduction

I briefly review the most common approaches to debugging, to point out that they are rather inadequate for exception handling. If you decide to handle exceptions, you may well be disappointed when debugging your application. I would still encourage people to use exception handling, and I believe this feature will get better, but a warning is appropriate.

For the pioneering use of exceptions that is possible today, I describe a common approach to debugging. If you haven't used exceptions yet, knowing the typical debugging problems may affect how you design your application. (I'm talking about regular applications, not featuring special fault tolerance, nor even smart recovery from exceptions.) Typically, the need to handle exceptions comes from using class libraries that throw them. Not handling exceptions, which is the default, results in aborted execution. A simple thing one wants to do instead is to die gracefully. In such cases, the most relevant design issues about exception handling are related to debugging.

A naive approach to building a debug version is to insert printf/getchar pairs in difficult sections of the code. Some programmers throw away that code after they are done with it. Some just comment it out, having learned there is no shame in making code easier to understand. Sound familiar? Since difficult-to-write sections are likely to require later maintenance, sooner or later, the common approach is to delimit the debugging statements with directives like:

#if !defined(NDEBUG)
   /* debugging statements */
#endif
When the amount of output grows too large, you then define a flag for conditionally outputting the relevant data. A plethora of macros have been provided for structuring debug output. Beginners often don't use them, but a typical macro is exemplified in Listing 1.
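A minimal sketch of such a macro (the names here are hypothetical, not those of Listing 1): it compiles away entirely in retail builds, and a run-time flag gates the output in debug builds.

```cpp
#include <cstdio>

// Debug-output macro: expands to nothing in retail builds; in debug
// builds, a run-time flag (which you can toggle from the debugger)
// decides whether each message is actually printed.
#if !defined(NDEBUG)
extern bool trace_flag;
#define TRACE(msg) \
    do { if (trace_flag) \
        std::fprintf(stderr, "%s(%d): %s\n", __FILE__, __LINE__, msg); \
    } while (0)
#else
#define TRACE(msg) ((void)0)
#endif

bool trace_flag = false;

int half(int n)
{
    TRACE("entering half");   // printed only when trace_flag is set
    return n / 2;
}
```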

The other relevant debugging macro is assert. As far as tying the code back to the design is concerned, assertions are more fun to write than, say, headers. You can use them to state preconditions and postconditions for each function. Many programmers adopt the habit of systematically writing assertions, even where they are not strictly required. It is the programmer's responsibility to ensure that asserted expressions have no side effects.
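For instance, a function's precondition and postcondition can be stated directly as assertions (isqrt here is just an illustration):

```cpp
#include <cassert>

// Integer square root with its contract stated as assertions.
// Note that the asserted expressions have no side effects.
int isqrt(int n)
{
    assert(n >= 0);                     // precondition
    int r = 0;
    while ((r + 1) * (r + 1) <= n)
        ++r;
    assert(r * r <= n && n < (r + 1) * (r + 1));   // postcondition
    return r;
}
```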

Let me also mention debugging memory allocation functions. One approach is to classify and enumerate all memory allocation calls. You can take snapshots of the memory pool and compare them with similar snapshots you took previously, getting all objects added since. For each object, you can dump its class, sequential allocation number, and possibly its contents and the source line from which it was allocated. In the debugger, you can set the sequential number of the memory allocation at which you would like to break. Also, newly allocated memory is pre-filled with fixed non-zero values, in order to minimize randomness.
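A minimal sketch of such instrumented allocation (all names are hypothetical; a real implementation would hook operator new and record the class and source line as well):

```cpp
#include <cstdlib>
#include <cstring>
#include <cstddef>

// Each allocation gets a sequential number -- you can break in the
// debugger when alloc_count reaches a given value -- and the block is
// pre-filled with a fixed non-zero pattern to minimize randomness.
static long alloc_count = 0;
const unsigned char FILL_BYTE = 0xCD;

void *debug_alloc(std::size_t size)
{
    ++alloc_count;
    void *p = std::malloc(size);
    if (p != 0)
        std::memset(p, FILL_BYTE, size);   // reproducible garbage
    return p;
}

void debug_free(void *p)
{
    std::free(p);
}
```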

Debugging Tools

It is a pleasure to write applications with modern tools. In addition to browsing capabilities, syntax colored editing sessions, and other productivity oriented functions, Integrated Development Environments (IDEs) provide integrated debugging and relieve developers from maintaining make files. The latter feature leads naturally to building "debug versions" of applications, which is a major aid in testing.

You may step into a debugger to learn what the compiler did with your source code. It is more practical than reading the assembly language listing. Further, the debugger is better than any browser in reading source code following the control flow. This helps you understand what happens at run time and lets you anticipate many potential bugs. And the debugger can also be used to actually catch bugs.

How do you do that? There is no scientifically comprehensive approach. Some use test tools that simulate predetermined input. But even then, test input is not being determined automatically and seldom includes "machine input," such as the amount of free resources. More commonly, you tinker with the program, using all allowed commands and options. At a very minimum, you must try the main commands and check that they behave as intended.

Normally, when an assertion fails the program aborts. Likewise, the program aborts when an exception is not caught, and in a few other odd places, such as when a destructor called during stack unwinding in turn throws an exception. In general, the program terminates when an exception is thrown and the exception handling mechanism finds its memory structures corrupted. Unlike assertions, this feature is not removed when you build the retail version. You may want to control your application's behavior in this area by using set_terminate and set_unexpected.

Debuggers are much more handy when they stop at a failed assertion, as they let you examine and repair the relevant memory structures, then continue. In this respect, assertions can be viewed as logical breakpoints that one is free to spread throughout the code. So, it is a good idea to evaluate the expression assert(false) in your terminate function. New debuggers provide you with the ability to conditionally break execution when exceptions are thrown. This feature may seem strange: bad parameters and similar misunderstood usage of functions should be trapped by assertions. So why would you want to break when an exception is thrown?
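Putting the two ideas together, a terminate handler might look like this sketch (the handler name is hypothetical): in a debug build, assert(false) gives the debugger a chance to stop with the stack still intact before the program dies.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <exception>

// Terminate handler: report, then give the debugger a logical
// breakpoint via assert(false); std::abort is reached only in
// retail builds, where assert compiles away.
void graceful_terminate()
{
    std::fputs("uncaught exception -- terminating\n", stderr);
    assert(false);          // logical breakpoint for the debugger
    std::abort();
}

std::terminate_handler install_terminate()
{
    // set_terminate returns the previously installed handler
    return std::set_terminate(graceful_terminate);
}
```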

One answer is that there is a twilight zone in the boundary between what can be trapped by assertions and what requires an exception to be thrown. It covers resources that may get corrupted, such as files, databases, and other executables your app interacts with. As an example, consider calling a polymorphic function with a parameter coming from a corrupted file. In early debugging stages you don't yet fully understand the file format. You would like to learn how to prepare the test file rather than test what happens when the file gets corrupted. If you don't use exceptions, you may assert the file is not corrupted, to ease debugging. You must then test what happens when the assertion fails and execution continues.

More generally, Stroustrup [1] has proposed a templatized inline function Assert that is compiled in both debug and retail builds. It is to be called from const member functions to throw an exception if any class invariant is corrupted. Regular functions would call the invariant-checking function of each passed parameter of non-primitive type, instead of directly asserting invariants.
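A sketch along those lines (Account and BadInvariant are illustrations of my own, not Stroustrup's code): unlike the assert macro, this Assert survives in retail builds and throws an exception instead of aborting.

```cpp
// Templatized Assert: throws the given exception type when the
// asserted condition is false, in debug and retail builds alike.
struct BadInvariant { };

template<class Exception, class Assertion>
inline void Assert(Assertion assertion)
{
    if (!assertion)
        throw Exception();
}

class Account {
    long balance_;
public:
    Account(long b) : balance_(b) { Assert<BadInvariant>(invariant()); }
    bool invariant() const { return balance_ >= 0; }
    long balance() const { return balance_; }
    void withdraw(long amount)
    {
        balance_ -= amount;
        Assert<BadInvariant>(invariant());   // check the class invariant
    }
};
```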

Debugging Exceptions

Exceptions are an evolution of the setjmp/longjmp function pair of Standard C, which has been on the scene for much longer. A structured version of these functions can be found in the older versions of MFC (Microsoft Foundation Classes 1.x and 2.x). It is a set of macros to be used with a try/catch-like syntax. (In MFC 2.x, TRY expands to

{ jmp_buf x; if(setjmp(x)==0){
and CATCH to

}else if(rtti_check(y)){
sorts of constructs. An END_CATCH macro closes the block.) Stack unwinding is not performed when you do a structured longjmp, even though it is an integral part of throwing an exception.

Using structured longjmp, you are better off handling exceptions in each function to be sure to destroy objects. In this respect, it is easier to use local pointers to objects built on the heap rather than to declare them on the stack, as pointers can be easily deleted and reset to zero. By contrast, explicitly calling an object destructor may require you to maintain a flag to avoid destroying the same object twice. Why? Because even though you've destroyed the object, you cannot prevent the compiler from calling the destructor again.

To handle an object that requires before/use/after semantics (e.g. file open/close, memory alloc/free), you must code every simple function

void f()
{
   class_w_dtor obj;
   // use obj
}
with a structured longjmp:

void f()
{
   class_w_dtor *pobj = 0;
   TRY
   {
      pobj = new class_w_dtor;
      // use *pobj
   }
   CATCH_ALL(e)
   {
      delete pobj;
      THROW_LAST();
   }
   END_CATCH_ALL
   delete pobj;
}
When using exception handling, by contrast, you need additional clean up logic only if you have local pointers that need to be deleted. [See "Checked Pointers for C++" by Robert Mashlan, elsewhere in this issue. — pjp] Hence, it is easier to build objects on the stack. That way most functions don't need to handle exceptions (at least, not for cleanup purposes).

If you use exceptions, you must try the main paths through your code and check that they behave as intended when an exception is thrown. Hence, you must provide some means to actually throw exceptions. In principle, you would need to fill all memory and hard disks, corrupt all files, busy out all telephone lines, uninstall executables, et cetera, before you start each debugging session. However, many tests don't really require the condition that triggers the exception to be met. It suffices to throw an exception.

A practical approach is to use a macro such as EXCEPTIONS_CAN_BE_THROWN, which may be defined as in Listing 2. I started to use the TRY and CATCH macros as they are provided by the MFC library. In debug builds, I modify them in order to provide that exceptions can be thrown. The idea is to prepare a list of throw points and then test how a given command fails in each case.
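The following sketch shows the idea without the MFC macros (all names are hypothetical, not those of Listing 2): each throw point is numbered in sequence, and a global selector decides which one, if any, actually throws during a run.

```cpp
#include <stdexcept>

// Forced throw points: EXCEPTIONS_CAN_BE_THROWN marks a spot where a
// real failure could occur; in debug builds, the selected point
// throws, so each failure path of a command can be tested in turn.
#if !defined(NDEBUG)
static int g_throw_point = 0;       // which point throws; 0 = none
static int g_current_point = 0;
#define EXCEPTIONS_CAN_BE_THROWN() \
    do { if (++g_current_point == g_throw_point) \
             throw std::runtime_error("forced exception"); } while (0)
#else
#define EXCEPTIONS_CAN_BE_THROWN() ((void)0)
#endif

void set_forced_throw_point(int n)
{
#if !defined(NDEBUG)
    g_throw_point = n;
#else
    (void)n;                        // no forcing in retail builds
#endif
}

// A command with two throw points; returns false if it failed.
bool save_command()
{
#if !defined(NDEBUG)
    g_current_point = 0;            // restart the count for this run
#endif
    try {
        EXCEPTIONS_CAN_BE_THROWN(); // e.g. the file-open might fail
        EXCEPTIONS_CAN_BE_THROWN(); // e.g. the write might fail
    } catch (const std::runtime_error &) {
        return false;               // clean up and report failure
    }
    return true;
}
```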

For example, you might run a command without forcing exceptions the first time. The command should succeed, and the macros should help you prepare the throw points list for that command. Next you test the same command with an exception being thrown from each throw point in turn. (I provide a bare bones implementation of this in ThrowPointTest(), on the code disk.) See [2] for a more complete example, featuring a debugger-like dialog that allows you to choose which exception to throw from where. Schorr considers all the exception types defined in MFC. A general tool should use the exceptions listed in the exception specification interface of each function.

If your design is conceived to ease debugging, throwing an exception should ensure that the throw points coming later in the list are not reached. If you don't meet this requirement, you should consider forcing exceptions to be thrown for each possible combination of the throw points in the list. (The number of combinations is roughly 2 to the number of throw points.) You may need to catch and re-throw an exception in order to do additional clean up or add diagnostic information. Re-throw points don't show in the list. Handling should be as straightforward as possible, as it really is a nuisance to have bugs in an exception handler.

In general, the consequence of a throw is that the command fails. The only sensible thing I find to do is to ensure that each exception be reported to the user at the point where the most relevant information is available. It may seem clever to just "eat" an exception so as not to disrupt the main flow of control with the failure of a minor, non-vital task. In my opinion, that's not very smart. The run-time overhead involved in throwing an exception is often large, so a task that repeatedly fails, though requiring no user intervention, will degrade performance. Also, if you accept the idea that debugging troubles grow exponentially with every ignored exception, eating exceptions is just a recipe for creating slow applications with random behavior.

If you want to be smarter and implement possible recovery actions, I suggest that you first restore the normal flow of control and return a failure code from the relevant command function. As a consequence of getting the failure code you may call any recovery function. Recovery functions should be self-contained in order to be tested separately.

How do you know that everything is okay after an exception has been handled? The answer depends on what is everything and what is okay. For a command-line utility, everything is the file system and okay is to abort safely, possibly leaving a message in some log file. For many interactive applications, everything also includes the heap, resources, the screen layout, and the settings of a number of memory variables that may condition further execution. Okay means that relevant variables are as they were before issuing the failed command. Most probably, typical users will repeat the same command with more attention, causing it to fail in the same way, until they believe that it is not their fault.

You shouldn't have memory leaks in between retries, so it is a good idea to automate memory-pool checking. If you consistently allocate heap objects for each resource, checking the memory pool is more significant. In any case, the state of the objects that existed before the exception may also be relevant and should be checked, either automatically or manually in the debugger.

What stands out in such a debugging scenario is the need for a global Exception Monitor. In the simple case, a static Boolean that is set on throw and cleared after handling should suffice. (See class CExceptionMonitor in Listing 2 for an example exception monitor.) The exception monitor is notified via the macros used to throw and catch exceptions. What about exceptions thrown from the linked code? If I exclude playing non-portable tricks and/or rebuilding the library, I miss them. In this case the exception monitor only makes sense in the debug build, when most exceptions are thrown using the macros.
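In the simple case, the monitor reduces to something like this sketch (hypothetical names, not those of Listing 2's CExceptionMonitor):

```cpp
// A minimal exception monitor: a static Boolean set by the throwing
// macros and cleared by the catching ones, so any code can ask
// whether an exception is currently pending.
class ExceptionMonitor {
    static bool pending_;
public:
    static void OnThrow() { pending_ = true; }
    static void OnCatch() { pending_ = false; }
    static bool ExceptionPending() { return pending_; }
};

bool ExceptionMonitor::pending_ = false;
```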

On the other hand, knowing whether the stack is being unwound is handy even outside of debugging. Destructors that throw exceptions are likely to benefit from such information. Consider writing to a file. If you get an error when you attempt to close the file, it may be that the file is not properly written. It is fair to throw an exception in order to let the user know. On the other hand, you typically close the file from the file's destructor. You are aware that an exception can be thrown, but you may be unable to handle it in the Close member function. Possibly, you don't know the file name or its purpose, both of which are typically needed to display a suitable message to the user. Things work all right until you destroy a file as a consequence of stack unwinding caused by another exception. In the latter case you should call the Abort member function. But how do you know that another exception has been thrown? An Exception Monitor might not be such a bad idea.

Conclusion

To force an exception to be thrown is a common debugging approach that allows comprehensive testing. However, the exceptions thrown are only simulated. Hence, on the one hand, the exception was thrown as the result of a tester's decision and not because something really happened. If you write to a log file on a disk-full exception, you may be surprised by the real-life behavior. Finer tests have to be carried out by causing real resource consumption. On the other hand, if you find it difficult to debug exception handling, you're not alone. Be prepared for possible surprises when the exception is thrown from the middle of your library's code rather than from the end of the try blocks. You are only responsible for your code, but coping with library and compiler bugs and with draft C++ Standard inconsistencies may sometimes be necessary.

I have noticed that many debugging practices are essential in providing the ability to build complex applications that work (almost) correctly. However, most of those practices depend on the compiler and/or the libraries one uses and, hence, are intrinsically non-portable. Exception handling is in the draft C++ Standard and has a twofold relation with such practices: it demands more care to be debugged itself, but on the other hand it can be used to standardize debugging practices.

In conclusion, if you use exception handling in developing your application, I urge you to reconsider your ideas about debug builds. Adopt the macros that suit your needs, but don't write too much code. The next version of your compiler may implement some of the functionality that seems the most urgent.

Fault-tolerant applications, a major reason for introducing exceptions into the language, will be easier to develop when debugging support has its place in the draft C++ Standard and the exception handling machinery itself has been debugged. To get there, we first have to use exception handling in "normal" applications.

References

[1] B. Stroustrup. The Design and Evolution of C++ (Addison Wesley, 1995).

[2] Karl Schorr. "An Exception Debugger for Visual C++ v1.5," Windows/DOS Developer's Journal, June 1995.