A Safer setjmp in C++

Philip J. Erdelsky


Philip J. Erdelsky is an R&D engineer with more than eight years of experience in C programming. He can be reached at Data/Ware Development, Inc., 9449 Carroll Park Dr., San Diego, CA 92121, or on CompuServe at 75746,3411.

Calling setjmp to mark a place in your C program and then calling longjmp to return to it, even from a deeply nested function call, is considered useful but slightly hazardous. If control has already left the function that called setjmp, longjmp will usually crash. Yet careful use of this perverse pair is an efficient way to transfer control when an error detected in a low-level function must be handled at a much higher level. The alternative is to pass the error condition laboriously up the chain of functions, testing a flag at each step. With longjmp, however, you must be careful not to go flying past an intermediate function's garbage collection, leaving memory blocks unreleased, files open, and motors running.
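For readers who have not used the pair, here is a minimal sketch of the basic mechanism. The names recovery_point, descend, and run are hypothetical, chosen only for this illustration. The jump is safe because run, the function that called setjmp, is still active when longjmp is called.

```cpp
#include <cassert>
#include <csetjmp>

std::jmp_buf recovery_point;   // hypothetical name for the marked place

// Recurses `depth` levels, then either returns normally or jumps out.
void descend(int depth, bool fail) {
    assert(depth >= 0);
    if (depth > 0) {
        descend(depth - 1, fail);
        return;
    }
    if (fail)
        std::longjmp(recovery_point, 1);   // does not return
}

// Returns 0 on the normal path, 1 if control came back through longjmp.
int run(int depth, bool fail) {
    if (setjmp(recovery_point) != 0)
        return 1;      // run() is still active here, so the jump was safe
    descend(depth, fail);
    return 0;
}
```

No flag is tested at any of the intermediate levels; the single longjmp call unwinds all of them at once.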

Surprisingly, setjmp and longjmp are still available in C++, and they still work the same way. However, C++ provides a way around the principal danger of setjmp and longjmp.

Technique

The most important part of this technique is to encapsulate the jmp_buf buffer in a class object that has a constructor and a destructor:

#include <setjmp.h>

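class error
{
public:
 error *previous;        // next-older buffer; public so a handler can pass an error to it
 static error *current;  // buffer the next error exit will use
 jmp_buf jmpbuf;
 error() { previous = current; current = this; }  // link into the chain
 ~error() { current = previous; }                 // unlink from the chain
};

error *error::current;   // zero-initialized: no buffer is active at startup

The class has one static member, error::current, which points to the current buffer, the one an error exit will use to make its escape.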
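The following sketch shows how the constructor and destructor keep error::current pointing at the innermost live buffer. The function walk_chain is hypothetical and exists only to check the pointer at each step; no jump is made, so the jmpbuf members are never used here.

```cpp
#include <cassert>
#include <csetjmp>

class error {
    error *previous;
public:
    static error *current;
    std::jmp_buf jmpbuf;
    error() { previous = current; current = this; }
    ~error() { current = previous; }
};

error *error::current = nullptr;

// Hypothetical check: the chain of buffers behaves like a stack.
void walk_chain() {
    assert(error::current == nullptr);       // nothing active yet
    {
        error outer;
        assert(error::current == &outer);    // outer is now the escape route
        {
            error inner;
            assert(error::current == &inner);
        }                                    // inner destroyed here
        assert(error::current == &outer);    // outer is current again
    }                                        // outer destroyed here
    assert(error::current == nullptr);       // chain is empty again
}
```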

If a function needs to handle an error discovered at a lower level, just begin it this way:

void f(...)
{
   <variable declarations>
   error error_x;      // calls constructor for error_x
   if (setjmp(error_x.jmpbuf))
   {
      <garbage collection>
      return;          // calls destructor for error_x
   }
   <function code>
   <garbage collection>
}                      // destructor for error_x also called here

When the buffer error_x is created, its constructor is called. The constructor code links the buffer to a chain of previous error buffers, if any, and makes it the one pointed to by error::current. The call to setjmp in the if statement saves the context in the jmpbuf member and returns zero, so processing continues with the function code.

When a lower level function detects an error, it calls

longjmp(error::current->jmpbuf, 1);

This brings control back to the setjmp call, which now returns 1, so the garbage-collection code inside the if statement runs. When the return statement is executed, the destructor for error_x is called, the buffer is removed from the chain, and any subsequent longjmp(error::current->jmpbuf, 1) will use the previous buffer, not this one. Moreover, if control returns from the function f without detecting an error, the destructor for error_x is called automatically. These are the principal safety features of this technique.
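The pieces so far can be put together in a small runnable sketch. The function low_level and the counter handled are hypothetical stand-ins for a real low-level routine and real garbage collection. In both the error case and the normal case, error::current is back to its old value once f has returned.

```cpp
#include <cassert>
#include <csetjmp>

class error {
    error *previous;
public:
    static error *current;
    std::jmp_buf jmpbuf;
    error() { previous = current; current = this; }
    ~error() { current = previous; }
};

error *error::current = nullptr;

int handled = 0;   // hypothetical counter standing in for <garbage collection>

// Hypothetical low-level function: a negative input counts as an error.
void low_level(int input) {
    if (input < 0) {
        assert(error::current != nullptr);   // some handler must be active
        std::longjmp(error::current->jmpbuf, 1);
    }
}

// High-level function following the pattern in the text.
void f(int input) {
    error error_x;                    // constructor links the buffer in
    if (setjmp(error_x.jmpbuf)) {
        ++handled;                    // error path
        return;                       // destructor unlinks error_x
    }
    low_level(input);
}                                     // destructor unlinks on the normal path too
```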

Actually, an important part of the garbage collection is performed automatically by C++. The destructors for other variables, if any, are called automatically.
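As an illustration, the sketch below gives f a second local object whose destructor must run. The struct resource, the counter released, and the function fail are hypothetical. The object is declared before error_x and is not modified after the setjmp call, and the function that calls longjmp has no local objects of its own, so nothing is skipped by the jump.

```cpp
#include <cassert>
#include <csetjmp>

class error {
    error *previous;
public:
    static error *current;
    std::jmp_buf jmpbuf;
    error() { previous = current; current = this; }
    ~error() { current = previous; }
};

error *error::current = nullptr;

int released = 0;   // hypothetical counter of destructor calls

// Hypothetical resource holder; its destructor is the "garbage collection".
struct resource {
    ~resource() { ++released; }
};

// Hypothetical low-level failure with no local objects of its own.
void fail() {
    assert(error::current != nullptr);
    std::longjmp(error::current->jmpbuf, 1);
}

void f(bool should_fail) {
    resource r;        // destroyed automatically on either return path
    error error_x;
    if (setjmp(error_x.jmpbuf)) {
        return;        // runs ~error for error_x, then ~resource for r
    }
    if (should_fail)
        fail();
}
```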

If garbage collection is needed in some intermediate function, you have to be a little more careful. You might put something like this at the head of the intermediate function:

error error_x;
if (setjmp(error_x.jmpbuf))
{
   error *p = error::current->previous;   // buffer of the next-higher handler
   <garbage collection>
   error_x.~error();                      // explicit destructor call unlinks error_x
   <calls on other destructors>
   longjmp(p->jmpbuf, 1);                 // pass the error up the chain
}

You can safely reuse the name error_x because each is local to its own function.

Here is where you run into a little difficulty. C++ is very good about calling destructors for local objects when control leaves their scope, but not when exit is made via longjmp. Therefore, the destructors for error_x and other local variables, if any, must be called explicitly before bailing out.
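The clean-up-and-pass-it-on sequence can be sketched in runnable form under one loudly stated assumption. Current ISO C++ leaves longjmp undefined when the jump would skip a non-trivial destructor, so this sketch swaps the class for a hypothetical plain struct, frame, with explicit push and pop functions doing the work of the constructor and destructor. All the names here are hypothetical, and this is a variation on the technique, not the technique itself.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

// Hypothetical stand-in for the class: same links, no constructor or destructor.
struct frame {
    frame *previous;
    std::jmp_buf jmpbuf;
};

frame *current = nullptr;

void push(frame &fr) { fr.previous = current; current = &fr; }
void pop(frame &fr)  { current = fr.previous; }

int blocks_in_use = 0;   // tracks the intermediate function's allocation

void low_level(bool fail) {
    if (fail) {
        assert(current != nullptr);
        std::longjmp(current->jmpbuf, 1);
    }
}

// Intermediate function: releases its own block, then passes the error on.
void intermediate(bool fail) {
    void *block = std::malloc(64);   // allocated before setjmp, unchanged after
    ++blocks_in_use;
    frame fr;
    push(fr);
    if (setjmp(fr.jmpbuf)) {
        std::free(block);            // explicit garbage collection
        --blocks_in_use;
        pop(fr);                     // explicit unlinking
        std::longjmp(current->jmpbuf, 1);   // on to the previous buffer
    }
    low_level(fail);
    std::free(block);                // normal path
    --blocks_in_use;
    pop(fr);
}

// Top-level handler; returns true if the error arrived here.
bool top_level(bool fail) {
    frame fr;
    push(fr);
    if (setjmp(fr.jmpbuf)) {
        pop(fr);
        return true;
    }
    intermediate(fail);
    pop(fr);
    return false;
}
```

The price of the plain struct is that every exit path, normal or not, must remember to call pop, which is exactly the drudgery the destructor removes on the normal path.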

You can avoid this drudgery if you are willing to sacrifice portability and are using a compiler like Borland C++ 2.0 that uses negative frame addressing. Just use the function declare_error, followed by an immediate return, as shown in Listing 1.

Negative frame addressing makes it possible for a carefully coded (but admittedly nonportable) function like declare_error to find the return address and patch it. When the following return statement is executed, the destructors for error_x and other local variables, if any, are called, and then control returns not to the calling function but to the special function error_exit, which then calls longjmp. Of course, this involves some additional run-time overhead, but error exits are not common and need not be fast.

The function declare_error, followed by an immediate return statement, can also be called when the error is first discovered, if the function from which it is called needs to call destructors for some of its local variables.

Even if you have to do this for every intermediate function, you can still save a lot of code if each intermediate function is called from many places.

Occasionally, an error occurs in a non-function block and needs to be handled in a larger enclosing block. This technique doesn't work so well because there is no practical way for declare_error to find the block exit. However, the old break and goto statements will usually suffice in such cases.

Limitations

This technique isn't foolproof. It can still fail if an error is declared in a function called by a constructor or initializer for a variable declared before error_x:

void f(...)
{
  class something old;
  int x = g();        // an error declared in g finds no buffer for f yet
  error error_x;
  if (setjmp(error_x.jmpbuf))
  {
    declare_error();
    return;
  }
  <function code>
}

If an error is declared in g, control will pass to a higher level, and the destructor for old won't get called.

Don't even think of trying to get around this by declaring error_x first. If an error is declared in g, an attempt will be made to pass control with an uninitialized error_x.jmpbuf.

If memory deallocation is the only garbage collection involved, a more sophisticated memory allocator should be able to straighten things out, but such a technique is beyond the scope of this article.

This is a safer setjmp, about the safest available in C++. For a much safer setjmp, you have to use other languages that may keep you from doing what you really want to do.