February 1994/Standard C

Standard C

The Header <exception>

P. J. Plauger

P.J. Plauger is senior editor of The C Users Journal. He is convenor of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Standard C Library, and Programming on Purpose (three volumes), all published by Prentice-Hall. You can reach him at pjp@plauger.com.

Introduction
This is the fourth installment in a series on the draft standard being developed for the C++ library. (See "Standard C: Developing the Standard C++ Library," CUJ, October 1993, "Standard C: C++ Library Ground Rules," CUJ, November 1993, and "Standard C: The C Library in C++," CUJ, December 1993.) I am slowly working my way through the entire library, much as I did with the Standard C library in years past. The major difference is that this journey travels far more uncharted territory. So I endeavor to provide consistent landmarks along the way. For example, here, once again, is the overall structure of the draft C++ library standard:
(0) introduction, the ground rules for implementing and using the Standard C++ library
(1) the Standard C library, as amended to meet the special requirements of a C++ environment
(2) language support, those functions called implicitly by expressions or statements you write in a C++ program
(3) iostreams, the extensive collection of classes and functions that provide strongly typed I/O
(4) support classes, classes like string and complex that pop up in some form in every library shipped with a C++ compiler
Thus far I have described (0) introduction and (1) the Standard C library. I have to confess, however, that I'm aiming at a moving target. Both those items have changed in an important regard, thanks to the introduction of namespaces into C++. Namespaces went into the language at the March 1993 meeting in Munich. The library adopted a style for using namespaces at the July 1993 meeting in San Jose. But I won't confuse you by rehashing those topics just to make them current. That would be an endless battle, these days. You can expect changes in C++ every four months for the next year or two. We can only hope that the changes will grow ever smaller as the draft C++ standard converges to final form.
More important, namespaces are so new that no commercial compiler I know of supports them yet. Nor are many likely to do so in the near future. And even when they're more widely available, you can bet that backward compatibility will remain an issue. I expect that namespaces will remain under the hood for some time to come. In the medium to long term, they are most likely to be of interest only to the more sophisticated C++ programmers.
So I will begin, as I intended, with a new topic this month, (2) language support. It contains no small amount of news in the area of exceptions, which have been implemented by several C++ compilers. And it deals with issues of much wider interest, such as the global operator new and operator delete, plus a few newer odds and ends. For this installment, I focus only on the library support for exception handling.
This section of the draft C++ library standard is called language support because it, more than any other part of the C++ library, is on rather intimate terms with the language proper. A compiler will generate code that uses the classes declared in this section, or that implicitly calls functions defined here. In the case of exception classes, you will also find quite a few fellow travelers, not directly connected to language support. But the tie is there, nevertheless.

Exceptions
Exceptions represent a significant departure in C++ from past programming practice in C. Much of what you write in C++ translates one for one to very similar C code. The rest may get longer winded, and a bit harder to read, but it's still conventional C. Exception handling, however, changes the underlying model of how functions get called, automatic storage gets allocated and freed, and control gets passed back to the calling functions.
A compiler can generate reasonably portable C code to handle exceptions, but that code can have serious performance problems — even for programs that don't use exceptions. The very possibility that an exception can occur in a called function changes how you generate code for the caller. Alternatively, a compiler can generate code directly that can't quite be expressed in C — and face a different set of problems. It may be hard to mix such C++ code with that generated from C or another programming language. Perhaps you can see now why C++ vendors have generally been slow to add this important new feature to the language.
What makes exception handling important is that it stylizes a common operation expressible in C only in a rather dirty fashion. You can think of exception handling, in fact, as a disciplined use of the notorious functions setjmp and longjmp, declared in <setjmp.h>. (Strictly speaking, setjmp is a macro, but let's not pursue that distraction for now.)
In a C program, you call setjmp at a point to which you expect to "roll back." The function memorizes enough context to later reestablish the roll-back point, then returns the value zero. A later call to longjmp can occur anywhere within the same function or a function called from that function, however deep in the call stack. By unspecified magic, the call stack gets rolled back and control returns once again from setjmp. The only difference is, this time you can tell from the nonzero return value that a longjmp call initiated the return.
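The roll-back pattern just described can be sketched in a few lines. This is only an illustration: the names recovery_point, parse_record, and process_record are invented here, and the code deliberately has no objects with destructors in the path of the jump.

```cpp
#include <csetjmp>
#include <cstdio>

std::jmp_buf recovery_point;   // context memorized by setjmp

// Stands in for code buried several calls deep (hypothetical helper).
static void parse_record(int value) {
    if (value < 0)
        std::longjmp(recovery_point, 1);   // roll back; setjmp "returns" 1
    std::printf("record %d accepted\n", value);
}

// Returns 0 on success, -1 if parse_record bailed out through longjmp.
int process_record(int value) {
    if (setjmp(recovery_point) != 0)
        return -1;                         // second return, caused by longjmp
    parse_record(value);                   // first return from setjmp was zero
    return 0;
}
```

Note that setjmp is called directly in the controlling expression of the if statement. The C Standard restricts where the macro may appear, and this is one of the permitted contexts.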
That all adds up to a clever bit of machinery, used to pull off all sorts of error recovery logic over the past couple of decades. The only trouble is, it's too clever by half. Many implementations have trouble determining how to roll back all the automatic storage properly. The C Standard is obligingly vague on the subject, making life easier on the implementors at the expense of being harder on those wishing to write portable and robust code. Nobody pretends that <setjmp.h> is an elegant piece of design.
In C++, matters are much worse. That language prides itself on cradle-to-grave control of objects, particularly nontrivial ones. You are assured that every object gets constructed exactly once, before anybody can peek at its stored values. And you are promised with equal fervor that every object gets destroyed exactly once. Thus, you can allocate and free auxiliary storage for an object with a discipline that ensures no files are left open, or no memory gets lost, in the hurly burly of execution.
longjmp sabotages the best efforts of C++ compilers to maintain this discipline. In rolling back the call stack, the older C function cheerfully bypasses all those implicit calls to destructors strewn about by the C++ compiler. Promises get broken, files remain open, storage on the heap gets lost. The draft C++ standard leaves <setjmp.h> in the library for upward compatibility. But it discourages the use of these heavy handed functions in the neighborhood of "real" C++ code with nontrivial destructors.
Enter exceptions. In modern C++, you don't report a nasty error by calling longjmp to roll back to a point established by setjmp. Instead, you evaluate a throw expression to roll back to a catch clause. The throw expression names an object whose type matches that expected by the catch clause. You can even examine the object to get a hint about what caused the exception. It's kind of like calling a function with a single argument, only you're not always sure where the function actually resides. And the function is further up the call stack instead of one level further down.
Most important of all, none of those destructors get skipped in the process of rolling back the call stack. If that sounds like a nightmare in bookkeeping to you, you're absolutely right. Somehow, the executing code must at all times have a clear notion of what destructors are pending before control can pass out of a given block or a given function. It must also deal with exceptions thrown in constructors and destructors, and exceptions thrown while processing earlier exceptions. Kids, don't try this at home.
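A small experiment shows the destructors running as the stack unwinds. The sketch below assumes a compiler that accepts C++11 brace initialization. The types Tracer and BadInput are invented for the illustration and are not part of any library.

```cpp
#include <string>
#include <vector>

// Records its own construction and destruction in a shared log.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    Tracer(std::vector<std::string>& l, const std::string& n) : log(l), name(n) {
        log.push_back("ctor " + name);
    }
    ~Tracer() { log.push_back("dtor " + name); }
};

// Hypothetical exception type carrying a hint about the cause.
struct BadInput {
    std::string what;
};

void inner(std::vector<std::string>& log) {
    Tracer t(log, "inner");
    throw BadInput{"bad input record"};   // control leaves here, t is still destroyed
}

void outer(std::vector<std::string>& log) {
    Tracer t(log, "outer");
    inner(log);
}

// Runs the experiment and returns the sequence of events.
std::vector<std::string> run() {
    std::vector<std::string> log;
    try {
        outer(log);
    } catch (const BadInput& e) {
        log.push_back("caught " + e.what);
    }
    return log;
}
```

The log returned by run records both destructors, innermost first, before the entry made by the catch clause.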

The Header <exception>
So this fancier machinery is now in the draft C++ standard. All that remains is to decide what to do with it. You can get a few hints from other programming languages. Ada, to name just one, has had exceptions for over a decade. Their very presence changes how you design certain interfaces and how you structure programs that must respond to nasty errors. The one thing we know for sure is that you must develop a style for using exceptions that fits the language as a whole, then use it consistently.
That has serious implications for the Standard C++ library. Traditionally, of course, the library has thrown or caught no exceptions. (There weren't any such critters to throw!) But it's a poor advertisement for this new feature if the library itself makes no use of exceptions. Put more strongly, the Standard C++ library has a moral obligation to set a good example. Many programmers will use only the exceptions defined in the library. Others will model their own on what they see used by the library. Thus, the library is duty bound to set a good example for the children.
Most decisions about the Standard C++ library are made within the Library Working Group (or LWG) of X3J16/WG21, the joint ANSI/ISO committee developing the draft C++ standard. Early on, the LWG committed to using exceptions as part of the error reporting machinery of the library. Not everyone is happy with this decision. Some people object because they don't want to incur the inevitable overheads of exception handling in every program that touches the library, which is essentially every program you write in C++. Others object because of the putative difficulties of validating a program that throws exceptions. Some projects require that the software vendors assert that exceptions can never be thrown. So the decision to use exceptions in the library was not lightly made.
Only recently has the LWG agreed on an overall structure. What I present here was approved by the joint committee as recently as November 1993. But aside from a few name changes and other small tweaks, it is likely to survive reasonably unchanged.
All the relevant declarations and class definitions for exception handling can be had by including the header <exception>. (Note the absence of a trailing .h, the hallmark of new C++ headers.) Within this header you can find the definition for class xmsg, the mother of all exceptions thrown by the library. (Yes, the name is horrid — it's very likely to be changed to exception in the near future.) Listing 1 shows at least one way that this class can be spelled out.
The basic idea is that each exception has three null-terminated message strings associated with it:
what — telling what caused the exception
where — telling where it occurred
why — telling why it occurred
Some exceptions may use only the first one or two messages, leaving the later pointers null.
The next important notion is that an exception should have private copies (on the heap, presumably) of all these message strings. A typical exception constructor allocates storage on the heap, copies the strings, and sets the flag alloced. That way, the destructor knows to free the storage once the exception has been processed.
But then why the flag if this is the preferred mode of operation? Well, one important exception derived from this base class is xalloc. It is thrown by operator new when it fails to allocate storage. (More on this in a later installment.) The last thing you want to do is try to copy strings onto the heap when you have to report that there's no more room on the heap! Thus, the special protected constructor that lets you specify no copying of strings. Of course, anyone using this constructor had better provide strings with a sufficiently long lifetime, or trouble ensues. That's why this form is discouraged, except where absolutely necessary.
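For readers without the listing at hand, here is a rough sketch of a class along these lines. It is not the draft's definition. The member names, the accessor functions, and the use of a nothrow new are all assumptions made for illustration, and xalloc is derived directly from xmsg only to keep the example short.

```cpp
#include <cstring>
#include <new>

class xmsg {
public:
    // Usual constructor: keeps private heap copies of the messages.
    xmsg(const char* what = 0, const char* where = 0, const char* why = 0)
        : what_(dup(what)), where_(dup(where)), why_(dup(why)), alloced_(true) {}

    // A thrown object gets copied, so copies need their own strings too.
    xmsg(const xmsg& rhs)
        : what_(rhs.alloced_ ? dup(rhs.what_) : rhs.what_),
          where_(rhs.alloced_ ? dup(rhs.where_) : rhs.where_),
          why_(rhs.alloced_ ? dup(rhs.why_) : rhs.why_),
          alloced_(rhs.alloced_) {}

    virtual ~xmsg() {
        if (alloced_) {            // free only what this object allocated
            delete[] what_;
            delete[] where_;
            delete[] why_;
        }
    }

    const char* what() const { return what_; }    // may be null
    const char* where() const { return where_; }  // may be null
    const char* why() const { return why_; }      // may be null

protected:
    // Low-level constructor: with copy == false the pointers are stored
    // as is, so the caller must supply strings that outlive the object.
    xmsg(const char* what, const char* where, const char* why, bool copy)
        : what_(copy ? dup(what) : what),
          where_(copy ? dup(where) : where),
          why_(copy ? dup(why) : why),
          alloced_(copy) {}

private:
    xmsg& operator=(const xmsg&);   // assignment left out of this sketch

    // Returns a heap copy of s, or null if s is null or the heap is full.
    static const char* dup(const char* s) {
        if (s == 0)
            return 0;
        char* p = new (std::nothrow) char[std::strlen(s) + 1];
        if (p != 0)
            std::strcpy(p, s);
        return p;
    }

    const char* what_;
    const char* where_;
    const char* why_;
    bool alloced_;
};

// Reports heap exhaustion without touching the heap: the string literal
// has static lifetime, so no copy is needed.
class xalloc : public xmsg {
public:
    xalloc() : xmsg("out of memory", 0, 0, false) {}
};
```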

Throwing an Exception
You'd think then that all you have to do to throw an exception is write something like:

throw xmsg("bad input record");
You can certainly do so, but that is not the preferred method. Instead, for any exception object ex, you're encouraged to call ex.raise(). That function does three things:

First it calls (*ex.handler)(*this) to call the raise handler. The default behavior is to do nothing, but you can hijack any thrown exception by providing your own handler with a call to xmsg::set_raise_handler.

Then it calls the virtual member function do_raise. That permits you to hijack thrown exceptions only for some class derived from xmsg.

Finally it evaluates the expression throw *this.
The first escape hatch is for embedded systems and those projects I indicated above that abhor all exceptions. You can reboot, or longjmp to some recovery point (and to heck with the skipped destructors).
The second is best illustrated by the derived class reraise, shown in Listing 2. It overrides the definition of do_raise in a special way. The override evaluates the expression throw, which "rethrows" an exception currently being processed. It turns out that iostreams has an occasional need to pass an exception up the line to another handler. I invented reraise as a way to do this, but it looks to be generally useful.
The third thing is to do what exception classes were invented to do in the first place. By having all library exceptions be thrown through this machinery, however, the class meets the needs of several constituencies.
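The three steps of raise can be sketched as follows. This is a stripped-down stand-in, not the class from the listings: the message handling is reduced to a single pointer, a null handler stands for "do nothing," and count_raise, raise_demo, and reraise_demo are invented for the example.

```cpp
class xmsg {
public:
    typedef void (*raise_handler)(xmsg&);

    explicit xmsg(const char* what = "") : what_(what) {}
    virtual ~xmsg() {}
    const char* what() const { return what_; }

    // Installs a new raise handler and returns the previous one.
    static raise_handler set_raise_handler(raise_handler h) {
        raise_handler old = handler;
        handler = h;
        return old;
    }

    void raise() {
        if (handler != 0)
            (*handler)(*this);   // step 1: global hook, may never return
        do_raise();              // step 2: per-class hook
        throw *this;             // step 3: throws a copy of static type xmsg
    }

protected:
    virtual void do_raise() {}   // default: nothing special

private:
    static raise_handler handler;
    const char* what_;
};

xmsg::raise_handler xmsg::handler = 0;

// Overrides step 2 to pass along the exception currently being handled.
class reraise : public xmsg {
protected:
    virtual void do_raise() { throw; }
};

int raise_count = 0;
void count_raise(xmsg&) { ++raise_count; }   // a harmless raise handler

// True if the handler ran once and the exception still arrived.
bool raise_demo() {
    raise_count = 0;
    xmsg::raise_handler old = xmsg::set_raise_handler(&count_raise);
    bool caught = false;
    try {
        xmsg("bad input record").raise();
    } catch (const xmsg&) {
        caught = true;
    }
    xmsg::set_raise_handler(old);
    return caught && raise_count == 1;
}

// Returns the value that reraise hands up to the outer handler.
int reraise_demo() {
    try {
        try {
            throw 42;
        } catch (...) {
            reraise().raise();   // must be called while an exception is active
        }
    } catch (int n) {
        return n;
    }
    return 0;
}
```

One design point worth noticing: because throw *this in the base class copies only the xmsg part of the object, a derived class that wants to be caught by its own type would have to throw from its own do_raise override.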

Exception Hierarchy
There's still more to library exceptions. Figure 1 shows a whole hierarchy of classes derived from xmsg. Some are defined in other headers, but most are to be found in <exception>. The basic partitioning is into two groups:

logic errors, derived from class xlogic, which report errors that you can, in principle, detect and avoid when writing the program

runtime errors, derived from class xruntime, which report errors that you detect only when you run the program
The former category is for those "can't happen" events that are often too hard to really prevent, at least until after some thorough debugging. The latter is for surprises that happen during program execution, such as running out of heap or encountering bad input from a file.
Listing 3 shows the class xlogic and Listing 4 shows the class xruntime. Note the slight asymmetry. The latter class has an extra low-level constructor, as I described earlier, which is supposed to be used only by xalloc. Two more classes are derived from these to report mathematical errors. Listing 5 shows the class xdomain and Listing 6 shows the class xrange. A domain error occurs when you call a mathematical function with arguments for which its behavior is not defined (such as taking the real square root of -5). A range error occurs when the result of a mathematical function is defined in principle but not representable in practice (such as raising e to the 10,000th power).
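The payoff of such a hierarchy is that a handler can catch a whole category at once. The sketch below uses bare-bones stand-ins for the five classes, with xdomain placed under xlogic and xrange under xruntime as an assumption of this example. The functions checked_sqrt, checked_exp, and classify are invented for the illustration.

```cpp
#include <cmath>
#include <string>

// Minimal stand-ins for the classes named in the text.
struct xmsg {
    explicit xmsg(const char* w) : what(w) {}
    virtual ~xmsg() {}
    const char* what;
};
struct xlogic   : xmsg     { explicit xlogic(const char* w)   : xmsg(w) {} };
struct xruntime : xmsg     { explicit xruntime(const char* w) : xmsg(w) {} };
struct xdomain  : xlogic   { explicit xdomain(const char* w)  : xlogic(w) {} };
struct xrange   : xruntime { explicit xrange(const char* w)   : xruntime(w) {} };

// Domain error: the real square root is not defined for negative values.
double checked_sqrt(double x) {
    if (x < 0.0)
        throw xdomain("sqrt: negative argument");
    return std::sqrt(x);
}

// Range error: the result exists in principle but cannot be represented.
double checked_exp(double x) {
    double y = std::exp(x);
    if (y == HUGE_VAL)
        throw xrange("exp: result too large");
    return y;
}

// Calls f(x) and names the category of any exception it throws.
std::string classify(double (*f)(double), double x) {
    try {
        f(x);
    } catch (const xlogic&) {
        return "logic error";      // also catches xdomain
    } catch (const xruntime&) {
        return "runtime error";    // also catches xrange
    }
    return "ok";
}
```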
Finally, Listing 7 shows the class badcast. It is the one exception thrown implicitly by statements generated by the compiler. C++ now includes dynamic casts which, in certain contexts, yield a null pointer if the cast is not permissible at runtime. If the context also requires that a reference be initialized, the executable code throws a badcast exception instead. (A reference can never be to a nonexistent object.)
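The two behaviors of a dynamic cast can be seen side by side in the sketch below. It assumes a compiler that spells the exception std::bad_cast and declares it in <typeinfo>, rather than the badcast of the draft described here. The Shape classes are invented for the example.

```cpp
#include <typeinfo>

struct Shape  { virtual ~Shape() {} };
struct Circle : Shape { double radius; Circle() : radius(1.0) {} };
struct Square : Shape {};

// Pointer form: a failed cast quietly yields a null pointer.
bool is_circle(Shape* s) {
    return dynamic_cast<Circle*>(s) != 0;
}

// Reference form: there is no null reference, so a failed cast throws.
double radius_of(Shape& s) {
    return dynamic_cast<Circle&>(s).radius;
}

// True if asking for the radius of s throws the bad-cast exception.
bool radius_throws(Shape& s) {
    try {
        radius_of(s);
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}
```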

Terminate and Unexpected Handlers
Exception processing code can also call two additional functions:

terminate, when exception handling must be abandoned for any of several reasons

unexpected, when a function throws an exception that is not listed in its (optional) exception specification.
A terminate condition occurs:

when the exception handling mechanism cannot find a handler for a thrown exception

when the exception handling mechanism finds the execution stack corrupted

when a destructor called during execution stack unwinding caused by an exception tries to transfer control to a calling function by throwing an exception
The default behavior of terminate is to call abort, while the default behavior of unexpected is to call terminate. As usual in C++, however, you can provide your own flavors of these functions. A call to set_terminate lets you specify a pointer to a new function that must still somehow terminate the program. A call to set_unexpected lets you specify a pointer to a new function that can itself throw (or rethrow) an exception or terminate program execution.
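Installing a terminate handler takes only a few lines. The sketch below assumes a compiler that places set_terminate and terminate_handler in namespace std. The function names log_and_abort and install_terminate_handler are invented here. Only the terminate side is shown, since set_unexpected follows the same pattern.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>

// Replacement terminate handler. It may log or clean up, but it must
// not return to its caller, so it ends by calling abort.
void log_and_abort() {
    std::fputs("fatal: exception handling abandoned\n", stderr);
    std::abort();
}

// Installs the handler and returns the one it replaced, so that a
// caller can restore the earlier behavior later.
std::terminate_handler install_terminate_handler() {
    return std::set_terminate(&log_and_abort);
}
```

Keeping the returned pointer is a small courtesy to the rest of the program: a library that installs its own handler can put the old one back when it is done.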

Conclusion
As you can see, the facilities provided by <exception> give you considerable latitude in reporting and handling exceptions. The C++ library uses this machinery exclusively, so you can control what the library does with exceptions. You can even prevent the library from actually evaluating any throw expressions.
Given our limited experience to date with using exceptions in C++, I'm fairly confident that this is mostly a good design. Time, of course, will tell us better how well we've done.