P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is Standard C, which he co-authored with Jim Brodie. You can reach him at pjp@plauger.uunet.
Introduction
If I had to identify one part of the C Standard that is uniformly disliked, I would not have to look far. Nobody likes errno or the machinery that it implies. I can't recall anybody defending this approach to error reporting, not in two dozen or more meetings of X3J11. Several alternatives were proposed over the years. At least one faction favored simply discarding errno. Yet it endures.
More than that, the C Standard has even added to the existing machinery. The standard header <errno.h> is an invention of the committee. (We wanted to have every function and data object in the library declared in some standard header. We gave errno its own standard header mostly to ghettoize it.) We even added some words in the hope of clarifying a notoriously murky corner of C.
A common topic at meetings of the Numerical C Extensions Group is how to tame errno. Or how to get rid of it. The fact that no clear answer has sprung forth to date should tell you something. There are no easy answers when it comes to reporting and handling errors.
What the Standard Says
The standard header <errno.h> can be one of the easiest parts of the Standard C library to implement. Here is what the Standard has to say about this header:
4.1.3 Errors <errno.h>
The header <errno.h> defines several macros, all relating to the reporting of error conditions.
The macros are
    EDOM
    ERANGE
which expand to integral constant expressions with distinct nonzero values, suitable for use in #if preprocessing directives; and
    errno
which expands to a modifiable lvalue[92] that has type int, the value of which is set to a positive error number by several library functions. It is unspecified whether errno is a macro or an identifier declared with external linkage. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.
The value of errno is zero at program startup, but is never set to zero by any library function.[93] The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in the Standard.
Additional macro definitions, beginning with E and a digit or E and an upper-case letter,[94] may also be specified by the implementation.
Footnotes:
92. The macro errno need not be the identifier of an object. It might expand to a modifiable lvalue resulting from a function call (for example, *errno()).
93. Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno's value is still zero just before the return.
94. See "future library directions" (4.13.1).
There are a few additional words on errno in the description of <math.h>:
4.5.1 Treatment of Error Conditions
The behavior of each of these functions is defined for all representable values of its input arguments. Each function shall execute as if it were a single operation, without generating any externally visible exceptions.
For all functions, a domain error occurs if an input argument is outside the domain over which the mathematical function is defined. The description of each function lists any required domain errors; an implementation may define additional domain errors, provided that such errors are consistent with the mathematical definition of the function.[105] On a domain error, the function returns an implementation-defined value; the value of the macro EDOM is stored in errno.
Similarly, a range error occurs if the result of the function cannot be represented as a double value. If the result overflows (the magnitude of the result is so large that it cannot be represented in an object of the specified type), the function returns the value of the macro HUGE_VAL, with the same sign (except for the tan function) as the correct value of the function; the value of the macro ERANGE is stored in errno. If the result underflows (the magnitude of the result is so small that it cannot be represented in an object of the specified type), the function returns zero; whether the integer expression errno acquires the value of the macro ERANGE is implementation-defined.
Footnote:
105. In an implementation that supports infinities, this allows infinity as an argument to be a domain error if the mathematical domain of the function does not include infinity.
Implementing <errno.h>
This is not a very demanding specification. You can write the file <errno.h> simply as:
    /* errno.h standard header */
    #ifndef _ERRNO
    #define _ERRNO
    #define EDOM 1
    #define ERANGE 2
    extern int errno;
    #endif
In some library file, you must add a definition for the data object:
    int errno = 0;
Your only other obligation is to store the values EDOM or ERANGE in errno at the appropriate places within the math functions. What could be simpler?
Here is a case where the overt implementation is the easiest part of the job. errno causes trouble in two subtler ways: sometimes its specification is too vague and sometimes it is too explicit. To see why takes some explaining. That is the major freight of this column.
History
Let's begin at the beginning. C was born under UNIX. That operating system set new standards for clarity and simplicity. The interface between user program and operating system kernel is particularly clean. You specify a system call number and a handful of operands. The 40-odd system calls of early UNIX have doubled in number. That is still on the spare side, compared to systems of comparable power. The operands are almost always scalars: integers or pointers. They are equally spare.
Each implementation of UNIX adopts a simple method for indicating erroneous system calls. Writing in assembly language, you typically test the carry indicator in the condition code. If the carry indicator is clear, the system call was successful. Any answers you requested are returned in machine registers or in a structure whose address you specify. If the carry indicator is set, the system call was in error. One of the machine registers contains a small positive number to indicate the nature of the error.
That scheme is great for assembly language. It is less great for programs you write in C. You can write a library of C-callable functions, one for each distinct system call. You'd like each function return value to be the answer you request when making that particular system call. You can do so, but that makes it difficult to report errors in a way that is easy to test. Alternatively, you can have each function return as its value a success or failure indication. Do that and you have no easy way to get at the answer you want from a successful system call.
One trick that mostly works is to do a bit of both. For a typical system call, you can define an error return value that is distinguishable from any valid answer. A null pointer is an obvious case in point. The value -1 can also be set aside in many cases, with no serious conflict with valid answers. Each UNIX system call usually has a return value that indicates that some form of error has occurred.
What the C-callable functions do not do is report exactly which error occurred. That strains the trick a bit too much. All you can tell from the return value is whether an error occurred. You have to look elsewhere to get details.
The "elsewhere" that early UNIX programmers adopted was a data object with external linkage. Any system call that fails stores the error code from the kernel in an int variable called errno. It then returns -1, or some such silly value, to indicate the error. Most of the time, the program doesn't care about details. An error is an error is an error. But in those few cases where the program does care, it knows how to get additional information. It looks in errno to see the last error code stored there.
Naturally, you'd better look before it's too late. Make another system call that fails and the error code gets over-written. You must also look at errno only after a system call that fails. A successful call doesn't clear the value stored there. It's not a great piece of machinery, but it does work.
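The convention is easy to sketch in C. What follows is an illustration under stated assumptions, not code from any real UNIX library: the structure kernel_result, the function fake_open_trap, the wrapper open_sketch, and the error code FAKE_ENOENT are all hypothetical, and the "kernel" is simulated in ordinary C.

```c
#include <errno.h>

/* Hypothetical stand-in for what the kernel hands back after a trap:
 * a carry flag plus one register. Not a real UNIX interface. */
struct kernel_result {
    int carry;  /* nonzero means the call failed */
    int reg;    /* answer on success, small positive error code on failure */
};

#define FAKE_ENOENT 2   /* illustrative error code, assumed for this sketch */

/* Simulated trap: "opens" only the name "ok" and hands back descriptor 3. */
static struct kernel_result fake_open_trap(const char *name)
{
    struct kernel_result r;

    if (name[0] == 'o' && name[1] == 'k' && name[2] == '\0') {
        r.carry = 0;
        r.reg = 3;
    } else {
        r.carry = 1;
        r.reg = FAKE_ENOENT;
    }
    return r;
}

/* C-callable wrapper: the return value carries the answer or -1,
 * and the details of a failure go to errno. */
int open_sketch(const char *name)
{
    struct kernel_result r = fake_open_trap(name);

    if (r.carry) {
        errno = r.reg;  /* remember which error occurred */
        return -1;      /* in-band signal that some error occurred */
    }
    return r.reg;       /* success: errno is left untouched */
}
```

Call open_sketch("missing") and you get -1 with FAKE_ENOENT in errno. Follow it with open_sketch("ok") and you get 3, but errno still holds the stale code from the earlier failure.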
Overworked Machinery
The first problem with errno is that it was too handy. People started finding additional uses for it. It grew from a dirty little trick for augmenting UNIX system calls to a C institution. And that's when it got overworked.
System calls aren't the only rich source of errors. Another well explored vein is the portion of the library that computes the common math functions. This portion is largely declared in the standard header <math.h>.
Some functions yield values too large to represent for certain arguments (such as exp(1000.0)). Some yield values too small to represent for certain arguments (such as exp(-1000.0)). Some are simply undefined for certain argument values (such as sqrt(-1.0)). Some are defined, but of suspect worth for certain argument values (such as sin(1e30)).
You could introduce one or more error codes for each function that can run into trouble. Following the naming convention for UNIX error codes, you could report ESQRT for the square root of a negative number. That is both open-ended and messy.
Fortunately, math errors fall into just a few categories:
- An overflow occurs when a result is too large in magnitude to represent as a floating point value of the required type.
- An underflow occurs when a result is too small in magnitude to represent as a floating point value of the required type.
- A significance loss occurs when a result has nowhere near the number of significant digits indicated by its type.
- A domain error occurs when a result is undefined for a given argument value.
Several different system calls in UNIX can yield the same error codes. Similarly, several different math functions can yield one or more of these errors. (The errors can even occur for nearly all of the arithmetic operators, with floating point operands.) In fact, you can do an adequate job of covering all the math errors with just two error codes:
- EDOM is reported on a domain error.
- ERANGE is reported on an overflow or an underflow.
Loss of significance is a chancy error to report. One programmer's notion of a serious loss may be a matter of utter indifference to another programmer. Indeed, some very stable algorithms are insensitive to serious loss of significance in portions of a calculation. Hence, it is arguable whether significance loss should even be reported by the library.
You can see what's coming. Errors can occur in the math library much as they can occur on system calls. You need some way to report math library errors. So why invent yet another mechanism when you've already got one handy? An early, and natural, evolution of the C library was to report math errors by writing EDOM and ERANGE in errno. That practice has been blessed by inclusion in the C Standard.
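Here is a minimal sketch of a math-style function that reports both kinds of error through errno. The function factorial_sketch is hypothetical and is not part of the Standard C library. The value it returns on a domain error is an arbitrary choice, since the Standard leaves that value implementation-defined.

```c
#include <errno.h>
#include <float.h>
#include <math.h>   /* for HUGE_VAL only; no math function is called */

/* Hypothetical example function: n! computed in floating point. */
double factorial_sketch(int n)
{
    double result = 1.0;
    int i;

    if (n < 0) {
        errno = EDOM;       /* undefined for negative arguments */
        return 0.0;         /* arbitrary choice for this sketch */
    }
    for (i = 2; i <= n; ++i) {
        if (result > DBL_MAX / i) {
            errno = ERANGE; /* the next product would overflow a double */
            return HUGE_VAL;
        }
        result *= i;
    }
    return result;
}
```

Like the functions in the real library, this one never stores zero in errno. A caller who wants to test for an error has to clear errno before the call.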
The Problems
I said earlier that the specification of errno is both too vague and too explicit. Now I can elaborate.
The vagueness comes from the historical use of errno to register system call errors. That practice has been implicitly endorsed by the C Standard. (See the quote from section 4.1.3 above.) Any library function can store nonzero values in errno. The stores can occur because the function makes one or more system calls that fail. Or they can occur because some function in the library chooses to use this reporting channel.
All you can count on is the behavior explicitly called out in the Standard. Call sqrt(-1.0) and you can be sure that errno contains the value EDOM. Call fabs(1.0) and all bets are off, believe it or not. No library function will store a zero in errno. Anything else is fair game.
Footnote 93 of the Standard (shown above) tells you the only safe coding style for using errno. Set it to zero right before a library function call, then test it before the next library call. It's rather a noisy channel.
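That coding style can be sketched as follows. The helper parse_long is a hypothetical name chosen for this illustration. It relies on strtol, whose use of errno (ERANGE on overflow) is documented in the Standard, so a nonzero value after the call does indicate an error.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 and stores the value on success,
 * or returns the error code that strtol left in errno. */
int parse_long(const char *text, long *out)
{
    long value;
    int saved;

    errno = 0;                       /* clear any stale error code */
    value = strtol(text, NULL, 10);  /* the one call being checked */
    saved = errno;                   /* look before any other library call */
    if (saved != 0)
        return saved;
    *out = value;
    return 0;
}
```

Copying errno into a local variable right away is the important habit. Any later library call, even one that succeeds, is allowed to disturb the value.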
The overspecification mostly affects the math functions. By spelling out when errno must be set, the Standard interferes with important optimizations. In particular, the Standard makes it hard for compilers to use the newest floating point coprocessors to advantage.
Chips like the Intel 80X87 family and the Motorola 68881 have some pretty fancy instructions. Some can compute part or all of a math function with inline code. A smart compiler can dramatically speed up calculations by using these instructions. If nothing else, the compiler can avoid the function call and return overhead for a math function.
The problem comes when a mathematical exception occurs. These math coprocessors run autonomously, and they want to keep moving. They want to record an error by carrying along a special code, called NaN (for "Not a Number") or INF (for "infinity"). Later operations preserve these special codes. You can test at the end of a computation whether anything went wrong along the way.
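A rough sketch of that style of error handling, assuming IEEE 754 arithmetic (where a NaN compares unequal to itself and 0.0/0.0 produces a NaN). The names is_nan and sum_of_ratios are hypothetical. Note that nothing inside the loop tests for errors or touches errno.

```c
/* Assumes IEEE 754 arithmetic: a NaN is the only value unequal to itself. */
int is_nan(double v)
{
    return v != v;
}

/* Hypothetical computation: sum of element-wise quotients.
 * No error checks interrupt the loop. */
double sum_of_ratios(const double *num, const double *den, int n)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < n; ++i)
        sum += num[i] / den[i]; /* 0.0/0.0 yields a NaN, which then sticks */
    return sum;
}
```

The caller applies is_nan once to the final sum to learn whether anything went wrong along the way.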
At best, these coprocessors record an error in their own condition code. The main processor has to take pains to copy the coprocessor condition code into its own to test whether an error occurred. That stops a pipelined coprocessor in full career. If a C program must set errno on every math exception, it can run a math coprocessor at only a fraction of its potential speed.
Footnote 92 suggests one trick that can help. The Standard no longer requires that errno be an actual data object. It can now be a macro that expands to a modifiable lvalue. That gives the implementor considerable latitude. In particular, the errno macro can expand to an expression such as *_erfun(). Every time the program wants to check for errors, a function gets called to tell the program where to look.
That has two implications. First, the implementation can be lazy about recording errors. It can wait until someone tries to peek at errno before it stores the latest error code. That might give the implementation sufficient latitude to leave math coprocessors alone most of the time. (The translator may be hard pressed to exploit this opportunity, however.)
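A minimal sketch of the lazy approach follows. It cannot redefine errno itself, since a program that does so has undefined behavior, so it uses the stand-in name my_errno. Every name here (my_errno, my_errno_location, stored_error, pending_domain, simulate_domain_fault) is hypothetical, and the coprocessor status is simulated with a plain flag.

```c
#include <errno.h>  /* for the value EDOM only */

static int stored_error = 0;    /* the value the program eventually sees */
static int pending_domain = 0;  /* stands in for a coprocessor status bit */

/* Called on every use of my_errno: fold in any pending status,
 * then tell the program where to look. */
int *my_errno_location(void)
{
    if (pending_domain) {
        stored_error = EDOM;
        pending_domain = 0;
    }
    return &stored_error;
}

/* Expands to a modifiable lvalue, so it can be read or assigned. */
#define my_errno (*my_errno_location())

/* Simulates a math exception: cheap to record, no store to my_errno yet. */
void simulate_domain_fault(void)
{
    pending_domain = 1;
}
```

An assignment such as my_errno = 0 still works, because the macro expands to a modifiable lvalue. The cost of examining the status moves from every math operation to the much rarer places where the program actually looks at the error code.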
The second implication is that errno can move about. The function can return a different address every time it is called. That can be a tremendous help in implementing shared libraries. Static data storage is a real nuisance in a shared library, as I have mentioned in the past. Static data storage that the user program can alter is even worse. errno is the only such creature in the Standard C library. That's why I was one of the strongest proponents of changing it from a data object to a macro that expands to a modifiable lvalue.
Conclusion
Even as a macro, errno is still an annoying piece of machinery. Any program can contain the sequence:
    y = sqrt(x);
    if (errno == EDOM)
        .....
The need to support such error tests severely constrains what an implementation can do with sqrt and its ilk. Since any library function can alter errno, programmers are also ill served. Here we have a mechanism that can be hard on the implementor and hard on the user.
Perhaps with a bit more thought, X3J11 could have cleaned up this mess. I still don't know how. There are too many different floating point formats, even with the growing popularity of the IEEE 754 standard. And there are too many different ways that errors get reported in current systems. I find it sad that we couldn't do better than errno. But I'm not surprised.
Mea Culpa
You'd think I'd know better by now. I have taken others to task for publishing unchecked code. No matter how carefully you desk check, you're bound to miss things. There is no substitute for unleashing a compiler and a test program or two on whatever you write.
Nevertheless, I managed to let some code appear in print before I tested it. The code I showed for <assert.h> contains a couple of errors. It appeared in the August 1990 CUJ under the title "With Gun and Reel."
Here is the correct version of <assert.h>:
    /* assert.h standard header
     * copyright (c) 1991 by P.J. Plauger
     */
    #undef assert /* remove existing definition */
    #ifdef NDEBUG
    #define assert(test) ((void)0)
    #else /* NDEBUG not defined */
    int _Assert(char *);
    /* macros */
    #define _STR(x) _VAL(x)
    #define _VAL(x) #x
    #define assert(test) (void)((test) || _Assert( \
        __FILE__ ":" _STR(__LINE__) " " #test))
    #endif
One error was in the machinery for turning the macro __LINE__ into a string. My naive approach produced the string "__LINE__" instead of the current line number as desired. (Dan Saks tells me, however, that at least one compiler inadvertently got the right answer.)
The other error lay in declaring _Assert to be a function returning void. That's an accurate declaration, but inappropriate if the function is to be called as the right operand of the || operator. I could have replaced the operator with a conditional expression, as in:
    #define assert(test) ((test) ? (void)0 \
        : _Assert(__FILE__ ":" _STR(__LINE__) " " #test))
I chose, instead, to declare _Assert as a function returning int, even though it never returns. Here is the (slightly) modified version of the function:
    /* _Assert function
     * copyright (c) 1991 by P.J. Plauger
     */
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* print assertion message and abort */
    int _Assert(char *mesg)
    {
        fputs(mesg, stderr);
        fputs(" - assertion failed\n", stderr);
        abort();
        return (0); /* should never be reached */
    }
Sorry about that.