Columns


Standard C

The Header <limits.h>

P.J. Plauger


P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is Standard C, which he co-authored with Jim Brodie. You can reach him at pjp@plauger.uunet.

History

One of the first attempts at standardizing any part of the C programming language began in 1980. It was begun by an organization then called /usr/group, now called UniForum. As the first organization founded to promote UNIX commercially, /usr/group had a stake in vendor-independent standards. Technical developments couldn't simply go off in all directions, nor could they be dictated solely by AT&T. Either way, it was hard to maintain an open marketplace.

So /usr/group began the process of defining what it means to call a system UNIX or UNIX-like. They formed a standards committee that focused, at least initially, on the C programming environment. That's where nearly all applications were written, anyway. The goal was to describe a set of C functions that you could expect to find in any UNIX-compatible system. The descriptions, of course, had to be independent of any particular architecture.

A chunk of what /usr/group described was the set of C-callable functions that let you access UNIX system services. An even larger chunk, however, was the set of functions common to all C environments. That larger chunk served as the basis for the language portion of the C Standard. Since Kernighan and Ritchie chose not to discuss the library except in passing, the /usr/group standard was of immense help to X3J11. It saved us many months, possibly even years, of additional labor.

As an aside, the /usr/group effort served another very useful purpose. IEEE committee 1003 was formed to turn this industry product into an official standard. The IEEE group turned over responsibility for the system-independent functions to X3J11 and focused on the UNIX-specific portion. You know the resultant standard today as IEEE 1003.1, a.k.a. POSIX.

Part of building an architecture-independent description is to recognize what changes across machines. You want to avoid any unnecessary differences, to be sure. The rest you want to identify and to circumscribe. Some critical value might change when you move an application program to another flavor of UNIX. So you give it a name. You lay down rules for testing the named value in a program. And you define the limits that the value can range between.

A long-standing tradition in C is that scalar data types are represented in ways natural to each machine. The fundamental type int is particularly elastic. It wants to be a size that supports efficient computation, at least within broad limits. That may be great for efficiency, but it's a real nuisance for portability.

/usr/group invented the standard header <limits.h> to capture many important properties that can change across architectures. It so happens that this header deals exclusively with the ranges of values of integer types. That was all that /usr/group chose to address. When X3J11 decided to add similar data on the floating types, we elected not to overwhelm the existing contents of <limits.h>. Instead, we added the standard header <float.h>. Perhaps we should have renamed the existing standard header <integer.h>, but we didn't. Tidiness yielded to historical continuity.

Using <limits.h>

You can use <limits.h> one of two ways. The simpler way assures that you do not produce a silly program. Let's say, for example, that you want to represent some signed data that ranges in value between VAL_MIN and VAL_MAX. You can keep the program from compiling incorrectly by including in the text:

#include <assert.h>
#include <limits.h>
#if VAL_MIN < INT_MIN \
   || INT_MAX < VAL_MAX
#error values out of range
#endif
You can safely store the data in variables declared with type int if the error directive is skipped.

A more elaborate way to use <limits.h> is to control the choice of types in a program. You can alter the example above to read:

#include <assert.h>
#include <limits.h>
#if VAL_MIN < LONG_MIN \
   || LONG_MAX < VAL_MAX
typedef double Val_t;
#elif VAL_MIN < INT_MIN \
   || INT_MAX < VAL_MAX
typedef long Val_t;
#else
typedef int Val_t;
#endif
You then declare all variables that must hold this range of values as having type Val_t. The program will use the computationally most efficient type for a given target environment.

The presence of <limits.h> is also designed to discourage an old programming trick that is extremely non-portable. Some programs attempted to sniff out the properties of the target environment by writing tricky if directives, as in:

#if (-1 + 0x0) >> 1 > 0x7fff
/* must have ints greater than 16 bits */
.....
#endif
This code assumes that whatever arithmetic the preprocessor performs is the same as what occurs in the execution environment. Those of us who deal heavily with cross compilers know well that the translation environment can differ markedly from the execution environment. For tricks like this one to work, the C Standard would have to require that the translator mimic the execution environment very closely. And compiler families with a common front end would have to adapt translation-time arithmetic to suit the target.

X3J11 discussed such requirements at length. In the end, we decided that the preprocessor was not the creature to burden with such stringent requirements. The translator must closely model the execution environment in many ways, to be sure. It must compute constant expressions — the things you use to initialize static data objects, for example — to at least as wide a range and precision as the target. But it can largely define its own internal environment for the arithmetic within if and elif directives.

So to test the execution environment you can't do experiments on the preprocessor. You must include <limits.h> and test the values of the macros it provides.

What The Standard Says

Here is what the Standard has to say about <limits.h>. The library portion contains only a brief reference:

4.1.4 Limits <float.h> and <limits.h>

The headers <float.h> and <limits.h> define several macros that expand to various limits and parameters.

The macros, their meanings, and the constraints (or restrictions) on their values are listed in 2.2.4.2. [end of extract]

That's it. For the meat, you have to go back to the environment section:

2.2.4.2 Numerical Limits

A conforming implementation shall document all the limits specified in this section, which shall be specified in the headers <limits.h> and <float.h>.

2.2.4.2.1 Sizes of Integral Types

<limits.h>

The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Moreover, except for CHAR_BIT and MB_LEN_MAX, the following shall be replaced by expressions that have the same type as would an expression that is an object of the corresponding type converted according to the integral promotions. Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.

If the value of an object of type char is treated as a signed integer when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX. Otherwise, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX.9

Footnote

9. See 3.1.2.5 [end of extract]

The only significant change to this header since the days of /usr/group is the addition of MB_LEN_MAX. I will discuss multibyte characters at length in a future column.

Several people reviewing the Standard complained that names such as USHRT_MAX are barbaric. It is silly to omit a single vowel in the interest of terseness, particularly in this age of 31-character (or longer) names. I can only plead, on behalf of X3J11, the same excuse we gave for not changing the name of the header itself. We couldn't justify abandoning prior art just to be a bit tidier. Besides, fixing such barbarisms only in this place is like eating one peanut.

Implementing <limits.h>

The only code you have to provide for this header is the header itself. All the macros defined in <limits.h> are testable within an if directive and are unlikely to change during execution. (The same is not true of most of the macros defined in <float.h>.)

Most modern computers have eight-bit bytes, two-byte shorts, and four-byte longs. There are several common variations on this principal theme:

I found it convenient, therefore, to write a version of <limits.h> that expands to any of these common forms. It includes, as needed, a configuration file called <yvals.h>. Among other things, this file defines the macros:

Listing 1 shows the code for <limits.h>.
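Listing 1 itself is not reproduced here, but a fragment in the same spirit might look like this. The configuration macro _ILONG is hypothetical, standing in for whatever the configuration file defines; a two's-complement target with eight-bit bytes is assumed:

```c
/* Simplified sketch of a parametrized <limits.h> fragment.
   _ILONG (hypothetical) is 1 when int is 32 bits, 0 when 16 bits. */
#define _ILONG 0              /* a 16-bit-int target, for this sketch */

#define CHAR_BIT  8
#define SCHAR_MAX 127
#define SCHAR_MIN (-127 - 1)  /* the (-x - 1) trick discussed below */
#if _ILONG
#define INT_MAX 2147483647
#else
#define INT_MAX 32767
#endif
#define INT_MIN (-INT_MAX - 1)
```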

The use of the macro _2C obscures an important subtlety. On a 2's-complement machine, you cannot simply write the obvious value for INT_MIN. Why not? On a 16-bit machine, for example, the sequence of characters -32768 parses as two tokens — a minus sign and the integer constant with value 32,768. The latter has type long because it is too large to represent properly as type int. Negating this value doesn't change its type. The Standard requires, however, that INT_MIN have type int. Otherwise, you can be astonished by the behavior of a statement as innocent looking as:

printf("range is from %d to %d\n",
   INT_MIN, INT_MAX);
The only safe thing is to sneak up on the value by writing an expression such as (-32767-1). Given the way I chose to parametrize <limits.h>, you get this trickery for free.

One other subtlety should not be overlooked. I made the point earlier that preprocessor arithmetic need not model that of the execution environment. You can, in principle, compile on a host with 32-bit longs for a target with 36-bit longs. Nevertheless, the host is obliged to get the values in <limits.h> right. That means that it must do preprocessor arithmetic to at least 36 bits. The latitude spelled out for implementors by X3J11 isn't so broad after all.

Testing <limits.h>

Listing 2 is a brief sanity check you can run on <limits.h>. It is by no means exhaustive, but it does tell you whether the header is basically sane.

Note that all the action occurs at translation time. That's because all the macros must be usable within if directives. If this test compiles, it will surely run, print its success message, and exit with happy status.

You might try this test on your favorite compiler. I can only comment that I have known it to fail on some popular offerings.