April 1992/Standard C

Columns

Standard C

The Header <stdlib.h>

P.J. Plauger

P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is The Standard C Library, published by Prentice-Hall. You can reach him care of The C Users Journal or via Internet at pjp@plauger.uunet.uu.net.

Introduction
The header <stdlib.h> is a hodgepodge. Committee X3J11 invented this header as a place to define macros and declare functions that had no other sensible home:

Many existing functions, such as abs and malloc, had no traditional headers to declare them. X3J11 felt strongly that every functions should be declared in a standard header. If such a function seemed out of place in all other headers, it ended up declared in <stdlib.h>.

New groups of macros and functions ended up in new standard headers wherever possible. <float.h> and <locale.h> are clear examples. Additions to existing groups ended up in existing headers. strcoll, declared in <string.h>, and strftime, declared in <time.h>, are also fairly clear. Other macros and functions are harder to categorize. These ended up defined or declared in <stdlib.h>.
This header is not the only hodgepodge. I discuss the evolution of the header <stddef.h> in Standard C, CUJ December 1991.
To provide some structure, I organize the functions into six groups:

integer math (abs, div, labs, and ldiv) — performing simple integer arithmetic

algorithms (bsearch, qsort, rand, and srand) — capturing operations complex and widespread enough to warrant packaging as library functions

text conversions (atof, atoi, atol, strtod, strtol, and strtoul) — determining encoded arithmetic values from text representations

multibyte conversions (mblen, mbstowcs, mbtowc, wcstombs, and wctomb) mapping between multi-byte and wide-character encodings

storage allocation (calloc, free, malloc, and realloc) — managing a heap of data objects

environmental interactions (abort, atexit, exit, getenv, and system) — interfacing between the program and the execution environment
I discuss separately how to implement the functions in each of these groups. This month, I cover only the first two groups. I won't bother to present the header as a whole. It's pretty straightforward.

The C Standard on Integer Math

7.10.6 Integer arithmetic functions

7.10.6.1 The abs function

Synopsis

#include <stdlib.h> int abs(int j);

Description
The abs function computes the absolute value of an integer j. If the result cannot be represented, the behavior is undefined.130 [FN130. The absolute value of the most negative number cannot be represented in two's complement.]

Returns
The abs function returns the absolute value.

7.10.6.2 The div function

Synopsis

#include <stdlib.h> div_t div(int numer, int denom);

Description
The div function computes the quotient and remainder of the division of the numerator numer by the denominator denom. If the division is inexact, the resulting quotient is the integer of lesser magnitude that is the nearest to the algebraic quotient. If the result cannot be represented, the behavior is undefined; otherwise, quot * denom + rem shall equal numer.

Returns
The div function returns a structure of type div_t, comprising both the quotient and the remainder. The structure shall contain the following members, in either order:

int quot; /* quotient */ int rem; /* remainder */

7.10.6.3 The labs function

Synopsis

#include <stdlib.h> long int labs(long int j);

Description
The labs function is similar to the abs function, except that the argument and the returned value each have type long int.

7.10.6.4 The ldiv function

Synopsis

#include <stdlib.h> ldiv_t ldiv(long int numer, long int denom);

Description
The ldiv function is similar to the div function, except that the arguments and the members of the returned structure (which has type ldiv_t) all have type long int.

Using the Integer Arithmetic Functions
Here is a brief summary of the four arithmetic functions: abs — Call abs(x) instead of writing the idiom x < 0 ? -x : x. A growing number of Standard C translators generate inline code for abs that is smaller and faster than the idiom. In addition, you avoid the occasional surprise when you inadvertently evaluate twice an expression with side effects. Note that on a two's-complement machine, abs can generate an overflow.
div — You call div for one of two reasons:

div always computes a quotient that truncates toward zero, along with the corresponding remainder, regardless of how the operators / and % behave in a given implementation. This can be important when one of the operands is negative. The expression (-3)/2 can yield either -2 or -1, while div(-3, 2).quot always yields -1. Similarly, (-3)%2 can yield either 1 or -1, while div(-3, 2).rem always yields -1.

div computes both the quotient and remainder at the same time. That can be handy when you need both results. It might even be more efficient if the function expands to inline code that contains only a single divide.
Note that the members of the resulting structure type div_t can occur in either order. Don't make any assumptions about the representation of this structure.
labs — is the long version of abs.
ldiv — is the long version of div.

Implementing the Arithmetic Functions
Listing 1 shows the file abs.c. The absolute value function abs is the simplest of the integer math functions. You cannot provide a masking macro, however, because you have to access the value of the argument twice. Some computer architectures have special instructions for computing the absolute value. That makes abs a prime candidate for special treatment as a built-in function generating inline code.
Listing 2 shows the file div.c. It provides a portable implementation of the div function. You can eliminate the test if you know that negative quotients truncate toward zero. Most computer architectures have a divide instruction that develops both quotient and remainder at the same time. Those that develop proper negative quotients are also candidates for built-in functions. An implementation is at liberty to reorder the members of the structure type div_t to match what the hardware generates.
Listing 3 shows the file labs.c and Listing 4 shows the file ldiv.c. Both define functions that are simply long versions of abs and div.

The C Standard on the Algorithmic Functions

7.10.2 Pseudo-random sequence generation functions

7.10.2.1 The rand function

Synopsis

#include <stdlib.h> int rand(void);

Description
The rand function computes a sequence of pseudo-random integers in the range 0 to RAND_MAX.
The implementation shall behave as if no library function calls the rand function.

Returns
The rand function returns a pseudo-random integer.

Environmental Limit
The value of the RAND_MAX macro shall be at least 32767.

7.10.2.2 The srand function

Synopsis

#include <stdlib.h> void srand(unsigned int seed);

Description
The srand function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand. If srand is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated. If rand is called before any calls to srand have been made, the same sequence shall be generated as when srand is first called with a seed value of 1.
The implementation shall behave as if no library function calls the srand function.

Returns
The srand function returns no value.

Example
The following functions define a portable implementation of rand and srand.

static unsigned long int next = 1; int rand(void) /* RAND_MAX assumed to be 32767 */ { next = next * 1103515245 + 12345; return (unsigned int) next/65536) % 32768; } void srand(unsigned int seed) { next = seed; }

7.10.5 Searching and sorting utilities

7.10.5.1 The bsearch function

Synopsis

#include <stdlib.h> void *bsearch(const void *key, const void *base, size_t nmemb, size_t size, int (*compar const void *, const void *));

Description
The bsearch function searches an array of nmemb objects, the initial element of which is pointed to by base, for an element that matches the object pointed to by key. The size of each element of the array is specified by size.
The comparison function pointed to by compar is called with two arguments that point to the key object and to an array element, in that order. The function shall return an integer less than, equal to, or greater than zero if the key object is considered, respectively, to be less than, to match, or to be greater than the array element. The array shall consist of: all the elements that compare less than, all the elements that compare equal to, and all the elements that compare greater than the key object, in that order.129 [FN129. In practice, the entire array is sorted according to the comparison function.]

Returns
The bsearch function returns a pointer to a matching element of the array, or a null pointer if no match is found. If two elements compare as equal, which element is matched is unspecified.

7.10.5.2 The qsort function

Synopsis

#include <stdlib.h> void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));

Description
The qsort function sorts an array of nmemb objects, the initial element of which is pointed to by base. The size of each object is specified by size.
The contents of the array are sorted into ascending order according to a comparison function pointed to by compar, which is called with two arguments that point to the objects being compared. The function shall return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second.
If two elements compare as equal, their order in the sorted array is unspecified.

Returns
The qsort function returns no value.

Using the Algorithmic Functions
Here is a brief summary of the four algorithmtic functions:
bsearch — Use this function to search any array whose elements are ordered by pairwise comparisons. You define the ordering with a comparison function that you provide. For example, you can build a keyword lookup function from the basic form as shown in Listing 5.
A few caveats:

If a key compares equal to two or more elements, bsearch can return a pointer to any of these elements.

Beware of changes in how elements sort when the execution character set changes — call qsort, described below, with a compatible comparison function to ensure that an array is properly ordered.

Be careful using the functions strcmp or strcoll, declared in <string.h>, directly. Both require that strings be stored in the array to be searched. You cannot use them to search an array of pointers to strings. To use strcmp, for example, you must write a function pointer argument that looks like (int (*)(const void *, const void *))&strcmp.
qsort — Use this function to sort any array whose elements are ordered by pairwise comparisons. You define the ordering with a comparison function that you provide. The comparison function has a specification similar to that for the function bsearch, described above. Note, however, that the bsearch comparison function compares a key to an array element. The sort comparison function compares two array elements.
A few caveats:

Don't assume that the function uses the "Quicksort" algorithm, despite the name. It may not. If two or more elements compare equal, qsort can leave these elements in any relative order. Hence, qsort is not a stable sort.

Beware of changes in how elements sort when the execution character set changes.

Be careful using the functions strcmp or strcoll declared in <string.h>, directly. Both require that strings be stored in the array to be sorted. You cannot use them to sort an array of pointers to strings. To use strcmp, for example, you must write a function pointer argument that looks like (int (*)(const void *, const void *))&strcmp.
rand — Call rand to obtain the next value in a pseudo-random sequence. You get exactly the same sequence following each call to srand with a given argument value. That is often desirable behavior, particularly when you are debugging a program. If you want less predictable behavior, call clock or time, declared in <time.h> to obtain an argument for srand. The behavior of rand can vary among implementations. If you want exactly the same pseudo-random sequence at all times, copy the version presented here.
Use RAND_MAX to scale values returned from rand. For example, if you want random numbers of type float distributed over the interval [0.0, 1.0], write the expression (float)rand()/RAND_MAX. The value of RAND_MAX is at least 32,767.
srand — See rand above. The program effectively calls srand(1) at program startup.

Implementing the Algorithmic Functions
Listing 6 shows the file qsort.c. It defines the related function qsort that sorts an array beginning at base. I introduced the type _Cmpfun just to simplify the declaration of arguments for the functions bsearch and qsort. Don't use this declaration in code that you write if you want it to be portable to other implementations.
This logic is much less simple and more debatable. It is based on the Quicksort algorithm first developed by C.A.R. Hoare. That requires you to pick a partition element, then partially sort the array about this partition. You can then sort each of the two partitions by recursive application of the same technique. The algorithm can sort quite rapidly. It can also sort very slowly.
How best to choose the pivot element is the debatable issue. Pick the first element and an array already in sort eats a lot of time. Pick the last element and an array in reverse sort eats a lot of time. Work too hard at picking an element and all arrays eat a lot of time. I chose simply to pick the last element. That favors arrays that need little rearranging. You may have reason to choose another approach.
qsort calls itself to sort the smaller of the two partitions. It loops internally to sort the larger of the two. That minimizes demands on dynamic storage. At worst, each recursive call must sort an array half as big as the earlier call. To sort N elements requires recursion no deeper than log2(N) calls. (You can sort 1,000,000 elements with at most 20 recursive calls.)
Listing 7 shows the file bsearch.c. The function bsearch performs a binary search on the sorted array beginning at base. The logic is simple but easy to get wrong.
Listing 8 shows the file rand.c. The function rand generates a pseudo-random sequence using the algorithm suggested in the C Standard. That has reasonable properties, plus the advantage of being widely used. One virtue of a random number generator is randomness. Another virtue, ironically, is reproducibility. You often need to check that a calculation based on pseudo-random numbers does what you expect. The arithmetic is performed using unsigned long integers to avoid overflows.
Listing 9 shows the file srand.c. The function srand simply sets _Randseed, the seed for the pseudo-random sequence generated by rand. I provide a masking macro for srand. Hence, the header <stdlib.h> declares _Randseed, defined in rand.c.

References
Donald Knuth, The Art of Computer Programming, Vols. 1-3 (Reading, Mass.: Addison-Wesley, 1967 and later). Here is a rich source of algorithms, complete with analysis and tutorial introductions. Volume 1 is Fundamental Algorithms, volume 2 is Seminumerical Algorithms, and volume 3 is Sorting and Searching. Some are in second edition.
You will find oodles of information on:

maintaining a heap

computing random numbers

searching ordered sequences

sorting

converting between different numeric bases
Before you tinker with the code presented here, see what Knuth has to say.
This article is excerpted in part from P.J. Plauger, The Standard C Library, (Englewood Cliffs, N.J.: Prentice-Hall, 1992).