Standard C

C and the Environment

P.J. Plauger


P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is The Standard C Library, published by Prentice-Hall. You can reach him at cujed@rpub.com.

Introduction

The messiest part of any programming language is where it meets the outside world. Standard C leaves essentially all such issues to the Standard C library:

This episode discusses the last category. The header <stdlib.h> contains an odd assortment of functions. Those that communicate with the environment are probably the oddest of the lot. They tend to be very system dependent in the bargain. That makes it a challenge to present much portable code to illustrate how they work.

What the C Standard Says

7.10.4 Communication with the environment

7.10.4.1 The abort function

Synopsis

#include <stdlib.h>
void abort(void);

Description

The abort function causes abnormal program termination to occur, unless the signal SIGABRT is being caught and the signal handler does not return. Whether open output streams are flushed or open streams closed or temporary files removed is implementation-defined. An implementation-defined form of the status unsuccessful termination is returned to the host environment by means of the function call raise(SIGABRT).

Returns

The abort function cannot return to its caller.

7.10.4.2 The atexit function

Synopsis

#include <stdlib.h>
int atexit(void (*func)(void));

Description

The atexit function registers the function pointed to by func, to be called without arguments at normal program termination.

Implementation limits

The implementation shall support the registration of at least 32 functions.

Returns

The atexit function returns zero if the registration succeeds, nonzero if it fails.

Forward references: the exit function (7.10.4.3).

7.10.4.3 The exit function

Synopsis

#include <stdlib.h>
void exit(int status);

Description

The exit function causes normal program termination to occur. If more than one call to the exit function is executed by a program, the behavior is undefined.

First, all functions registered by the atexit function are called, in the reverse order of their registration. [Footnote 128]

Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed.

Finally, control is returned to the host environment. If the value of status is zero or EXIT_SUCCESS, an implementation-defined form of the status successful termination is returned. If the value of status is EXIT_FAILURE, an implementation-defined form of the status unsuccessful termination is returned. Otherwise the status returned is implementation-defined.

Returns

The exit function cannot return to its caller.

7.10.4.4 The getenv function

Synopsis

#include <stdlib.h>
char *getenv(const char *name);

Description

The getenv function searches an environment list, provided by the host environment, for a string that matches the string pointed to by name. The set of environment names and the method for altering the environment list are implementation-defined.

The implementation shall behave as if no library function calls the getenv function.

Returns

The getenv function returns a pointer to a string associated with the matched list member. The string pointed to shall not be modified by the program, but may be overwritten by a subsequent call to the getenv function. If the specified name cannot be found, a null pointer is returned.

7.10.4.5 The system function

Synopsis

#include <stdlib.h>
int system(const char *string);

Description

The system function passes the string pointed to by string to the host environment to be executed by a command processor in an implementation-defined manner. A null pointer may be used for string to inquire whether a command processor exists.

Returns

If the argument is a null pointer, the system function returns nonzero only if a command processor is available. If the argument is not a null pointer, the system function returns an implementation-defined value.

Footnote

128. Each function is called as many times as it was registered.

Using the Functions

The header <stdlib.h> defines two macros for determining program termination status:

EXIT_FAILURE — Use this macro as the argument to exit or the return value from main to report unsuccessful program termination. Any other nonzero value you use instead may have different meanings for different operating systems.

EXIT_SUCCESS — Use this macro as the argument to exit or the return value from main to report successful program termination. You can also use zero. Any other value you use may have different meanings for different operating systems.

The header <stdlib.h> also declares several functions that communicate with the environment:

atexit — Use this function to register another function to be called when the program is about to terminate. You may, for example, create a set of temporary files that you wish to remove before the program terminates. Write the function void tidy(void) to remove the files. Call atexit(&tidy) once you store the name of the first file to remove. When main returns or a function calls exit, the library calls all functions registered with atexit in reverse order of registration. The library flushes streams, closes files, and removes files created by tmpfile only after it calls all registered functions. Every implementation lets you register at least 32 functions with atexit. Some accept more, so check the return value rather than count on a fixed limit.
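As a minimal sketch of the temporary-file scenario just described, the code below keeps a small table of file names and registers tidy the first time a name is recorded. The names remember_temp, temp_names, MAX_TEMPS, and TEMP_NAME_LEN are hypothetical, chosen only for illustration. Only tidy and the call atexit(&tidy) come from the text.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TEMPS     8     /* hypothetical table size */
#define TEMP_NAME_LEN 128   /* hypothetical name limit, including the null */

static char temp_names[MAX_TEMPS][TEMP_NAME_LEN];
static int temp_count = 0;
static int tidy_registered = 0;

static void tidy(void)
   {  /* remove every file recorded so far, newest first */
   while (0 < temp_count)
      remove(temp_names[--temp_count]);
   }

int remember_temp(const char *name)
   {  /* record a file to remove at exit; return zero on success */
   if (MAX_TEMPS <= temp_count || TEMP_NAME_LEN <= strlen(name))
      return (-1);         /* table full or name too long */
   if (!tidy_registered)
      {  /* register tidy once, just before storing the first name */
      if (atexit(&tidy) != 0)
         return (-1);      /* registration failed */
      tidy_registered = 1;
      }
   strcpy(temp_names[temp_count++], name);
   return (0);
   }
```

The library calls tidy before it closes any streams, and removing a file that is still open has implementation-defined behavior. A fuller version would therefore close each file before removing it.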

exit — Call exit to terminate execution from anywhere within a program. Within function main you can either call exit or write a return statement. The argument to exit (or the return value for main) should be zero or EXIT_SUCCESS, described above, to report successful termination. Otherwise it should be EXIT_FAILURE, also described above.

If you plan to migrate to C++ in the future, you should probably avoid calling exit. Calling this function initiates program termination without calling the destructors for objects with automatic storage duration. (These include arguments to function calls and declarations with storage class auto, either explicit or implicit.) That can subvert some of the tidiness, and type safety, engineered into certain classes. Get in the habit of returning from main instead. (The new exception machinery makes it easier for you to get back to main in a hurry, without bypassing the destructors.)
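The advice above about status values, and about returning from main, might be sketched as follows. The helper status_from_errors and the worker do_work mentioned in the comment are hypothetical names, not part of the library.

```c
#include <stdlib.h>

int status_from_errors(int nerrors)
   {  /* map an error count onto one of the two portable status values */
   return (nerrors == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
   }

/* Typical use, returning from main rather than calling exit:
 *
 *    int main(void)
 *       {
 *       int nerrors = do_work();    (do_work is hypothetical)
 *       return (status_from_errors(nerrors));
 *       }
 */
```

Funneling every termination status through one such function keeps system-specific values out of the rest of the program.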

getenv — Use this function to obtain a pointer to the value string associated with an environment variable. (Most systems now support some form of environment variables that you can set and inspect with the command processor.) If you name an environment variable that has no definition, you get a null pointer as the value of the function. Don't alter the value string. A subsequent call to getenv can alter the string, however. To allocate a private copy, write something like:

#include <stdlib.h>
#include <string.h>

char *copyenv(const char *name)
   {  /* get and copy environment variable; caller frees the result */
   char *s1 = getenv(name);
   char *s2 = s1 ? malloc(strlen(s1) + 1) : NULL;

   return (s2 ? strcpy(s2, s1) : NULL);
   }

system — An implementation is not obliged to have system do anything useful. If the call system(NULL) returns a nonzero value, you know that the function invokes some sort of command processor. But the C Standard imposes no requirements on what such a creature does. The only portable use for system is to provide uncritical access to a command processor. An editor, for example, may accept a line that begins with an exclamation point. It passes the remainder of the line as the string argument to system. How the local command processor interprets the line is of no concern to the Standard C library proper.
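The editor example can be sketched in a few lines. The function name shell_escape and the value -1 stored when no command processor exists are assumptions made for this illustration. The C Standard defines neither.

```c
#include <stdlib.h>

int shell_escape(const char *line, int *status)
   {  /* pass "!command" lines to the command processor */
   if (line[0] != '!')
      return (0);          /* not a shell escape, let the editor handle it */
   if (system(NULL) == 0)
      *status = -1;        /* hypothetical marker: no command processor */
   else
      *status = system(line + 1);   /* value is implementation-defined */
   return (1);             /* the line was treated as a shell escape */
   }
```

The caller can display the value stored in status, but it should not interpret that value portably, since each implementation defines what system returns.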

Implementing the Functions

As I explained above, the C Standard permits each system to specify two preferred argument values for exit (or return values from main). The macro EXIT_FAILURE reports unsuccessful termination. The macro EXIT_SUCCESS reports successful termination. For historical reasons, the value zero also reports successful termination. Thus, I chose to tailor only the code for unsuccessful termination. The macro _EXFAIL typically has the value 1. This macro is defined in the internal header (for this implementation) "yvals.h".

Three functions deal with program termination: abort (Listing 1), atexit (Listing 2), and exit (Listing 3). abort simply reports the signal SIGABRT. Should the handler for that signal return, the function exits with unsuccessful status. atexit is almost as simple. It just pushes a function pointer on the stack defined by the data objects _Atcount and _Atfuns. A call to exit pops this stack and calls the corresponding functions.
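The push and pop discipline described here can be illustrated with a small stand-alone sketch. It is not the code from Listings 2 and 3. The names push_handler, run_handlers, handler_stack, and handler_count are hypothetical stand-ins that avoid the reserved names a library would use, and the two sample handlers exist only to show the calling order.

```c
#include <stddef.h>

#define NHANDLERS 32   /* the minimum the C Standard requires */

static void (*handler_stack[NHANDLERS])(void);
static size_t handler_count = 0;

int push_handler(void (*func)(void))
   {  /* what atexit does: push func, return zero on success */
   if (NHANDLERS <= handler_count)
      return (-1);     /* stack is full */
   handler_stack[handler_count++] = func;
   return (0);
   }

void run_handlers(void)
   {  /* what exit does first: pop and call, last registered first */
   while (0 < handler_count)
      (*handler_stack[--handler_count])();
   }

/* Sample handlers that record the order in which they run. */
char order_log[3];
size_t order_len = 0;

void note_first(void)
   {
   order_log[order_len++] = '1';
   }

void note_second(void)
   {
   order_log[order_len++] = '2';
   }
```

Pushing note_first and then note_second, followed by a call to run_handlers, leaves the characters '2' and '1' in order_log, the reverse of the registration order.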

exit also closes any open files before it terminates program execution. How a program terminates is system dependent. You can usually call some function to do so, however. As with several other interface primitives, I stuff that problem into the internal header "yfuns.h". It either declares a function or defines a macro called _Exit that accepts the exit status and terminates execution. In a UNIX system, for example, _Exit can be just an alternate name for the exit system service.

Listing 4 shows the file getenv.c. It must know how to access the environment list that defines all the environment variables. It must also know how to walk that list to scan for an environment variable with the requested name. The version I show here works under UNIX. It also works under a variety of other operating systems. (The MS-DOS version depends on the memory model you choose, or you must use an occasional far pointer.)

getenv assumes that _Envp points to the first of a sequence of null-terminated strings. An empty string terminates the sequence. Each string in the sequence has the form name=value. If the argument string matches all characters before the equal sign, the function returns a pointer to the first character past the equal sign. Once again, I leave it to the internal header "yfuns.h" to define or declare _Envp.
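A search over a list of this form might look like the sketch below. It takes the list as an explicit argument instead of using _Envp, and the name find_env is hypothetical, so it stands beside Listing 4 as an illustration and does not reproduce it.

```c
#include <stddef.h>
#include <string.h>

const char *find_env(const char *envp, const char *name)
   {  /* search a list of name=value strings ended by an empty string */
   size_t n = strlen(name);
   const char *s;

   for (s = envp; *s != '\0'; s += strlen(s) + 1)
      if (strncmp(s, name, n) == 0 && s[n] == '=')
         return (s + n + 1);   /* first character past the equal sign */
   return (NULL);              /* no entry with that name */
   }
```

The test s[n] == '=' matters. Without it, a search for TERM would also match an entry such as TERMCAP=value.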

Some operating systems support an environment list, but not of this form. Others support an environment list that is not directly addressable as a C data object. Either case may require that you copy the value string to a static buffer that is private to getenv. You must then introduce a function such as _Getenv that lets you supply your own static buffer to hold the value string. I chose to omit that layer of protection against future changes.

Listing 5 shows the file system.c. It shows how a UNIX version of the function system might invoke a command processor from a C program. As usual, the function assumes the existence of several UNIX system services with suitable reserved names. And as usual, the version I show here can be improved. Wiring in the pathname "/bin/sh" as the name of the command processor is at best naive, at worst bad manners. Several more sophisticated schemes are in common use for specifying an assortment of command processors. The function can also return more useful status information to programs that care.
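For comparison, here is a rough sketch of one way to address two of those shortcomings on a POSIX system. It is not Listing 5. The name run_command is hypothetical, the caller supplies the command processor's pathname, and the code assumes the POSIX services fork, execl, waitpid, and access, plus a command processor that accepts a -c option the way the Bourne shell does. It omits the signal handling that a production version of system performs.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int run_command(const char *shell, const char *cmd)
   {  /* run cmd under shell; return its exit status, or -1 on failure */
   pid_t pid;
   int status;

   if (cmd == NULL)
      return (access(shell, X_OK) == 0);   /* nonzero if shell is usable */
   pid = fork();
   if (pid < 0)
      return (-1);                         /* could not create a process */
   if (pid == 0)
      {  /* child: become the command processor */
      execl(shell, shell, "-c", cmd, (char *)NULL);
      _exit(127);                          /* exec failed */
      }
   while (waitpid(pid, &status, 0) < 0)
      if (errno != EINTR)
         return (-1);                      /* wait failed */
   return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
   }
```

A caller might obtain the pathname from an environment variable or a configuration file and fall back on "/bin/sh" only as a last resort.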

This article is excerpted from P.J. Plauger, The Standard C Library, (Englewood Cliffs, N.J.: Prentice-Hall, 1992).