Columns


Questions & Answers

More On const

Ken Pugh


Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C language courses for corporations. He is the author of C Language for Programmers and All On C, and is a member of the ANSI C committee. He also does custom C programming for communications, graphics, and image databases. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 493-4390. When you hear the answering message, press the * button on your telephone. Ken also receives email at kpugh@dukemvs.ac.duke.edu (Internet) or dukeac!kpugh (UUCP).

Last month, one of the questions revolved around the const modifier.

The const type modifier is difficult to understand, especially when combined with pointers. Although some authors put it first in the declaration (along with the storage class), I am inclined to place the const modifier after the data type, since that more clearly reflects what the modifier is doing.

A non-modified type such as

int i = 5;
sets aside memory for the variable i and does not give any information as to whether the contents will be changed during the program's execution.

int const i = 5;
or

const int i = 5;
both declare that i is an integer constant. That is, its contents will not be modified during execution. Thus the compiler could place it in read-only memory (especially if it was a global or static variable) or it could use this information to optimize the code.

Although you probably would not use this for a single integer in place of a #define, you might use this for an array or structure that was initialized and never altered.
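
As a rough sketch of that last case, a lookup table can be declared const so that the compiler is free to place it in read-only storage. The names days_in_month and days_in are made up for this illustration.

```c
/* Hypothetical lookup table: initialized once, never altered. */
static int const days_in_month[12] =
   {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

/* days_in_month[1] = 29;   Invalid anywhere: the elements are const */

/* Returns the number of days in a month (1 through 12) of a
   non-leap year, or 0 if the month number is out of range. */
int days_in(int month)
   {
   if (month < 1 || month > 12)
      return 0;
   return days_in_month[month - 1];
   }
```

A call such as days_in(2) only reads the table, so nothing in the program can overwrite the initial values by accident.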

The keyword static is used for local variables whose memory is allocated at load time. You can initialize static variables and not change them. However the compiler does not prevent you from accidentally writing over their initial contents. The const variable is designed to protect against that occurrence. Its protection is compiler dependent, not absolute. A particular compiler/processor pair may not support read-only memory.

The possibilities with static and const are:

static int i = 5;            /*  Static variable whose contents may change */
static int const i = 5;      /*  Static variable whose contents will not be changed */
int const i = 5;             /*  Variable whose contents will not be changed */
Now that I've covered the simple use of const, I would like to discuss its use with pointers. There are two possibilities for placing const.

char const *pc1;  /*  const char *pc1; */
char * const pc2;
pc1 is a pointer that points to characters that are of type const. You cannot dereference the pointer (with a *) and assign something to that location. The following is valid:

char const c_const;
char const another_c_const;
char const *pc1;
pc1 = &c_const;
but you cannot then assign

*pc1 = 'A';
since you are attempting to change the char constant. But you can perform:

pc1 = &another_c_const;
since the contents of pc1 are not const, only what it points to.

On the other hand, pc2 is a constant pointer to characters that can change. For example, with:

char variable_c;
char another_variable_c;
char * const pc2 = &variable_c;
then

*pc2 = 'A';
is perfectly valid, but

pc2 = &another_variable_c;
is not, since you are trying to change the contents of pc2, which is to remain constant.

Because of pointers, I prefer placing the const after the type. If you declare:

const char *pc1;
the variable pc1 is not what is constant. It is the char to which it points. As shown previously, the way to declare that pc1 is constant is:

char * const pc1;
This arrangement appears inconsistent and confusing. Using the const after the type follows a regular pattern. If the const keyword appears just before the variable name, then that variable is const.
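
The two kinds of pointer can be tried side by side. The sketch below is only an illustration (the function name const_pointer_demo and its variables are invented here). The statements a compiler should reject are left inside comments.

```c
/* Returns the character that ends up in first. */
char const_pointer_demo(void)
   {
   char first = 'x';
   char second = 'y';
   char const *pc1 = &first;    /* pointer to const char */
   char * const pc2 = &first;   /* const pointer to char */

   pc1 = &second;     /* Valid: pc1 itself is not const */
   /* *pc1 = 'A';        Invalid: what pc1 points to is const */

   *pc2 = 'A';        /* Valid: what pc2 points to is not const */
   /* pc2 = &second;     Invalid: pc2 itself is const */

   (void) pc1;        /* pc1 is not used further in this sketch */
   return first;
   }
```

In both declarations, the const applies to whatever sits immediately to its left, which is the pattern described above.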

Note that you can assign a less constant type to a more constant type. For example, the following is valid:

char *pc3;
char const *pc1;
pc1 = pc3;
You just cannot use pc1 to alter any values being pointed at.

*pc3 = 'a';  /*  Valid */
*pc1 = 'a';   /*  Invalid */
We could use const in both places in the declaration. For example:

char const * const pc3 = &c_const;
Both of the following are invalid:

*pc3 = 'A';
pc3 = &another_c_const;
Now for one more step. Let's look at double pointers, since that is the crux of your question. Given these declarations:

char const char_const;
char variable_char;
char * ptr;
char const * ptr_to_char_const;
char * const ptr_const_to_char = &variable_char;
char const * const ptr_const_char_const = &char_const;
You have at least the following possibilities for the declaration and assignment of the double pointer:

char ** ptr_to_ptr_to_char;
char const ** ptr_to_ptr_to_char_const;
char * const * ptr_to_ptr_const_to_char;
char ** const ptr_const_to_ptr_to_char = &ptr;

ptr_to_ptr_to_char = &ptr;
ptr_to_ptr_to_char_const = &ptr_to_char_const;
ptr_to_ptr_const_to_char = &ptr_const_to_char;
Notice that the last of these declarations has to include the initialization, since the variable itself is being declared as const. I won't go into an explanation of all of these. You can work out the remaining possible variable declarations, such as:

char const * const * ptr_to_ptr_const_to_char_const;
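
To make that last declaration concrete, here is a small sketch of my own (the function total_length is hypothetical) that walks a table of strings through such a pointer. Only the outermost pointer may change.

```c
#include <stddef.h>

/* Counts the characters in a NULL-terminated table of strings.
   Neither the pointers in the table nor the characters they
   point to can be changed through table. */
size_t total_length(char const * const * table)
   {
   size_t total = 0;

   while (*table != NULL)
      {
      char const *s = *table;
      while (*s != '\0')
         {
         ++total;
         ++s;
         }
      ++table;             /* Valid: table itself is not const */
      /* *table = NULL;       Invalid: *table is a const pointer */
      /* **table = 'a';       Invalid: **table is a const char */
      }
   return total;
   }
```
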
When a pointer to const appears in a function's parameter declaration, it implies that the function will not change the variable pointed at by the parameter that is passed. The actual variable pointed at by the parameter can be a non-const variable.

This is not needed for the parameters themselves, as a copy is always passed to the function. With ints, for example, it is redundant to state in a prototype:

int integer_compare_function ( int const integer1,
   int const integer2);
For pointers to strings that the called routine may change, it is also redundant to state:

int change_strings(char * const string);
Although it is extraneous in the prototype, you may wish to specify the parameters in the function header this way. Inside the function you would not be able to accidentally alter their values.
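
A sketch of that arrangement, reusing the integer comparison prototype from above: the prototype omits const, while the function header adds it. The two declarations are compatible, because qualifiers on the parameters themselves are ignored when the declarations are compared. The body is my own filler.

```c
/* Prototype, as a caller would see it: no const needed. */
int integer_compare_function(int integer1, int integer2);

/* Function header: const keeps the body from altering its copies. */
int integer_compare_function(int const integer1, int const integer2)
   {
   /* integer1 = 0;   Invalid: integer1 is const inside the function */
   if (integer1 < integer2)
      return -1;
   if (integer1 > integer2)
      return 1;
   return 0;
   }
```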

If the actual strings are not going to be changed, then you would code the prototype as:

int string_compare_function(char const *string1,
   char const *string2);
Inside the function, you could not make assignments as:

*string1 = 'a';
but you could perform:

string1 = string2;
If the function were being passed handles (pointers to pointers), then the function could be prototyped as:

int string_handle_compare_function(char const * const
   * string1, char const * const * string2);
You would not be permitted to use either:

* string1 = *string2;
**string1 = 'a';
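
One possible body for such a function is sketched below. It simply hands the strings to strcmp, which is an assumption on my part about what the comparison should do.

```c
#include <string.h>

/* Compares the strings reached through two handles.
   Returns a negative, zero, or positive value, as strcmp does. */
int string_handle_compare_function(char const * const * string1,
   char const * const * string2)
   {
   /* *string1 = *string2;   Invalid: *string1 is a const pointer */
   /* **string1 = 'a';       Invalid: **string1 is a const char */
   return strcmp(*string1, *string2);
   }
```

Note that in C the caller's handle should already be the address of a char const * variable. A plain char ** is not converted to char const * const * without a cast.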

Readers' Replies

Indentation

Generally, I find your column most interesting. Keep up the good work. However, I was a bit disappointed by your comments on Brian S. Merson's letter in the September issue.

You said, "Your style has a lot going for it." While there is nothing particularly bad about his style of indentation, it seems to me to be a classic case of a solution to a non-problem for two reasons:

1) Even given his program as it stood, he could just have drawn a line to the left of the braces which would not then have run through the first character of the code. Ensuring that all the code was legible would seem to be a basic precaution taken before debugging.

2) What are aflag and eflag doing in a serious program? Variables should have meaningful names with a minimum Hamming distance between them, i.e., they should differ in a number of positions.

Donal Lyons
Dublin, Ireland

The style to which you refer in #1 looked like:

for (...)
   {
   /* Internals indented one space to right of brace */
   }
I agree with you regarding #2. That's why I started the Obscure Name Contest. (KP)

Pointer Blues

There may be a simpler explanation than "pointer blues" for the problem mentioned by Mark Petrovic in the Q?/A! column in The C Users Journal September 1990. He reported that some simple programs did not run correctly until a printf() statement was added for debugging. Some C output functions are buffered and do not appear to work when stepping through a program in debug mode, until the buffer is dumped — which a printf() statement does. His description of the way in which the programs didn't work isn't detailed enough to know if this is what he was encountering.

Janet Price
Kalamazoo, MI

You are correct. The printf buffer is usually not dumped to the screen until a \n appears. Without the code though, it is difficult to state precisely what went on. (KP)
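
If buffering is the suspect, one way to rule it out is to flush the stream explicitly after writing. The helper below is a hypothetical sketch, not something taken from the reader's program.

```c
#include <stdio.h>

/* Writes a message without a trailing newline and then forces it
   out of the stdout buffer. Returns 0 on success, EOF on failure. */
int show_progress(char const *message)
   {
   if (printf("%s", message) < 0)
      return EOF;
   return fflush(stdout);
   }
```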

repeat_format

In your Q?/A! column (The C Users Journal, September 1990, page 111), I noticed another reader (Doug Oliver, Wichita, KS) had written a function, repeat_format, which expands format strings with embedded repeat specifications to make printf-compatible format strings. He pointed out that the function returned the address of data local to repeat_format, which technically is not guaranteed valid, but which may be copied and used anyway.

The reason this data may be so used has to do with stack usage. Each time the function repeat_format is called, space is reserved on the stack for local data. My compiler (Turbo C 1.5), and perhaps his (he used QuickC), places the string space for newfmt[] farthest away in the stack area from BP (the processor register relative to whose value local variables and parameters are referenced in the compiled functions' code). Also, the 256 bytes reserved for newfmt[] in repeat_format is nearly 100 bytes longer than any of the format strings generated by the sample driver.

In the sample driver and in the recursive calls within repeat_format, the returned address is used as the source parameter to strcpy, e.g. strcpy(newstr, repeat_format(fmtstr[i])), whose stack frame uses a portion of repeat_format's now-invalid data area. Fortunately, in this case other local variables, which are not used for anything after the function returns, are the victims.

To verify this, I changed a copy of the C source so that newstr[152] was used as the last local data item declared in repeat_format. (152 is only slightly longer than the minimum string length necessary to hold the longest format string returned to the sample program.) The result was that, when run, garbage was displayed at the end of the longest format string. As long as repeat_format is left in its original form, and used as a source parameter for strcpy, it should be safe: tempstr[] and the other local variables provide more than enough room for strcpy to execute safely.

It should be noted, though, that the function is somewhat stack-hungry, requiring approximately 350 bytes minimum per call, and could run into trouble in some memory models if deeply-nested format strings were expanded in a data-intensive application.

I also found it interesting to use this function with dprintf, which also appeared in The C Users Journal (September 1990). By modifying dprintf so that it uses repeat_format to expand the format string passed to it, and passing this result to vdprintf, format strings with repeat format specifiers may be treated as ordinary format specifiers in calls to dprintf.

While testing these functions, I decided to check them for Microsoft compatibility with Learn C, a subset of QuickC developed by Microsoft and marketed by Microsoft Press. Unfortunately, dprintf would not run under Learn C, but, instead, generated the message unresolved external: ftol. I subsequently found out that it would not convert from float to int. I am still waiting to hear from Microsoft about the problem as of this writing.

Thank you. I enjoy your column and The C Users Journal.

John W. Bandy
Armuchee, GA

Thank you for your analysis of the code. Using the address of an automatic variable, when that variable is no longer allocated, is a cause for indeterminate code. The algorithm here will work as long as the processor/compiler does not make the previously allocated storage unavailable (i.e., to cause an addressing exception to occur).

One could allocate a large static buffer to fill up with the generated string and return the address of that buffer. This is the way that some standard library routines work. Whether statics or automatics are used, the routine should check to see it does not exceed the length of the array. Some compilers allocate automatic variables on the same stack used for return addresses. Exceeding array limits causes interesting debugging problems on those machines. (KP)
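
The static-buffer approach, with the length check, might look roughly like the sketch below. This is not the repeat_format code. The function repeat_string and the RESULT_SIZE limit are invented purely to show the pattern.

```c
#include <stddef.h>
#include <string.h>

#define RESULT_SIZE 64   /* assumed limit for this sketch */

/* Builds count copies of unit in a static buffer and returns its
   address, or NULL if the result would not fit. The buffer is
   overwritten by the next call. */
char *repeat_string(char const *unit, int count)
   {
   static char result[RESULT_SIZE];
   size_t unit_length = strlen(unit);
   size_t used = 0;
   int i;

   for (i = 0; i < count; i++)
      {
      if (unit_length >= RESULT_SIZE - used)
         return NULL;           /* no room for unit plus '\0' */
      memcpy(result + used, unit, unit_length);
      used += unit_length;
      }
   result[used] = '\0';
   return result;
   }
```

Because the buffer is static, the returned address stays valid after the function returns. The price is that each call overwrites the previous result, so a caller that needs to keep a result has to copy it.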

External Identifiers

I would like to address Andreas Lang's question in the October 1990 issue about keeping all public variables, extern or not, in one header file. Learning from other programmers' code and coming up with a few ideas of my own, I have developed a scheme for just this purpose.

Listing 1 shows how I would write the example in question. If MAIN is #defined in a file that includes this header file, EXT is expanded to nothing, and the initialization is taken by the preprocessor. Otherwise, EXT is expanded to extern and the initialization is skipped.

I have found this scheme very useful: I can keep a single header file for variables and initialize them too.

Bill Sharar II
Denton, Texas

I was very much interested in your response to Andreas Lang in the October 1990 C Users Journal. He is attempting to create a single header file to be shared among any number of source files. This header file would allow reference to the global variables for the program by declaring them as external to all the source files except for one (the "main" source file). His attempt works well except in the case where a global variable must be initialized. I have another solution to the problem, which, though inelegant, would allow him to do what he wishes.

Using his example, test.h would look like Listing 2.

The remainder of his example would be unchanged. (I find the use of the keyword Global clearer than the EXTERN used in the example. It better defines what we are trying to do, which is declare a global variable.)

Yes, it is kind of ugly, but it does work (at least when using Microsoft C v5.1). It permits the same code string to declare, define, and initialize the variable as needed, since everything is in one place, with all the associated maintenance benefits.

In that same issue, in your response to Frederick C. Smith, you state that stderr cannot be redirected under MS-DOS. This statement requires clarification. While it is true that there is no command-line facility available to redirect stderr, such as >& in the C-shell under UNIX, redirection under program control is possible by calling the freopen function:

freopen("error.log", "w", stderr);
system("cmd");
(These functions do return values that should be checked, but I omitted this for the sake of brevity.)

The previous code sequence would redirect to the new file error.log, then run the program cmd. Any error messages from cmd would then be redirected to error.log (assuming they were written to stderr).
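
With the omitted check restored, the sequence might be wrapped up as in the sketch below. The function name run_with_error_log is hypothetical, and what system() returns depends on the implementation.

```c
#include <stdio.h>
#include <stdlib.h>

/* Redirects stderr to log_name, then runs command.
   Returns the value of system(), or -1 if the redirection failed. */
int run_with_error_log(char const *log_name, char const *command)
   {
   if (freopen(log_name, "w", stderr) == NULL)
      return -1;
   return system(command);
   }
```

A call such as run_with_error_log("error.log", "cmd") would then correspond to the two statements shown above.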

I look forward to your column each month in The C Users Journal. I find it educational and well-written, and like the rest of the magazine, well worth reading. Thank you.

David Hansen

I think I have a solution to Andreas Lang's problem (October 1990) about initializing external variables in header files without having to write them in any other files.

The macros EXTERN and INIT are defined in the header file as shown in Listing 3. The variable is then written in the header file as:

EXTERN int i INIT(5);
So, in the main file, the preprocessor produces:

int i = 5;
and in all other files it produces:

extern int i;
Note that the INIT macro will work independent of val's type (I think). Is there anything questionable (or just plain wrong) about this method?

Larry Leonard
Norcross, GA

The only problem with this method is that it looks awkward when initializing an array or anything that requires more than one line of initial values. For example:

EXTERN int array[15] INIT({1,2,3,4,5,6,7,8,9,10,\
         11,12,13,15});
Notice both the braces next to parentheses and the need for the \ to carry the initializing string to the next line. [Some older compilers required the backslash, but it is not required in standard C. - pjp]
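
Listing 3 is not reproduced here, so the following is only a guess at the general shape of such a header, not the reader's actual code. It also departs from a one-parameter INIT in one respect. A standard preprocessor treats the commas in a braced list as argument separators, since only parentheses protect them, so a one-parameter macro may be rejected for the array case. A variadic macro (a later addition to C, in C99) sidesteps this. The #define MAIN line stands in for the main source file.

```c
#define MAIN   /* assumption: only the main source file defines this */

/* Contents of the shared header (hypothetical) */
#ifdef MAIN
#define EXTERN                   /* definition in the main file */
#define INIT(...) = __VA_ARGS__  /* keep the initializer */
#else
#define EXTERN extern            /* declaration everywhere else */
#define INIT(...)                /* drop the initializer */
#endif

EXTERN int i INIT(5);
EXTERN int array[15] INIT({1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
   11, 12, 13, 14, 15});
```

Without MAIN defined, the same two lines reduce to extern int i; and extern int array[15];.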

Even given the previous three responses, I think I still prefer using a single header file that contains the variable declarations with the keyword extern and a separate source file with the declarations and initial values.

Most of the time I try to avoid using global variables entirely and use access routines that look like:

static int global_variable = 5;

void set_global_variable(int value)
   {
   global_variable = value;
   }

int get_global_variable(void)
   {
   return global_variable;
   }
There is no header file to include and only one place where the definition has to appear. (KP)