Columns


Questions and Answers

More Pointer Problems

Ken Pugh


Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C language courses for corporations. He is the author of C Language for Programmers and All On C, and was a member of the ANSI C committee. He also does custom C programming for communications, graphics, image databases, and hypertext. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 493-4390. When you hear the answering message, press the * button on your telephone. Ken also receives email at kpugh@dukemvs.ac.duke.edu (Internet).

Q

I have a wall full of technical manuals, none of which even begins to discuss the OBJ file format. Can you tell me where I can get information on OBJ file internals?

Dennis Taylor
Vancouver, BC
CANADA

A

You did not say which system you are working on. The Intel Relocatable Object Module Format is in the back of the Microsoft MS-DOS Programmer's Reference Manual. The date on this edition is 1984. Perhaps they have removed it from the later editions.

Q

I hope you can help me. I have a problem compiling a certain C source code (Listing 2) with Microsoft Quick C.

The file is DSAT.C, a dynamic string array test program. It consists of three functions plus a main to coordinate them. The first, wordcount, is flawless. The second, str_to_ptrarray, does not work the way I want it to, and I do not know why.

The str_to_ptrarray function should break the passed string into words (characters separated with a space) and compose a null-terminated string array of the words in the pointer array supplied. The function returns the number of words.

Hence,

int nwords;
char mystr[ ] = "This is a test.",
   **strlist;
nwords = str_to_ptrarray(mystr,
   &strlist);
should return the values with results as in Listing 1.

The function allocates all space necessary for the transformation and I want to keep it that way! I do not want to have to calculate memory usage each time I call the function separately — I want the function to handle it.

Inside the function, it works perfectly with QuickC 2.00 (as you can see with my four lines of debugging), but outside the function (back in main, for example) the work has gone poof. I get the first element along with a null pointer assignment runtime error (R6001). With QuickC 2.51, it does not seem to do anything, and the compiler gives me a Near Pointer Error.

Then free_ptrarray simply frees the memory allocated by the str_to_ptrarray function.

I have looked at this for too long. I need your help. Why does this not work? Am I missing a fundamental rule of C programming? Please please please help! Your time is profoundly appreciated. Thank you.

Anthony Whitford
Sidney, BC
CANADA

A

Ah, triple pointers — more than twice as bad as double pointers. I have to admit, it took me a few minutes to figure out what was wrong with your program. I got slightly different errors the first time I ran it, but that was due to mistyping. For future reference, if you or another reader has a problem, please send the code on a 5.25-inch MS-DOS disk.

Your problem revolves around the precedence of the index operator [ ] and the indirection operator *. The former has higher precedence.

*ptrarray [count]
is evaluated as:

* (ptrarray [count])
or equivalently as:

* (* (ptrarray + count))
This adds the value of count to ptrarray before doing the indirection.

What you want is:

(*ptrarray) [count]
which is equivalent to:

* ((*ptrarray) + count)
Simply replace all instances of *ptrarray[x] with (*ptrarray)[x], and your problems should be solved.
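A minimal sketch of the difference, not taken from Listing 2. The names set_slot and demo_precedence are made up for illustration; only ptrarray and count follow the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores value in slot count of the caller's
   pointer array, which is reached through a pointer to that array. */
void set_slot(char ***ptrarray, size_t count, char *value)
{
    /* Correct: dereference first, then index the caller's array. */
    (*ptrarray)[count] = value;

    /* Wrong for count > 0: *ptrarray[count] means *(ptrarray[count]).
       That treats ptrarray as pointing to an array of char ** objects
       and reads beyond the single one the caller passed. */
}

/* Returns 1 if both slots ended up where the caller can see them. */
int demo_precedence(void)
{
    static char first[] = "This";
    static char second[] = "is";
    char *slots[2] = { NULL, NULL };
    char **list = slots;

    set_slot(&list, 0, first);
    set_slot(&list, 1, second);
    return slots[0] == first && slots[1] == second;
}
```

With count equal to zero the two expressions happen to name the same object, which is consistent with the first element surviving the trip back to main.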

The lines below /* THIS IS A NEW PRINTF */ clarify the difference. Run your program again and see what values are printed out. (If you use the large memory model, use "%lx".)

Two brief comments on the code. First, I dislike dealing with anything more than double pointers. You can eliminate the use of the double pointer (except for the actual assignment to the parameter) by having a local variable called local_ptr_array and declared as:

char **local_ptr_array;
The code would read like:

if  ((local_ptr_array =
   (char **)calloc(words + 1,...)))
. . .
local_ptr_array[index] = . . .
At the end, simply code:

/*  Set the address in the parameter */
*ptrarray = local_ptr_array;
This would have eliminated your problems, because no indirection operator would have been involved.
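For illustration, here is one way the whole function might look with a local variable. This is a sketch under assumptions, not the code from Listing 2: the word-splitting logic, the -1 error return, and the signature of free_ptrarray are all guesses at what the original does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: splits str at spaces and stores a NULL-terminated array
   of newly allocated words in *ptrarray. Returns the number of words,
   or -1 if an allocation fails. */
int str_to_ptrarray(const char *str, char ***ptrarray)
{
    char **local_ptr_array;
    const char *scan;
    int words = 0, index = 0;

    /* A word starts at a non-space that begins the string or
       follows a space. */
    for (scan = str; *scan != '\0'; scan++)
        if (*scan != ' ' && (scan == str || scan[-1] == ' '))
            words++;

    local_ptr_array = (char **)calloc((size_t)words + 1, sizeof(char *));
    if (local_ptr_array == NULL)
        return -1;

    scan = str;
    while (index < words) {
        size_t len;

        while (*scan == ' ')
            scan++;
        len = strcspn(scan, " ");
        local_ptr_array[index] = (char *)calloc(len + 1, sizeof(char));
        if (local_ptr_array[index] == NULL) {
            while (index > 0)
                free(local_ptr_array[--index]);
            free(local_ptr_array);
            return -1;
        }
        memcpy(local_ptr_array[index], scan, len);
        scan += len;
        index++;
    }
    local_ptr_array[words] = NULL;

    /* Set the address in the parameter */
    *ptrarray = local_ptr_array;
    return words;
}

/* Assumed counterpart: frees each word, then the array itself. */
void free_ptrarray(char **list)
{
    char **p;

    if (list == NULL)
        return;
    for (p = list; *p != NULL; p++)
        free(*p);
    free(list);
}
```

The only place the triple pointer is dereferenced is the single assignment at the end, so the precedence question never arises.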

Second, I would replace calls to calloc such as:

(*ptrarray) [index] =
   (char *)calloc((size_t)(strlen(cptr) + 1),
      sizeof (**ptrarray [index] ) );
with:

(*ptrarray) [index] =
   (char *)calloc((size_t)(strlen(cptr) + 1),
      sizeof(char));
Since you cast the return to char *, the size of the elements should be the size of a char. Although your approach is technically correct, it appears slightly less readable.

Q

I need to be able to execute a program (Listing 3) in UNIX, and no matter what the return value for that program is, I need to return zero (success). This is because my menu system beeps and displays unwanted messages when the user hits the interrupt key to abort the program.

My solution is to write a shell program that performs the following:

1) fork a process
2) if parent then
   2.1. ignore interrupt
   2.2. wait for child to finish
3) if child then
   3.1. execute the program passed as an argument
4) return zero
The problem is that if the interrupt key is pressed, then the program returns 2000 instead of zero. If the program ends normally, then the program returns zero (fine). Also, if I instruct the program to return any value other than zero, it does so even if the interrupt key is pressed (as it should).

Tim Riley
Miami Lakes, FL

A

Good question. I tried it on my UNIX machine (a Sequent) and got exactly the same results as you did. It looks OK to me (either that or I'm tired out after considering triple pointers). Any thoughts out there? (KP)
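For readers who want to experiment, the numbered steps in the question might be sketched as below. This is not Listing 3, and it makes no claim to explain the 2000 value. The function name run_and_report_success is hypothetical, and the use of execvp, waitpid, and an exit code of 127 for a failed exec are my own choices for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program named by argv[0] with arguments argv, waits for it,
   and returns 0 whatever happens to the child. Returns -1 only if the
   fork itself fails. */
int run_and_report_success(char *const argv[])
{
    void (*old_handler)(int);
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: execute the program passed as an argument. */
        execvp(argv[0], argv);
        _exit(127);             /* reached only if the exec failed */
    }

    /* Parent: ignore the interrupt key while the child runs. */
    old_handler = signal(SIGINT, SIG_IGN);

    /* Wait for the child, retrying if the wait is interrupted. */
    while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
        ;

    signal(SIGINT, old_handler);

    /* The collected status is deliberately discarded. */
    return 0;
}
```

A main that passes &argv[1] to this function and returns its result would play the role of the wrapper program described in the question.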

Readers' Replies

The Function Prototype

I write regarding Firdaus Irani's Q?/A! question about his problems with the compare function for qsort in Turbo C++ (which you duplicated in Microsoft C). I may have missed something, but my understanding is that qsort is defined as accepting a pointer to a function that returns an int and takes two const void * parameters. Thus, when Mr. Irani defined his function as taking two unsigned char ** parameters, there was a mismatch. I have used the qsort function under Turbo C, Turbo C++, Microsoft C 5.1 and 6.0, and IBM C/400 (on the AS/400), and the implementation has always been as described. Further, the documentation with both Turbo C(++) and Microsoft C states that qsort is defined in ANSI C.

To use qsort as Mr. Irani desires, he should declare his function as:

int comp (const void *, const void *);
and call qsort just as he did. His comp function should be as follows:

int comp (void const *a, void const *b)
   {
   return (strcmp(*((unsigned char **) a),
                   *((unsigned char **) b)));
}
The casts take care of the warnings on the parameters to the strcmp function.
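A self-contained sketch of the same idea for an array of ordinary char * strings. The sort_words wrapper is a hypothetical name, and the sketch keeps the const qualifier in its casts rather than casting it away as the function above does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements. The elements here are
   char *, so each argument is really a pointer to a char *. */
int comp(const void *a, const void *b)
{
    char *const *left = (char *const *)a;
    char *const *right = (char *const *)b;

    return strcmp(*left, *right);
}

/* Hypothetical wrapper: sorts count strings into ascending order. */
void sort_words(char **words, size_t count)
{
    qsort(words, count, sizeof words[0], comp);
}
```

Because comp already has the parameter types that qsort's prototype names, the call compiles cleanly with an unmodified stdlib.h.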

The important thing to note here is that the const void * parameters to the compare function of qsort are so that any type of item can be compared. For instance, in a utility I have written, I have a structure I use for building a typical directory tree, defined as follows:

typedef struct
   {
   char achPathName[MAXPATH];
      //MAXPATH is defined in the C header files
   char achPathPic[MAXPATH];
   unsigned short usPathLevel;
   } DIR_INFO;
This structure is used to hold a directory path name (such as C:\TC\INCLUDE), a path "picture" that corresponds to the path name (such as -- --INCLUDE), and the level of the directory, the root being 0. An array of these structures is declared as:

DIR_INFO stDirInfo[MAX_DIRS];
   //MAX_DIRS is a #define
After filling this array using findfirst and findnext, I want to sort it alphabetically (yet maintain the proper levels). To do this I declare a compare function for qsort:

int DirSort(const void *, const void *);
I then call qsort with:

qsort(stDirInfo, usCurNumDirs,
   sizeof(stDirInfo[0]), DirSort);
   //usCurNumDirs = # of directories found
The DirSort function is shown in Listing 4.

Using the (DIR_INFO *) cast, I can sort the array of structures based on a structure member.
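Listing 4 is not reproduced here, so the following is only a guess at its general shape: a compare function that casts its arguments to DIR_INFO pointers and compares the path names. The MAXPATH value is a stand-in, the demo_dir_sort function is hypothetical, and the real DirSort may well do more to keep the levels in order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXPATH 80   /* stand-in value; normally from the C header files */

typedef struct
{
    char achPathName[MAXPATH];
    char achPathPic[MAXPATH];
    unsigned short usPathLevel;
} DIR_INFO;

/* Compares two array elements by their path names. */
int DirSort(const void *pvFirst, const void *pvSecond)
{
    const DIR_INFO *pFirst = (const DIR_INFO *)pvFirst;
    const DIR_INFO *pSecond = (const DIR_INFO *)pvSecond;

    return strcmp(pFirst->achPathName, pSecond->achPathName);
}

/* Fills three entries out of order, sorts them, and returns 1 if the
   parent directory now precedes its subdirectories. */
int demo_dir_sort(void)
{
    DIR_INFO stDirInfo[3];

    memset(stDirInfo, 0, sizeof stDirInfo);
    strcpy(stDirInfo[0].achPathName, "C:\\TC\\LIB");
    strcpy(stDirInfo[1].achPathName, "C:\\TC");
    strcpy(stDirInfo[2].achPathName, "C:\\TC\\INCLUDE");

    qsort(stDirInfo, 3, sizeof(stDirInfo[0]), DirSort);

    return strcmp(stDirInfo[0].achPathName, "C:\\TC") == 0
        && strcmp(stDirInfo[1].achPathName, "C:\\TC\\INCLUDE") == 0
        && strcmp(stDirInfo[2].achPathName, "C:\\TC\\LIB") == 0;
}
```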

Again, I may have missed something, but I thought the usage was straightforward. Despite PJP's note to the contrary, I certainly don't see how this is a bug in Turbo C++, especially if this is the ANSI definition for the qsort function.

A. Donnie Hale, Jr.
Hilliard, OH

Regarding the TC++ question from Firdaus Irani, I have come across the same question and solved it in Listing 5. As I think about it, I believe TC++ is correct because, as I understand it, C++ will not automatically convert a void pointer; the conversion must be done explicitly. The code in Listing 5 works correctly and does not require changes to stdlib.h.

I greatly appreciate your column; many of the questions hit close to home.

Jay Holovacs
Warren, NJ

This is in regard to a question from Firdaus Irani in the December 1990 issue of The C Users Journal, page 92, about using qsort.

The problem is not in Turbo C++, but in Firdaus Irani's usage of the function. First, any compare function always has the prototype:

int compare(const void *, const void *);
The arguments to the compare function are then cast to the appropriate type within the user-defined compare function.

My example (see Listing 6) has three different sort functions. I read in a list of student names and their corresponding grades. Each of the three data structures is sorted (the array of structures is sorted by its grade entry), and the results are printed.

Input file used:

Martin    3.5
Sheila    4.0
Marcel    2.7
Henry     2.9
Kimberly  3.8
Cindy     1.7
The resulting output file (sorted structure, sorted names, sorted grades):

Sorted structure  Sorted names  Sorted grades
Cindy     1.70    Cindy         1.70
Marcel    2.70    Henry         2.70
Henry     2.90    Kimberly      2.90
Martin    3.50    Marcel        3.50
Kimberly  3.80    Martin        3.80
Sheila    4.00    Sheila        4.00
The user must cast the argument type within the compare function (see the compare function definitions). I do not think that there is a bug in Turbo C++'s qsort.

Martin Schlapfer
Scotts Valley, CA

I have enclosed a question and answer from your December 1990 C Users Journal column. I think both you and P.J. Plauger missed the boat on this one.

F. Irani asks why he gets an error when attempting to use qsort. He changed the prototype for qsort in stdlib.h to get rid of the error message (!) and asks "Does that mean I have to make changes to stdlib every time I try to compile a program using qsort that calls a compare function with different parameter types?". He says he got no help from Borland tech support. I don't think he got any from you or PJP either.

I think you should have answered him like this:

Don't change your stdlib.h. The prototype for qsort indicates what qsort expects from you, and fooling the compiler is not the answer. The answer is to give qsort just what it expects: a compare function that is defined to receive void * pointers. If you are actually comparing unsigned char **, then you must use a cast inside your compare function.

For example your comp should have been:

int comp(const void *a, const void *b)
   {
   return strcmp((char *)*(unsigned char **)a,
                 (char *)*(unsigned char **)b);
}
Note the casts to char *. This is because strcmp is defined to expect char *, even though it is guaranteed to compare the strings as if they are unsigned characters. (See sec. 4.11.4 in the ANSI standard.)

Now, aside from the annoying error messages, why is it wrong to define the compare function to accept something other than void * pointers?

When you call a function, you must pass it what it expects. Likewise, when a library function calls your function, your function must expect what the library function passes. There is nothing in the standard to prevent an implementation from having different representations for different types of pointers (except that void * pointers must have the same representation and alignment requirements as char * pointers, according to sec. 3.1.2.5). If a function call passes void * and the function is defined to expect unsigned char **, anything can happen. This is potentially just as dangerous as passing an int to a function expecting a long. The function picks up what it expects from the stack (or registers), and it had better be right. The whole purpose of the prototype system is to protect against mistakes like these, and arbitrarily changing a prototype to get rid of an error or warning message is like turning off an alarm system because it keeps going off. Better to find the problem, understand it, and fix it.

P.S. It occurs to me that some of the misunderstanding may be due to a perception that, since void * pointers may be arbitrarily assigned to and from other pointer types without casts, they may also be passed to a function expecting a different pointer type, or that a function defined to accept a void * may be passed a different type. This is wrong. The compiler "knows" how to deal with assignments and can do the necessary casting automatically. The compiler can automatically cast a pointer when it is passed, if a prototype is in scope. But in the case of qsort, the calling is being done "blindly" by qsort, which has no knowledge of what it is really dealing with. So it passes void * pointers. There is no way for the compiler to automatically convert these on the "receiving end."

Raymond Gardner
Englewood, CO

You are right that there are some misconceptions with pointers to void. C++ has tightened the usage of void pointers, so that assignment requires a cast, but Standard C permits assignments without casts. For the benefit of our readers, let's examine the uses for void pointers. The major area is in places where char * was employed before. For example, the prototype for memset is:

void *memset(void *address, int byte, size_t length);
We can pass memset something like:

struct s_test
   {
   int member;
   . . .
   };
struct s_test test;
memset(&test, 0, sizeof(test));
You do not have to cast something like this:

memset( (void *) &test, 0, sizeof(test));
because void * is assignment-compatible with any other pointer.

Now let's take the case of qsort. Suppose you want to compare structures of s_test type in a function that would be passed to qsort. Your function must be prototyped as

int compare_test_struct(const void *one, const void *two);
to meet the _fcmp(const void *, const void *) requirement in the qsort prototype.

The function would need to look like:

int compare_test_struct(const void *one, const void *two)
    {
    if  (  ((struct s_test *) one) ->
        member > ((struct s_test *) 
        two) -> member)
        return 1;
. . .
}
or

int compare_test_struct(const void *one, const void *two)
   {
   const struct s_test *one_test = one;
   const struct s_test *two_test = two;
   if ( one_test->member >
      two_test -> member)
      return 1;
. . .
}
Either way tends to obscure the real purpose of the function, which is to compare two structures. The parameters one and two will have to contain valid beginning addresses for structures. Of course, qsort is designed to supply valid ones.

Note that you can call qsort with any function that expects two void pointers and the prototype will not complain. With the above and your example function, you could code:

struct s_test struct_array[10] =
   { /* Some list */... };
qsort (struct_array, 10,
   sizeof(struct s_test), comp);
or

char *array_of_strings[10];
qsort(array_of_strings, 10,
   sizeof(char *), compare_test_struct);
These are both equally valid prototype-wise, but absolutely wrong. You may get a storage access error in the latter case, if the variable alignment is not proper. The prototype for qsort in this case has not protected you from shooting yourself in the foot.

The compare functions as coded above are not useful for general usage, as they do not expect pointers to structures of s_test type. You could do the same trick as you did for strcmp. That is, you could code the functions as:

int compare_test_struct_for_qsort(const void *one, const void *two)
   { 
   return compare_test_struct_normal( 
      (struct s_test *) one,
      (struct s_test *) two);
   }

int compare_test_struct_normal(struct s_test *one, struct s_test *two)
   {
   if ( one->member > two->member)
     return 1;
. . .
}
Then if you wanted to use the comparison function in a normal mode, you would code:

struct s_test test_a, test_b;

compare_test_struct_normal (&test_a, &test_b);
And if you wanted to call qsort, you would use:

qsort(struct_array, 10, sizeof(struct s_test),
   compare_test_struct_for_qsort);
However, the overhead of a double function call will cut down the efficiency of the sort.
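Filled out into something that compiles, the pair of functions might look like the sketch below. Several things here are assumptions rather than anything stated above: the structure is cut down to the one member shown, the less-than and equal cases stand in for the elided code, const has been added to the typed parameters, and sort_test_structs is a hypothetical convenience wrapper.

```c
#include <assert.h>
#include <stdlib.h>

struct s_test
{
    int member;     /* only the member shown in the text */
};

/* Ordinary comparison with properly typed parameters. */
int compare_test_struct_normal(const struct s_test *one,
                               const struct s_test *two)
{
    if (one->member > two->member)
        return 1;
    if (one->member < two->member)
        return -1;
    return 0;
}

/* Thin wrapper with the parameter types that qsort requires. */
int compare_test_struct_for_qsort(const void *one, const void *two)
{
    return compare_test_struct_normal((const struct s_test *)one,
                                      (const struct s_test *)two);
}

/* Hypothetical wrapper: sorts count structures by member. */
void sort_test_structs(struct s_test *array, size_t count)
{
    qsort(array, count, sizeof(struct s_test),
          compare_test_struct_for_qsort);
}
```

The typed function can be called directly with &test_a and &test_b, and the compiler will check those arguments, while the wrapper is the only piece that ever sees a void pointer.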

Since it appears that using a prototype of the form:

int strcmp(const void *, const void *)
where qsort is called will eliminate the prototype error, I would opt for that solution to this tiny incongruity in the prototype system. (KP)

[All of these responses are on the money. They also cast useful light on a notoriously thorny topic. Tom Plum forced X3J11 to consider the problems surrounding the prototype for qsort at some length. We never did resolve the issue to his satisfaction.

I can't determine whether my response was incorrect. My back issues of CUJ are half a world away. I recall responding to a specific diagnostic that Mr. Irani showed. I have caught the C part of Turbo C++ out in this area, and the Gnu C compiler as well. It was easy for me to believe that Turbo C could botch type checking involving void pointer arguments. Whether or not it does, these letters address the basic issue much better than I did. Thanks. (PJP)]

Automatic Buffers

I'm writing about the sticky automatic buffer that was the subject of Doug Oliver's 9/90 CUG and John Brand's 1/91 CUG letters - used in the function repeat_format. This function returns a pointer to local storage that was not guaranteed to be valid. I was hoping someone else would point out the reason why this technique should never be used, but no one did. Hence this letter.

Using this technique will introduce some of the hardest-to-find bugs you'll ever see. The reason automatic storage isn't guaranteed valid is that an interrupt service routine (ISR) occurring at precisely the right moment will trash the buffer.

One of the reasons the bugs will be hard to find is that standard PC interrupts (e.g., the clock) seem to switch to their own stacks after using only 10 bytes or so of the user stack. Other drivers (e.g., an Ethernet driver) may or may not use a local stack. Depending on the PC's hardware configuration, the buffer may be trashed seemingly at random with a probability proportional to the frequency of interrupts and the length of the delay before the user function copies the buffer to "safe" storage. Thus, the program may always work just fine on the author's machine but may develop annoying random quirks after it is distributed.

Some CPUs (e.g., 68030) have a separate stack for interrupts and are not subject to this problem. However, the operating system may reclaim the stack space for use elsewhere, causing sporadic memory protection faults.

In any case, this technique is non-portable. I suggest that it be replaced with a call to malloc or strdup.

As long as I'm writing, I'll throw in my contribution to the external variable declaration dialog (Andreas Lang, Bill Sharar II, David Hanson, Larry Leonard). Listing 7 shows a technique I've used for years.

This technique is similar to Mr. Leonard's solution. Your initialization of:

array[15] INIT( {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15} );
doesn't work on my system (and, I'm assuming, on others; did you compile your example?) because the preprocessor does not honor the braces around the aggregate initializer in the macro invocation, and instead treats the commas as separating actual arguments. Compound data objects can be declared as w is in my example, or by using a variation of the technique you presented as Listing 2 (1/91 CUG).
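Neither Listing 7 nor the earlier Listing 2 is reproduced here, so the following is only one possible workaround, not either author's technique. It gives the brace list a name of its own so that the macro call contains a single argument with no commas. The names DEFINE_GLOBALS, EXTERN, and ARRAY_VALUES are hypothetical; only INIT and array come from the letter.

```c
#include <assert.h>

/* In a real project everything below this define would sit in a shared
   header, and exactly one source file would define DEFINE_GLOBALS
   before including it. */
#define DEFINE_GLOBALS

#ifdef DEFINE_GLOBALS
#define EXTERN
#define INIT(x) = x
#else
#define EXTERN extern
#define INIT(x)
#endif

/* Naming the brace list keeps its commas out of the macro call. The
   argument is collected as one token and expanded afterwards. */
#define ARRAY_VALUES {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}

EXTERN int array[15] INIT(ARRAY_VALUES);
```

In the file that defines DEFINE_GLOBALS the last line becomes a definition with its initializer; everywhere else it becomes a plain extern declaration.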

Don Drantz
Eden Prairie, MN

I agree with you on the use of automatic storage. I prefer for the user to pass the address of a buffer that would be filled in with the appropriate information. Using malloc inside a function without a warning that memory might not be available is not a wise idea. Of course, one could return the address of a static buffer (even some library routines do it), but that makes multi-user libraries infeasible.
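A sketch of the caller-supplied buffer approach. The function repeat_into is hypothetical and is not the repeat_format from the earlier letters; it simply builds a string from repeated copies of a piece, which is enough to show the interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes count copies of piece into the caller's buffer of size bytes.
   Returns buffer on success, or NULL if the result and its terminating
   null would not fit. */
char *repeat_into(char *buffer, size_t size, const char *piece,
                  size_t count)
{
    size_t piece_len = strlen(piece);
    size_t i;

    /* Check the fit by division so the product cannot overflow. */
    if (size == 0 || (piece_len != 0 && count > (size - 1) / piece_len))
        return NULL;

    for (i = 0; i < count; i++)
        memcpy(buffer + i * piece_len, piece, piece_len);
    buffer[count * piece_len] = '\0';
    return buffer;
}
```

The caller decides where the storage lives and how long it lasts, so nothing depends on a stack frame that has already been released, and no hidden allocation can fail.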

I have to admit I didn't compile my example. It was so ugly that I didn't have the heart to send it to the compiler. (KP)