September 1992/Q & A

Q & A

Pointers to Functions and Double Pointers

Ken Pugh

Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C language courses for corporations. He is the author of C Language for Programmers and All On C, and was a member of the ANSI C committee. He also does custom C programming for communications, graphics, image databases, and hypertext. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 489-5239. Ken also receives email at kpugh@dukemvs.ac.duke.edu (Internet).
For those on the West Coast, I'll be speaking at C++ in Action, September 21-25, at the Santa Clara Marriott. P.J. Plauger will be a keynote speaker.
Q
I have run into a problem using an array of pointers to functions. The code in Listing 1 generates the error message "size of structure array not known" when compiled with TC2.0.
How can I access the two other functions in array[] without using the notation ptr=array[i]; ? Why is it necessary to assign ptr=array[0]; rather than ptr=array;?
Per Borgvad
Norway
A
I'll take a simpler example which will show what the problem is. Let's use char pointers. Consider the code
char *char_pointer_array[] = {"ABC", "DEF", "GEH"};
char *char_pointer;
char_pointer_array is an array of char pointers. The name itself becomes, in most expressions, a pointer to char pointers (i.e. a double pointer). An element of char_pointer_array is a char pointer. You can assign
char_pointer = char_pointer_array[0];
The value of *char_pointer is A, a char. If you increment it
char_pointer++;
then the value of *char_pointer would be B, the next char in the string. You cannot assign
char_pointer = char_pointer_array;
since that would be attempting to assign a double pointer to a single pointer. If you had a variable as
char **pointer_to_char_pointer;
and you assign
pointer_to_char_pointer = char_pointer_array;
then its value is the address of char_pointer_array[0]. The value of *pointer_to_char_pointer is the address of the string ABC. The value of **pointer_to_char_pointer is A, a char. If you increment it as
pointer_to_char_pointer++;
then it contains the address of char_pointer_array[1]. The value of *pointer_to_char_pointer is the address of the string DEF. The value of **pointer_to_char_pointer is D, a char.
In your question, array is an array of pointers to functions and ptr is a pointer to a function. You can make the assignment
ptr = array[0];
because they are both pointers to functions. Of course, you could also make the assignment
ptr = dummy1;
The one operation you cannot perform with a pointer to function is to increment it. The reason is that the thing pointed to (a function) has no fixed size, so the increment operation is meaningless.
If you have a variable as
void (**pointer_to_function_pointer) ();
you can assign the array name to it, since it becomes a pointer to function pointers, as in
pointer_to_function_pointer = array;
pointer_to_function_pointer contains the address of array[0]. *pointer_to_function_pointer is the address of dummy1. You can increment this variable, since the objects it points to (pointers to functions) have a fixed size (typically two or four bytes, depending on the compiler/processor). If you execute
pointer_to_function_pointer++;
pointer_to_function_pointer now contains the address of array[1]. *pointer_to_function_pointer is the address of dummy2.
To call the actual function (dummy2), you should use two indirection operators
(**pointer_to_function_pointer) ();
You can leave off one of the * operators in ANSI C, but I prefer to use them to keep things consistent.
As an example, if you wanted to call each function in turn, you could code it as in Listing 2.
Older C compilers do not allow you to initialize automatic aggregates (arrays and structures). However, static aggregates can be initialized. I suggest using static, even with ANSI compilers, since the initialization time is less. If you really need an array initialized every time you enter a function, simply memcpy an initialized static array into an automatic array.
Q
The following code (Listing 3) was extracted from a program I wrote some time ago. Everything worked fine until I recompiled the program with MS C 6.0A (originally it was compiled with MS C 5.1). Suddenly the sscanf function didn't behave as I expected it to.
All that sscanf has to do is to read two characters and three integers from an input string. The first format-specifier %*[^PR] is used to throw away any garbage that might precede the data. This works well if there is any garbage to throw away (as in string[0]). If not, the sscanf works with MS C 5.1 but it doesn't work with MS C 6.0 (the scan stops, before any data is read).
I tried different compilers and got the results shown in Table 1. Now the question: "Who is right?"
G. Prodasla
Germany
A
The standard states:
"[ matches a nonempty sequence of characters from a set of expected characters (the scanset). The corresponding argument shall be a pointer to the initial character of an array large enough to accept the sequence and a terminating null character, which will be added automatically...."
The operative word is nonempty. The string[1] input had no leading characters that were in the scanset. Therefore no conversion was performed. FAIL was the correct result. MSC 6.0A and Zortech C++ 3.0 conform to the ANSI standard here. Borland TC++ 1.0 and MSC 5.1 do not.
Let me explain your format statement for the benefit of our readers. You used * in the format specifier, which is the assignment suppression flag. Therefore you did not need to supply a pointer to hold the characters. This assignment suppression flag can be used with any of the format specifiers.
With the left bracket ([) format specifier, the left and right brackets surround a list of characters (the scanset) which are to be scanned in. Characters are read and placed in the array which was passed until a character not in the scanset is read. That character is pushed back onto the input stream.
The circumflex (^), as the first character following the left bracket, reverses the logic of the scanset. The scanset contains all characters that do not appear in the format up to the matching right bracket. So [^PR] states that you want to match any characters that are not P or R.
Q
Thank you for your answer to my question from Oct. 26, 1991 in CUJ April 1992, page 105.
I did not explain my question exactly enough. Often I only need myfunction1 and myfunction2 isolated and sometimes together. So I need myfunction3 for ease of use, that is, if I need myfunction1 and myfunction2 together. In reality, I have about 10 functions, which I use sometimes isolated and sometimes together.
With your answer I now can only use myfunction1 and myfunction2 together, but not isolated (or did I see it wrong?). Thank you for your answer.
Willi Fleischer
Moerfelden, Germany
A
For the benefit of our readers, let me summarize the previous question and answer. The writer wanted to be able to call functions that had variable parameter lists (myfunction1), from within a function that had a variable parameter list (myfunction3) (see Listing 4). The calling function could get a va_list argument from its parameter list, but could not pass it to the called function.
You will need to come up with some wrapper functions — so that each function will have both a direct way to get to it and one that expects a va_list type argument. This is only a little bit of overhead. For example, you would have two functions for myfunction1:
void myfunction1_va_list(char *format, va_list arg_ptr);
void myfunction1_direct(char *format, ... );
The latter function would look like
void myfunction1_direct(char *format, ...)
   {
   va_list arg_ptr;
   va_start(arg_ptr, format);
   myfunction1_va_list(format, arg_ptr);
   va_end(arg_ptr);
   }
The function you use the most could be called myfunction1 and the other could use the suffix.
Q
I want to exceed the DOS 640K boundary. How do I do that?
George Britwell
San Francisco, CA
A
This is a simple question with a long answer. There are a number of questions to answer. First, with what do you want to exceed the boundary: code or data? Second, with what machines do you need to be compatible: 8086, 286, 386?
If you have a lot of code, there are several ways to exceed the boundary and stay compatible with all machines. You can use overlays. Most compiler/linker vendors provide some sort of overlay mechanism. It takes a little bit of work to be sure that you link your modules into the proper overlays. If you have structured your code well, this should be minor work.
You can use dynamic overlays (such as provided by RTLINK and Borland's VROOMM). You do not need to specify the overlay structure as with the prior method. The linker/dynamic loader will ensure that the modules are loaded when needed. For many programs, this will work about as well as the prior method. It will work much better than the prior one if you did a poor job of laying out your overlays.
Overlays work for code. The latest version of RTLINK overlays also works for static and external data, but not for allocated memory. With large data items, you might do best to keep them on a disk file and use a function to access the particular one of interest. This can be hidden from the program by clever function calls (see my book All on C for an example).
If your program is to be run on machines for which you specify the hardware, then you can exceed the boundary by using either expanded or extended memory. Expanded memory is more compatible with the early processors. Extra memory is mapped into the 1MB address space at particular segment addresses. By calling an EMS manager, you can switch which block of expanded memory is at the address at any particular time.
Expanded memory is simple to use for data items that do not need to be accessed simultaneously. It is very quick and efficient. No special version of MS-DOS is required, just some calls to interrupts. Michael Young's MS-DOS Advanced Programming (Software Engineering, Box 5068, Mill Valley, CA 94942) has information on how to access it.
Extended memory is designed to be used in 80286 and 80386 protected mode as a type of virtual memory. The segment portion of an address is treated as a virtual segment. A page table is set up that translates this virtual segment into a real segment, which is then appended to the offset. This table also includes a few bits per entry that specify the protection mode (read only, read-write).
There are several DOS extenders that can be purchased to utilize the protected mode. You compile a program using the large memory model of your current compiler. The object code can then be linked into a protected-mode executable and run using the DOS extender.
There are a lot of details involved here, including how to access real memory (e.g. the screen) from your protected-mode program. Suffice it to say that it is possible. A good review of DOS extenders is in Tech Specialist, February 1991.
Without using an extender, memory outside of the 640K boundary can be set up as a RAM disk. Microsoft supplies a RAM disk program with DOS 5.0. The disk could hold large blocks of data which don't fit into regular memory. They will be accessible much faster than if they were stored on a normal disk.
Also for data, you can purchase several packages which replace the allocation functions. The malloc family of functions uses real memory with most compilers. These packages permit allocation of memory from extended memory, expanded memory, or even disk. Where the actual data is stored is transparent to your program. I've not used one of these, but several are advertised on the pages of this magazine.
Or you could compile your program in 386 mode and get rid of the entire problem (assuming you have enough real memory). However, if you have to have compatibility with earlier processors, then this is not an option.

Big File Notes
If you send out large data files to your users, of which only a portion represents data that has been updated, you might be interested in PATCH from the Pocketsoft people. The program can analyze the differences between your old file and your new file and create a delta file. This delta file might fit on a single disk. The user has a corresponding program (for which you get royalty-free distribution rights) for using the delta to update his/her data file. (KP)

Non-ANSI =-
I would like to respond to the question from Dr.-Ing. Dieter Scharpf in the May issue, where he asks whether an ANSI-conforming compiler has the liberty to treat =- as an operator. You answered that =- is not a valid operator in ANSI C, and I agree completely. That form of compound assignment operator is considered an anachronism, used by some older versions of pre-ANSI C.
Dr.-Ing. Scharpf then implies that the C compiler on his HP 9000 Series 700 incorrectly compiled the statement x =- 1.0; as an instruction to subtract 1.0 from x. That would not be correct for an ANSI-conforming compiler, and the HP C compiler on that platform is certainly not intended to behave that way. In fact, I know of no HP C compiler that does recognize =-, except to produce a diagnostic warning that it will be treated as two separate operators. I have tried to duplicate the described misbehavior, but the compiler stubbornly generates the correct code, which is to assign -1.0 to x.
If Dr.-Ing. Scharpf can duplicate the incorrect behavior he describes, we would very much appreciate a defect report and a sample program from him. By the way, I have one comment regarding the answer that you gave. Technically, the expression x=-1 consists of four tokens, not three. The minus sign in that expression is a unary arithmetic operator, and not part of the integer constant.
Walter Murray
Mountain View, California
Contrary to Dr. Dieter Scharpf's experience, I cannot obtain the archaic non-ANSI interpretation of =- with HP9000/700, HP-UX version 8.05. I tried this with the bundled non-ANSI compiler and with the optional ANSI cc -Aa and non-ANSI compilers. All produced the modern interpretation, with no warning. Generally I have found HP C to be reasonably standard in its behavior, unlike their other languages.
Tim Prince
San Diego, CA
Thanks for your responses. I've seen warnings on other compilers, but have never run into one that compiled it wrong. (KP).