October 1990/Doctor C's Pointers®

Doctor C's Pointers®

Puzzles, Part 1

Rex Jaeschke

Rex Jaeschke is an independent computer consultant, author and seminar leader. He participates in both ANSI and ISO C Standards meetings and is the editor of The Journal of C Language Translation, a quarterly publication aimed at implementors of C language translation tools. Readers are encouraged to submit column topics and suggestions to Rex at 2051 Swans Neck Way, Reston, VA 22091 or via UUCP at uunet!aussie!rex or aussie!rex@uunet.uu.net.
If you have a solid background in one of the various high-level languages, you can come to grips with much of C relatively quickly. However, there are quite a few "dark corners" of the language, preprocessor, and library. In this issue, I'll begin a series of articles that examines a few of these darker corners, beginning with some that involve functions. I've included the answers to the problems presented, but to check how you are doing, try to answer them on your own first. (Even if you are a seasoned assembly language programmer, C has enough idiosyncrasies of its own that you must master.)
Another reason for my presenting these puzzles is that they provide lots of interesting but unrelated bits of information. As such, they don't necessarily warrant a column of their own.

The Puzzles
1. If there is no function prototype in scope when a function is called, what does the compiler do? Why is it a bad idea to omit function declarations when those functions are called?
2. What is the order of evaluation in the expression a() - b() * c()?
3. Try compiling the following program. Explain why it runs without error.

#include <stdio.h>

main()
{
/*1*/   printf("Hello\n");
/*2*/   (printf)("Hello\n");
/*3*/   (*printf)("Hello\n");
/*4*/   (**printf)("Hello\n");
/*5*/   (***printf)("Hello\n");
}

4. Given the declaration int i, j = 2;, describe everything you can about the expression i = (*v[++j]) (j).
5. When function f is called, what is the type of each argument actually passed to it? How must the function f be defined for the linkage to work reliably?

void f(char, short, ...);

char c;
short s;
float fl;    /* not named f, which would hide the function f */

f(c, s, fl);

6. Given int f(void);, what are the values of the following expressions?

sizeof(f)
sizeof((f))
sizeof(f())
sizeof(&f)

The Solutions
1. When the compiler comes across a call to a function for which there is no function prototype in scope, it assumes that function has a return type of int and generates code accordingly. If the function actually has some other return type the behavior is undefined. The program might work, for example, where a long was returned and types long and int have the same representation. However, if an eight-byte double were returned and an int has four bytes, the return value would be misinterpreted.
Since the compiler sees no prototype, it has no information about the number and types of the arguments expected. As such, it can neither diagnose invalid argument lists nor implicitly convert arguments to the expected types. Furthermore, if the function actually expects a variable number of arguments, ANSI C states that calling it without a prototype (containing the ellipsis notation) in scope results in undefined behavior.
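When no prototype is in scope, any arguments having "narrow" type (char, short, or float) are converted to their "wide" equivalents (int, int, or double, respectively). Unsigned narrow types are widened as well: under ANSI C's value-preserving rules, unsigned char and unsigned short become int if int can represent all of their values, and unsigned int otherwise. (Many pre-ANSI compilers always widened them to unsigned int instead.)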
A few compilers and static analysis checkers (e.g., lint) issue an informational message when an undeclared function is called. This is a valuable capability, and it usually points out the omission of the corresponding #include directive, a problem that would otherwise go undetected and potentially require hours of debugging.
2. There is considerable misunderstanding about operator precedence and order of evaluation. The vast majority of C programmers I have come across think the two aspects are related if not the same. They are NOT!
Regarding precedence, by inserting the implied grouping parentheses, the expression a() - b() * c() becomes a() - (b() * c()). That is, the multiplication operator has higher precedence than the subtraction. However, this simply tells the compiler how to group subexpressions when it is building its parse tree. Specifically, it provides no information as to the order in which the three functions are called. That is up to the order of evaluation, a property specific to each operator.
In short, only five operators (&&, ||, ?:, (), and comma) give any guarantee about the order of evaluation of their operands, and multiplication and subtraction are not in that set. For all other operators, the order of evaluation of their operands is unspecified. That is, the order need not be documented and it need not be reproducible, even in different places in the same source file during the same compilation.
The bottom line is that the order in which the three functions are called is unspecified. In practice, unless the functions somehow interact with each other via global data or pointers to local data, the order in which they are called won't affect you, and that is by far the most common case.
Adding grouping parentheses to an expression can never change the evaluation order of subexpressions.
3. All of the calls to printf are equivalent. Here's why. The function call operator requires a postfix expression to precede it. This expression must denote the function to be called. In the first case, this expression is simply printf, the function's name. In case 2 we have the same situation since a parenthesized expression has the same type and value as the unparenthesized expression.
An expression that designates a function is converted to a pointer to that function except when it is the operand of sizeof or the unary & operator. In cases 3-5, this expression is the operand of the indirection operator and, therefore, it is converted to a function pointer. That pointer is then dereferenced, giving a function designator expression again. In cases 4 and 5, this expression is once more converted to a pointer, which is again dereferenced, and so on. In all cases we eventually finish up with an expression that designates the function printf; as the operand of the function call operator it is converted to a pointer one last time, and the function it points to is called.
4. The expression (...)(j) is a call to some function with one argument. And although an int expression is passed in, without seeing the prototype for that function we don't know if the int will be passed as is or converted to something like a long, for example. Similarly, we don't know the function's return type but for the example to compile, the return type must be assignment-compatible with int since that's the type of the object it's being assigned to. (This requires the return type to be one of the arithmetic types.)
The expression *v[++j] designates the function to be called. Therefore, v[++j] is a pointer to that function, and v either points to an element of an array of pointers to functions or is the name of such an array. Since the prefix ++ operator is applied to j, the function called is the one pointed to by v[3]. However, the value of the argument passed to that function is undefined. (It would be either 2 or 3.) The order in which a function's arguments are evaluated is unspecified, as is whether the function-designating expression is evaluated before or after the argument list (or even in between arguments). Strictly speaking, because j is modified and separately read with no intervening sequence point, ANSI C leaves the behavior of the whole expression undefined.
On the other hand, the expression (*v[j]) (++j) always results in 3 being passed to the function. However, which function is actually called is undefined: it could be the one pointed to by either v[2] or v[3].
5. With the introduction of function prototypes, ANSI C provides a way for compilers to deal with narrow types directly, without widening. In this example, the prototype void f(char, short, ...); tells the compiler that the first two arguments have narrow types, and the compiler is permitted not to widen them as was traditionally required. That is, it is up to the implementor whether they are widened or not; ANSI C does not require that they be kept narrow. A compiler may choose to widen both, widen neither, or widen one but not the other. As long as you define the function with exactly the same parameter types using the new definition style (including the ellipsis), you are guaranteed the caller and callee will agree. Whatever the compiler's strategy, it must be the same for compiling the prototype and the corresponding function definition. As such, the strategy is transparent to the programmer. (Of course, you must find out what that strategy is if you are writing the called function in some language other than C.)
In the case of the third and subsequent arguments, ANSI C requires they be widened (without exception) as defined by K&R.
See what your compiler's strategy is in this case. Is it documented or did you have to work it out for yourself?
6. The sizeof operator computes the size of an object of a given type. Since it requires an operand (either an expression or type name) having an object type, it cannot handle function or incomplete types. As such, sizeof(f) is invalid just as sizeof(int ()) is, assuming the declaration int f();. Similarly, sizeof ((f)) is also invalid since a parenthesized expression has the same type and value as the unparenthesized expression.
In Solution 3, we learned there are instances in which function designating expressions are not converted to function pointers. One such situation is when they are the operand of the sizeof operator. Therefore, sizeof(f) does not produce the size of a pointer to such a function, as some substandard compilers have reported in the past.
In sizeof(f()), the size of the function's return type is determined. This is possible provided the return type is not incomplete (as is the case with void). The function could, however, return a pointer to an incomplete type.
In the final case, sizeof(&f), the type of the operand is pointer to function returning some object or incomplete type, and that is permitted. However, when I tested this, several compilers, including some claiming ANSI C conformance, incorrectly rejected this expression, apparently because they treat the & as being redundant. As far as I can tell, it always is except in this one case.
Next month we'll continue with this series by looking at arrays and subscripting.