Columns


Doctor C's Pointers®

Puzzles, Part 2

Rex Jaeschke


Rex Jaeschke is an independent computer consultant, author and seminar leader. He participates in both ANSI and ISO C Standards meetings and is the editor of The Journal of C Language Translation, a quarterly publication aimed at implementors of C language translation tools. Readers are encouraged to submit column topics and suggestions to Rex at 2051 Swans Neck Way, Reston, VA 22091 or via UUCP at uunet!aussie!rex or aussie!rex@uunet.uu.net.

This month I'll continue the Puzzles series, this time concentrating on arrays and subscripting. As before, I have included the answers to the problems but encourage you to try solving them before looking at the answers.

The Puzzles

1. Discuss the declaration double d[1][1][1].

2. Given the declaration int i[2][3][4], convince yourself that all the statements in Listing 1 are equivalent. They all initialize the last int in the 3D array to 1.

3. What are the type and value of the expression "abcd"[2]?

4. Explain what is happening in the expression f()[2] = 'x'. Under what circumstances might this not work? Can you think of a use for this construct?

5. When an array is passed to a function, the address of its first element is actually passed. For the array short s[5][3], what is the type of the value actually passed in f(s)?

6. Given double d[2][4], what is the type of the expression d[1]?

7. What is the order of evaluation in the expression a [i++] [i] [-i]? Assume that i initially has the value 5.

8. How do you declare an array of 16 1-bit bit-fields?

9. Is an array name a pointer?

10. It is widely believed that an array name is always converted to the address of the first element. Are there any exceptions?

The Solutions

1. In an array declaration, the size of each dimension must be a compile-time constant integer expression having a value greater than zero. Unlike most languages, C permits a dimension of size 1. You could argue that an array of 1 element isn't an array at all. However, looking from the other perspective, a scalar is simply an array of 1 element.

In any event, d is a 3-dimensional array of 1 element in each dimension, so the total number of elements is 1x1x1 which is, of course, 1. That is, space is really allocated for a scalar double object. However, since d was declared as an array, you must use all three subscripts to access it. That is, d[0][0][0] refers to the object.
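
If you want to convince yourself of this, a minimal sketch along the following lines may help. The function names are my own inventions for illustration; the only thing taken from the puzzle is the declaration of d.

```c
/* A 1x1x1 array: one double's worth of storage, but three subscripts
   are still needed to reach it. */
static double d[1][1][1];

/* Stores a value through all three subscripts and reads it back. */
double store_and_fetch(double value)
{
    d[0][0][0] = value;
    return d[0][0][0];
}

/* Returns 1 if the whole array occupies exactly the space of one double. */
int same_size_as_scalar(void)
{
    return sizeof(d) == sizeof(double);
}
```

Calling same_size_as_scalar from a small test program should return 1 on any conforming implementation, since an array has no padding beyond that of its elements.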

But why would you ever want to use an array of dimension 1? Consider Listing 2, taken from the setjmp.h header for TopSpeed C V1.04.

The type jmp_buf must "be an array type suitable for holding information needed to restore a calling environment." Since the elements saved do not have the same type, an array cannot be used directly in this case. Instead, an array of 1 structure is used.

Granted, few programmers will likely need to do something like this, but seeing how implementors use the language can give you ideas on applying it yourself.
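
The array-of-one-structure idea can be sketched as follows. This is not the TopSpeed header; the structure members and all the names (env_record, env_buf, save_env, demo_flags) are hypothetical and chosen only to show the shape of the technique. One side effect worth noticing is that, because env_buf is an array type, passing it by name hands the function a pointer to the caller's object, which is how setjmp(env) can fill in env without an explicit &.

```c
/* Hypothetical stand-in for a saved calling environment. The members are
   invented for illustration and do not come from any real setjmp.h. */
struct env_record {
    void *stack_ptr;
    long  flags;
    int   saved_regs[4];
};

/* An array of one structure: members of differing types, yet an array type. */
typedef struct env_record env_buf[1];

/* Receives a pointer to the caller's record, although no & appears at the
   call site, because env_buf is an array type. */
void save_env(env_buf env)
{
    env[0].flags = 42L;
    env->saved_regs[0] = 7;   /* env->x is the same as env[0].x */
}

/* Declares a buffer, passes it by name, and reports what was stored. */
long demo_flags(void)
{
    env_buf env;
    env[0].flags = 0L;
    save_env(env);            /* no address-of operator needed */
    return env[0].flags;
}
```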

2. Well, this is one of those cases where you just have to work it out a step at a time. The keys to solving this are:

a) the subscript operator [] is commutative. That is, a[i] is equivalent to, and interchangeable with, i[a]. Both K&R and ANSI C require that one of the operands be a pointer expression and the other an integer expression. There is no requirement that they appear in a particular order.

b) any expression of the form a[i] can be rewritten as *(a + i) and vice versa. This is the fundamental conversion identity in pointer/array expressions in C.

c) addition is commutative such that a + i is equivalent to i + a.

d) the precedence table shows that [] associates left-to-right.
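
Applying these four rules mechanically gives spellings such as those in the sketch below. These are my own examples and are not necessarily the statements that appear in Listing 1; the function set_last and its form parameter are hypothetical scaffolding so that each spelling can be checked in turn.

```c
static int i[2][3][4];

/* Clears the last element, assigns 1 to it using the spelling selected by
   form, then returns the value read back through ordinary subscripts. */
int set_last(int form)
{
    i[1][2][3] = 0;
    switch (form) {
    case 0: i[1][2][3] = 1;              break;  /* the obvious form     */
    case 1: *(i[1][2] + 3) = 1;          break;  /* rule b, applied once */
    case 2: *(*(i[1] + 2) + 3) = 1;      break;  /* rule b, twice        */
    case 3: *(*(*(i + 1) + 2) + 3) = 1;  break;  /* rule b, three times  */
    case 4: 3[i[1][2]] = 1;              break;  /* rule a, applied once */
    case 5: 3[2[1[i]]] = 1;              break;  /* rule a, three times  */
    case 6: *(3 + *(2 + *(1 + i))) = 1;  break;  /* rules b and c        */
    default:                             break;  /* leaves the 0 alone   */
    }
    return i[1][2][3];
}
```

Every form from 0 through 6 should leave 1 in the last element; any other value of form leaves it at 0.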

3. To write predictable code, you must know the type of each expression you write. When the compiler comes across a string literal, it takes on the job of storing that string as an unnamed static char array. The array is initialized with the characters abcd and an extra trailing null character. That is, the type of the expression "abcd" is array of 5 char.

The expression "abcd" designates an array just as the name of an array does. When such an expression is used as an operand of [], it is converted to the address of the first element. Since expressions of the form a[i] can be rewritten as *(a + i), "abcd"[2] can be rewritten as *("abcd" + 2). This results in an expression of type char with value 'c'. (Interestingly, according to the rules stated in Solution 2, "abcd"[2] can also be written as 2["abcd"].)
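
The three spellings, and the size of the underlying array, can be checked with a short sketch like this one. The function names are mine and serve only to give each expression a home.

```c
/* Third character of the literal, spelled three equivalent ways. */
char third_by_subscript(void) { return "abcd"[2]; }
char third_by_pointer(void)   { return *("abcd" + 2); }
char third_commuted(void)     { return 2["abcd"]; }

/* The literal's array type counts the trailing null character. */
int literal_size(void) { return (int)sizeof("abcd"); }
```

Each of the first three functions should return 'c', and literal_size should return 5.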

4. Because a[i] is equivalent to *(a + i), you can subscript any data pointer expression to one level. So, for f()[2] to be acceptable, f() must have type pointer to object type T. Then, f()[2] is equivalent to *(f() + 2) and has type T. In the example given, type T could be any arithmetic type, although the use of a character constant might imply that T is char.

Consider Listing 3. This construct would fail at runtime if the pointer returned pointed to a const object that really was write-protected. A similar case is if the returned pointer pointed into a string literal that was stored in a read-only location (as permitted by ANSI C).

If the address returned was that of an automatic object local to function f, the result would be undefined since that object is not guaranteed to exist after f returns.
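
A case in which the construct does work might look like the following sketch. It assumes f returns a pointer into a writable static array, which avoids both of the failure modes just described. The names buffer and poke are hypothetical.

```c
#include <string.h>

/* Static storage: it outlives each call to f, and it is writable because
   it is an array initialized from the literal, not the literal itself. */
static char buffer[] = "abcd";

/* Hypothetical f: returns a pointer to the first char of buffer. */
char *f(void)
{
    return buffer;
}

/* Resets buffer, performs f()[2] = 'x', and returns the resulting string. */
const char *poke(void)
{
    strcpy(buffer, "abcd");
    f()[2] = 'x';             /* same as *(f() + 2) = 'x' */
    return buffer;
}
```

After a call to poke, the buffer should hold the string abxd.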

5. If you guessed that &s[0][0] (which has type short *) is passed, you are not alone, since most people guess just that. Unfortunately, that is incorrect. In C, a multidimensional array is considered to be an array of arrays, which is shown by the use of separate [] punctuators and operators in multidimensional array declarations and expressions.

Essentially, every array in C is one-dimensional. It just so happens that the elements they contain can be vectors. In any event, the first element in any array a is a[0], regardless of the number of dimensions that array has. As such, what is passed to f is &s[0]. The type of s is array of 5 elements, each of which is an array of 3 short int, so the type of s[0] is array of 3 short int. The type of the expression &s[0], therefore, is pointer to an array of 3 short int, which is quite different from a pointer to short int. (I discussed pointers to arrays in my CUJ column in May 1990.) The function f could be defined in either of the following ways (they are equivalent), and within f, s can be subscripted to two levels.

void f(short s[][3])  { /* ... */ }
void f(short (*s)[3]) { /* ... */ }
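
To see the pointer-to-array type at work, consider the following sketch. The function sum, its rows parameter, and the two demo functions are my own additions for illustration.

```c
/* Receives a pointer to an array of 3 short. The row count is supplied
   by the caller because it is not part of the parameter's type. */
long sum(short (*s)[3], int rows)
{
    long total = 0;
    int r, c;

    for (r = 0; r < rows; r++)
        for (c = 0; c < 3; c++)
            total += s[r][c];     /* two levels of subscripting */
    return total;
}

/* Passes the array by name and then as an explicit &s[0]; both calls
   hand sum the same pointer value with the same type. */
long demo_sum(void)
{
    short s[5][3] = { {1, 2, 3}, {4, 5, 6} };   /* remaining rows are zero */
    short (*first_row)[3] = &s[0];

    return sum(s, 5) + sum(first_row, 5);
}

/* Returns 1 if stepping the row pointer by one advances by a whole row
   of 3 shorts rather than by a single short. */
int row_stride_ok(void)
{
    short s[5][3];

    return (char *)(s + 1) - (char *)s == 3 * (int)sizeof(short);
}
```

Each call to sum adds up to 21, so demo_sum should return 42.
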
6. Based on the discussion in Solution 5, d[1] designates the second element in the array d. It is the second row of 4 doubles. The type of d[1], therefore, is array of 4 double. Many people would answer that its type was pointer to double instead. However, that is not altogether correct. Expressions that designate arrays are not always converted to pointers (see Puzzle 10). One example where the conversion does not take place would be sizeof (d[1]).
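
Both behaviors of d[1] can be observed with a sketch like this one. The function names are hypothetical.

```c
static double d[2][4];

/* As the operand of sizeof, d[1] keeps its array type: the result is the
   size of 4 doubles, not the size of a pointer. */
int row_size_is_four_doubles(void)
{
    return sizeof(d[1]) == 4 * sizeof(double);
}

/* In most other contexts, d[1] is converted to a pointer to its first
   element, so it can initialize a double *. */
int row_converts_to_first_element(void)
{
    double *p = d[1];

    return p == &d[1][0];
}
```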

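7. The [] operator provides no guarantee about the order of evaluation of its operands, and there are no sequence points in this expression, so the order of evaluation is unspecified. Worse, i is modified by i++ and is also read by the other subscripts with no intervening sequence point, which ANSI C leaves undefined. The expression cannot be relied on to produce any particular result.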

8. This is a trick question. You can't declare an array of 16 one-bit bit-fields. There are cases in which an array of bit-fields would be useful, but array referencing requires pointers and, therefore, addresses. Very few machines provide bit addressing, so a pointer to a bit-field would have to be larger in representation than other pointer types. On machines without native bit addressing, such emulation would probably be expensive.

[You might try

struct {
    unsigned int bit : 1;
} a[16];

if wasted space is not an issue. Ed.]
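
If wasted space is an issue, a different and commonly used technique is to pack the 16 flags into one unsigned integer and reach them with shifts and masks. This is not an array of bit-fields, only a sketch of a substitute for one; the type name flags16 and the three helper functions are hypothetical.

```c
/* Sixteen one-bit flags packed into one unsigned value. An unsigned int
   is guaranteed to be at least 16 bits wide. */
typedef unsigned int flags16;

/* Each helper takes a bit number n in the range 0 through 15. */
flags16 set_bit(flags16 f, int n)   { return f | (1u << n); }
flags16 clear_bit(flags16 f, int n) { return f & ~(1u << n); }
int     test_bit(flags16 f, int n)  { return (int)((f >> n) & 1u); }
```

The bit number plays the role of the subscript, so test_bit(f, 5) stands in for what you might have hoped to write as f[5].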

9. In many cases, an array name behaves like a const pointer, but it really is not a pointer. It is a non-modifiable lvalue and one that often is converted to the address of an object.

10. There are three exceptions to this rule. Consider the following:

int a[10];

sizeof(a)
&a
In the first expression, sizeof determines the size of the whole array, not the size of a pointer to the first element. In the second expression, a pointer to the whole array is produced, not a pointer to the first element. Many older compilers warn about (or even reject) constructs like &a, suggesting that the & is superfluous. Under ANSI C rules, it is not.
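
The difference between &a and a shows up as soon as you do arithmetic on them, as this sketch suggests. The function names are my own.

```c
#include <stddef.h>

static int a[10];

/* sizeof sees the whole array, not a pointer. */
int sizeof_covers_array(void)
{
    return sizeof(a) == 10 * sizeof(int);
}

/* &a has type int (*)[10], so adding 1 steps over all ten elements.
   Returns the step in bytes. */
ptrdiff_t step_of_array_pointer(void)
{
    int (*pa)[10] = &a;

    return (char *)(pa + 1) - (char *)pa;
}

/* a alone converts to int *, so adding 1 steps over a single element.
   Returns the step in bytes. */
ptrdiff_t step_of_element_pointer(void)
{
    int *pe = a;

    return (char *)(pe + 1) - (char *)pe;
}
```

The two pointers hold the same address but have different types, which is why the steps differ by a factor of 10.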

The third and final case has to do with string literals. For example:

char *pc = "abcd";
char c[] = "abcd";
In the first declaration, the compiler recognizes that a scalar variable is being initialized. Therefore, it stores the string as a null-terminated array of char elsewhere, and initializes pc to the address of the start of that location. That is, the expression "abcd" represents an unnamed array and is converted to the address of the first element.

On the other hand, c is an array, so the compiler recognizes that "abcd" is simply shorthand for {'a', 'b', 'c', 'd', '\0'}. It initializes the array with those characters. Here the expression "abcd" is not treated as an array, so no conversion to pointer is done. The two initializers are textually identical but are interpreted differently.
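
The two interpretations lead to objects of different sizes and different properties, as the following sketch is meant to show. The declarations of pc and c come from the text above; the function names are hypothetical.

```c
static char *pc  = "abcd";   /* pointer to an unnamed array of 5 char */
static char  c[] = "abcd";   /* array of 5 char, initialized in place */

/* c is a 5-element array, including the trailing null character. */
int array_size(void) { return (int)sizeof(c); }

/* pc is only a pointer, whatever the length of the string it points to. */
int pointer_has_pointer_size(void) { return sizeof(pc) == sizeof(char *); }

/* c is our own writable copy of the characters. The literal behind pc
   may live in read-only storage, so it is only read here. */
char modify_copy(void)
{
    c[2] = 'x';
    return c[2];
}

char read_through_pointer(void) { return pc[2]; }
```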