May 1990/Dr. C's Pointers

Dr. C's Pointers®

Pointers To Arrays

Rex Jaeschke

Rex Jaeschke is an independent computer consultant, author and seminar leader. He participates in both ANSI and ISO C Standards meetings and is the editor of The Journal of C Language Translation, a quarterly publication aimed at implementers of C language translation tools. Readers are encouraged to submit column topics and suggestions to Rex at 2051 Swans Neck Way, Reston, VA, 22091 or via UUCP at uunet!aussie!rex.
The address of an array is the same as the address of the array's first element, right? Well, yes, I guess so, but a pointer to an array is definitely not the same as a pointer to the array's first element.
Pointers to arrays have long been supported by C; it's just that they are not often used, at least not directly. Pointers to arrays do, however, come into play when using multi-dimensional arrays, but usually so covertly that their role generally goes unnoticed. In my experience most seasoned C programmers either have no idea what pointers to arrays are or deny ever using them. Nonetheless, pointers to arrays are part of the language and they can be useful.

Getting Started
A pointer to an array is tricky to declare because you need grouping parentheses in the declaration, something rarely seen and then usually only in pointers to functions. (Of course, redundant grouping parentheses can exist in any declaration, but in this case they are not redundant.)
In Listing 1 p2 is a pointer to an array of three chars while p3 is a pointer to an array of six chars. Without the grouping parentheses both declarations would produce an array of pointers, something quite different. As you should expect, the type of the expression *p2 is array of 3 char, hence the result 3. Similarly for *p3 and 6.
Given an array A, the address of that array is the same memory location as the address of A[0]. That is, a pointer to array A points to the same location as does a pointer to A[0]. How are these pointers different? When you perform arithmetic operations on a pointer, the integer offset is scaled by the size of the underlying object. Therefore, the expressions
p1 + 1
p2 + 1
p3 + 1
are quite different. p1 + 1 points to the char one beyond p1. p2 + 1 points to the array of three char, one object (that is, three chars) beyond p2. And of course p3 + 1 points to the array of six char, six chars beyond p3. Similarly, the following expressions all have different types:
*p1     /* char      */
*p2     /* char [3]  */
*p3     /* char [6]  */
And since subscripting can be written in terms of indirection and integer arithmetic, the expressions p1[i], p2[i], and p3[i] are also quite different.

Naturally Occurring Array Pointers
I stated above that pointers to arrays come into play when multidimensional arrays are used. In the example of Listing 2, a two-dimensional array is passed to a function f.
Now we know that arrays in C are always passed by address but what is the type of the expression a in f(a)? When pressed for an answer most people guess that f(a) is equivalent to f(&a[0][0]) so the type must be long *. That is, in fact, wrong. To understand the error, we need to look at how multi-dimensional arrays are referenced.
The expression a[i][j] can be written as (a[i])[j] since the [] operator associates left to right. Specifically, a can only be subscripted to one level. The result of that subscript, however, can also be subscripted to one level. We do not directly subscript a to two levels as it may appear on the surface. To be able to legitimately use the second subscript, the type of a[i] must be a pointer type since only pointer type expressions can be subscripted.
Let's consider the type long *. Subscripting this to one level gives the type long, which cannot be subscripted further. The type long ** can indeed be subscripted to two levels, but what about the scaling factor for each row? The first subscript of a long ** is scaled by the size of a pointer, not by the size of a row of five longs.
The answer is, the type of a in f(a) is long (*)[5]. That is, the expression passed in by value to f is a pointer to an array of five long ints. In fact, it is a pointer to the first vector of five longs (the first row) in the multi-dimensional array. (As we know, multi-dimensional arrays in C are stored in row-major order.)
Returning then to the notion that an expression designating an array is converted to the address of its first element, this does hold true in this case. The problem is, however, that a is not a two-dimensional array; it's a one-dimensional array whose elements are vectors. (I admit that this sounds like hair splitting but I find it a useful concept when dealing with such expressions.) f(a) then, is actually equivalent to f(&a[0]). Since a[0] has type array of 5 longs, &a[0] has type pointer to array of 5 longs.
In Listing 2, pointers to arrays are never overtly declared although one is implied by the definition of function f. The argument list could have been written in either of the following two forms:
void f(long a[][5]) {...}
void f(long (*a)[5]) {...}
When used in the context of formal parameters, these declarations are equivalent.

Subscripting Array Pointers
In the expression pd = d + 1 (Listing 3) the subexpression d is converted to the address of the first element of the array, that is, to &d[0]. And since d is a two-dimensional array, the type is a pointer to an array of two double. As a result, pd points to the second row of the array d. By subscripting pd to two levels, we can access d as though its row numbers were -1, 0, and 1 as shown.

Array Pointers And Casts
Just as you can cast a pointer to int to a pointer to char, you can cast a pointer to one size array to a pointer to another size array. For example:

main ()
{
        char *pc1 = "abcdefghi";                        /* OK */
/*4*/   char (*pc2a)[3] = pc1;                          /* error */
        char (*pc2b)[3] = (char (*)[3])pc1;             /* OK */
/*6*/   char (*pc3a)[6] = "abcdefghi";                  /* error */
        char (*pc3b)[6] = (char (*)[6])"abcdefghi";     /* OK */
}
In strict ANSI mode, lines four and six should be diagnosed since the two pointer types are not assignment compatible. Note the parentheses around the * in the casts. These are necessary, for without them the cast type would be an array of pointers and you cannot cast anything into an array.

Returning An Array Pointer
Due to the symmetry of C's typing mechanism, you can return pretty much any type of object that you can declare. (The only exceptions are arrays and functions.) As such, it is possible to return a pointer to an array. (See Listing 4.)
The expression (*f())[0][1] looks rather strange, but on close inspection is quite sensible. Function f returns a pointer to an array of three pointers to char. Therefore, *f() is the array of three pointers. (*f())[i] references the i-th element of the array of three pointers to char and (*f())[i][j] references the j-th char offset from the i-th pointer.
Prior to ANSI C, statements of the form return &ap; (where ap is an array) were typically accepted by compilers, but the ampersand was regarded as superfluous. In fact, quite a few compilers produced a message saying so. However, ANSI C says that &array is not the same as &array[0]. Specifically, an lvalue expression that designates an array is converted to a pointer to the first element in all cases except when preceded by the unary & operator, when used as the operand of sizeof, and when the expression is a string literal being used in the initializer of an array.

A Real Need For Array Pointers
The following example shows a situation where you must use pointers to arrays. It involves the dynamic allocation of a multidimensional array.

#include <stdlib.h>

main()
{
        int (*p)[3][4];

        p = malloc(2 * 3 * 4 * sizeof(int));
        p[0][0][0] = 1;
        p[1][2][3] = 1;
}
An array of type int [2][3][4] is allocated and the resultant pointer type is a pointer to an array 3 of array 4 of ints. To use the space for an array of type int [4][3][2], the pointer must be declared as int (*p)[3][2].