Constness isn't always as pervasive as you might like. Sometimes you have to work to hold onto it.
Copyright © 1999 by Dan Saks
One common use for the const qualifier is to constrain the effects of functions. Specifically, any function that has a parameter of pointer or reference type can use that parameter to alter as well as inspect the actual argument passed as that parameter. Declaring the parameter with type "pointer to const" or "reference to const" prevents the function from using that parameter to alter the actual argument.
The keyword const in a declaration such as:
    T const *p;

is, in a sense, a promise by the program that it won't use any values obtained directly or indirectly from p to alter the value of any T objects. That is, the program promises that it won't store into *p, nor copy p to any other pointer q and then store into *q.
The const qualifier usually follows through on its promise. For example, the declaration of the standard strcmp function:
    int strcmp(char const *s1, char const *s2);

carries with it a promise that strcmp won't alter any of the characters pointed to by either s1 or s2. The compiler backs up this promise by checking every use of s1 and s2 in the body of strcmp to make sure they comply.
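The kind of checking described above can be seen in a small sketch. The function count_char and the helper demo_count_char are hypothetical names made up for this illustration; they are not part of the cross-reference program. The commented-out lines are the ones a conforming compiler should reject.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the article): counts how often c occurs in s.
// Because s is a "pointer to const char," the body may read *s but may not
// store into it.
std::size_t count_char(char const *s, char c)
{
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == c)
            ++n;
        // *s = ' ';    // error if uncommented: *s is a const char
    }
    // char *q = s;     // error if uncommented: conversion would discard const
    return n;
}

// Small self-check for the sketch above.
bool demo_count_char()
{
    char const text[] = "banana";
    assert(count_char(text, 'z') == 0);
    return count_char(text, 'a') == 3;
}
```

Note that incrementing s itself is allowed: the pointer is not const, only the characters it points to.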
Unfortunately, the const qualifier sometimes falls short of expectations. For example, the cross-reference program that I've been working on has a class called cross_reference_table with a member function called put_tree declared as:
    void put_tree(tree_node const *t);

put_tree displays the contents of a binary tree representing a cross-reference table. The const qualifier in the function declaration seems to suggest that put_tree doesn't modify the tree. However, it doesn't really promise that much.
put_tree's declaration promises only that put_tree won't modify the root node of the tree, leaving put_tree the right to alter all other nodes in the tree. I wrote put_tree so that it never exercises that right, but it retains that right nonetheless. Had I inadvertently written put_tree so that it altered tree nodes other than the root node, the compiler would have been unable to detect the error.
I described this problem several months ago. (See "C++ Theory and Practice: const in Parameter Lists," CUJ, September 1998.) At the time, I didn't have any constructive suggestions to make other than to suggest by example that you should program around this carefully. (That, of course, is the cure to all programming problems.)
Now I think I have a concrete solution to this problem. I have not tested this approach very thoroughly, but I thought I'd share it with you anyway because I think it's pretty neat. My hope is that you will provide feedback that will either improve the technique or show why it isn't very viable.
const is Shallow
Before I show you the solution, let's make sure you understand the problem. The problem originates with the meaning of const-qualified class objects.
Suppose T is a class type defined as:
    struct T
    {
        int m;
        int n;
    };

(C++ considers structs and unions to be classes.) An object of type "const T" behaves as if every one of its members is itself const. Thus, when you declare an object v as:

    T const v;

the compiler treats each member of v as a const object. In other words, v acts very much as if you had defined another type:

    struct const_T
    {
        int const m;
        int const n;
    };

and then defined v as:

    const_T v;

Thus, v.m and v.n both have type "const int," and the compiler will reject any attempt to store into v.m or v.n, as in:

    v.m = 0;    // error
    ++v.n;      // error

The compiler will even reject any attempt to refer to v.m or v.n as if they were non-const int, such as:

    int *p = &v.m;  // error

So far, this is all well and good. The problem sets in when the class has data members that are pointers, as in the case of tree_node in my cross-reference example.
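Before turning to pointer members, the member-wise constness of a const T can be checked at compile time. The following is only a sketch: it assumes a compiler that supports C++11 (for decltype, static_assert, and <type_traits>), none of which existed when this column was written, and the initial values and the function demo_sum are made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>

struct T
{
    int m;
    int n;
};

// Illustrative values; a const object of this kind needs an initializer.
T const v = { 1, 2 };

// The expression v.m is an lvalue of type "const int," so decltype((v.m))
// is "reference to const int."
static_assert(std::is_same<decltype((v.m)), int const &>::value,
              "a member of a const T is treated as const");

int demo_sum()
{
    int const *pm = &v.m;   // accepted: pointer to const int
    // int *p = &v.m;       // error if uncommented: would discard const
    // v.m = 0;             // error if uncommented: v.m is const
    assert(*pm == 1);
    return *pm + v.n;
}
```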
The relevant code from the cross-reference program appears in Listings 1 and 2. The cross_reference_table class and inline member function definitions appear in table.h (Listing 1), and the corresponding non-inline member definitions appear in table.cpp (Listing 2). (See "C++ Theory and Practice: Trimming Excess Fat," CUJ, March 1999 for background on these listings.)
Let's look again at member put_tree of the cross_reference_table class, declared in Listing 1 as:
    void put_tree(tree_node const *t);

Parameter t points to the root node of a binary tree representing a cross-reference table, which put_tree writes to standard output. put_tree shouldn't alter the tree in the process of writing it out. Therefore I added a const qualifier so that t has type "pointer to const tree_node." I was hoping to enlist the compiler's support in assuring that put_tree doesn't alter the tree. Unfortunately, the compiler can't be much help here because tree_node has pointer members, which poke holes in the armor.
tree_node is a member of class cross_reference_table defined in Listing 2 as:
    struct cross_reference_table::tree_node
    {
        char *word;
        list_node *first, *last;
        tree_node *left, *right;
    };

I think the following discussion will be a little clearer if we ignore the class name for the time being (writing cross_reference_table::tree_node as just tree_node), and rewrite tree_node with each member defined on a separate line:

    struct tree_node
    {
        char *word;
        list_node *first;
        list_node *last;
        tree_node *left;
        tree_node *right;
    };

put_tree's parameter t has type "pointer to const tree_node" so that, inside put_tree, *t has type "const tree_node." Once again, an object of a const class type behaves as if every one of its members is in turn const. In other words, t behaves much as if you had declared another type:

    struct const_tree_node
    {
        char *const word;
        list_node *const first;
        list_node *const last;
        tree_node *const left;
        tree_node *const right;
    };

and then declared t as:

    const_tree_node *t

In an object of type "const tree_node," the declaration for member word is:

    char *const word;

so that word has type "const pointer to char." Although the pointer is const, it points to a char that is non-const (or to the first element in an array of chars that are non-const). Each of the other pointer members has a similar interpretation. In short, each pointer member of a const class object is a "const pointer," not a "pointer to const."
Inside the body of put_tree, you can't change the value of t->word. For example,
    t->word = NULL; // error

is an error because t->word is const. This much is good. However, you can change the value of the object that t->word points to. For example, the compiler will accept expressions such as:

    *(t->word) = '\0';

or:

    strcpy(t->word, "gotcha!");

This isn't the behavior I was hoping for. I would prefer that the compiler flag these as errors.
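To make the hole concrete, here is a minimal sketch that compiles and really does overwrite the word through a "pointer to const tree_node." The tree_node below is a pared-down stand-in that keeps only the word member, and the functions clobber and demo_shallow_const are hypothetical names for this illustration; neither appears in the listings.

```cpp
#include <cassert>
#include <cstring>

// Pared-down stand-in for the article's tree_node: only word is kept.
struct tree_node
{
    char *word;
};

// Hypothetical function with the same parameter type as put_tree.
void clobber(tree_node const *t)
{
    // t->word = 0;                     // error if uncommented: t->word is a const pointer
    std::strcpy(t->word, "gotcha!");    // accepted: the chars it points to are not const
}

// Returns true if clobber managed to change the text.
bool demo_shallow_const()
{
    char text[] = "original";           // 9 bytes, enough room for "gotcha!"
    tree_node node = { text };
    assert(std::strcmp(node.word, "original") == 0);
    clobber(&node);
    return std::strcmp(text, "gotcha!") == 0;
}
```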
The other pointer members of a const tree_node are equally capable of wreaking havoc. For example, although put_tree cannot change the value of t->first, it can use t->first to modify any list_node that it points to.
To avoid such mishaps, I wrote put_tree so that it copies t->first into local variable p declared as:
    list_node const *p;

and uses p to visit the list_nodes. Since p has type "pointer to const list_node," put_tree cannot use p to modify any list_nodes. C++ left me on my own to get this right. Had I declared p without the const qualifier, the code would still compile, leaving a greater potential for mishaps.
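The precaution just described might look something like the following sketch. It does not reproduce put_tree from Listing 2. The types are pared-down stand-ins, the list_node member named number is an assumption made for the illustration, and sum_numbers and demo_sum are hypothetical functions.

```cpp
#include <cassert>

// Pared-down stand-ins for the article's types. The member named number is
// an assumption; only next is mentioned in the text.
struct list_node
{
    unsigned number;
    list_node *next;
};

struct tree_node
{
    list_node *first;
};

// Hypothetical read-only walk over the list hanging off a const tree_node.
unsigned sum_numbers(tree_node const *t)
{
    unsigned total = 0;
    // Copying t->first into a "pointer to const list_node" is what keeps the
    // loop from modifying the list. Dropping const from p would still compile.
    for (list_node const *p = t->first; p != 0; p = p->next) {
        total += p->number;
        // p->number = 0;   // error if uncommented: *p is a const list_node
    }
    return total;
}

unsigned demo_sum()
{
    list_node second = { 20, 0 };
    list_node first = { 10, &second };
    tree_node node = { &first };
    assert(node.first->next == &second);
    return sum_numbers(&node);
}
```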
Here then is the crux of the problem. When I declared put_tree's parameter t as "pointer to const tree_node," I wanted to assert that put_tree wouldn't change any part of the tree whose root is t. Unfortunately, put_tree's declaration asserts only that it won't change the members of *t. put_tree can still change the characters in the array that t->word points to, as well as the nodes in the list that t->first and t->last point to. It can also modify anything in the sub-trees that t->left and t->right point to. Not much is const after all.
In effect, const is shallow. When you apply the const qualifier to a pointer type, the pointer becomes const, but what it points to does not. This saddens me deeply. In this, and many other similar situations, I want const to be deep. I want put_tree's parameter t to behave much as if it pointed to an object of type const_tree_node, defined as:
    struct const_tree_node
    {
        char const *const word;
        list_node const *const first;
        list_node const *const last;
        tree_node const *const left;
        tree_node const *const right;
    };

A tree composed of these nodes is const from top to bottom.
By the way, const is shallow with respect to reference types as well. For example, given:
    typedef T &reference;

then declaring:

    reference const r = v;

is equivalent to declaring:

    T &r = v;

Where did the const go? A reference acts like a "const pointer" (not a "pointer to const") in that, once you bind a reference to an object, you cannot change that binding. Since a reference is already const in a sense, you cannot write declarations such as:

    T &const r = v; // error

The syntax of C++ does not allow const to the right of an & operator in a declarator.
A declaration such as
    reference const r = v;

is not a syntax error. The const is simply redundant, so the compiler ignores it.
In short, applying const to a reference type yields a "const reference," but since a reference is already const, it yields just a reference. It does not yield a "reference to const." Again, const is shallow.
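The claim about references can be checked with a short sketch, assuming a C++11 compiler for static_assert and std::is_same. The function demo_reference is a hypothetical name for this illustration, and v here is a non-const T so that the binding is legal. Some compilers may warn that the const qualifier has no effect, which is exactly the point.

```cpp
#include <cassert>
#include <type_traits>

struct T
{
    int m;
    int n;
};

typedef T &reference;

// const applied to a reference type through a typedef is ignored ...
static_assert(std::is_same<reference const, T &>::value,
              "reference const is just T &");
// ... so it does not produce a reference to const.
static_assert(!std::is_same<reference const, T const &>::value,
              "reference const is not T const &");

int demo_reference()
{
    T v = { 1, 2 };
    reference const r = v;  // same as: T &r = v;
    assert(&r == &v);
    r.m = 42;               // accepted: r refers to a non-const T
    return v.m;
}
```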
Enforcing Deep Constness
Until recently, I thought I had no choice but to accept that const is shallow and program as carefully as I could around it. Then I realized I could enforce the semantics I want by using member functions.
For example, tree_node has a data member:
    char *word;

As a member of a non-const tree_node, word acts just as it's declared, and that's just fine by me. However, in a const tree_node, word acts as if it were declared as:

    char *const word;

but I want it to act as if it were:

    char const *const word;

I can approximate this behavior for data member word by using a pair of overloaded member functions named word instead. One is a const member function, and the other is a non-const member function. Everywhere that I wrote t->word, I can just write t->word() instead. For example, put_tree in Listing 2 contains the statement:

    printf("%12s:", t->word);

which becomes:

    printf("%12s:", t->word());

The details are as follows. I can rewrite:

    struct tree_node
    {
        char *word;
        ...
    };

as:

    struct tree_node
    {
    public:
        char const *const &word() const;
        char *&word();
        ...
    private:
        char *_word;
    };

Since a class can't have both a data member and a member function with the same name, I renamed the data member as _word (with a leading underscore [1]). And, as a reminder, tree_node is really a member of class cross_reference_table, but I'm leaving that detail out of the code to keep things more concise.
The definition for the const member function is:
    char const *const &
    tree_node::word() const
    {
        return _word;
    }

Since this is a const member function, its implied this parameter has type "const pointer to const tree_node" and so the pointer member (now called _word) has type "const pointer to char." Again, I want the pointer to behave as if it has type "const pointer to const char." That is exactly the way the return value of this function behaves.
In general, an expression of type "reference to T" behaves just like an object of type T. The function above returns a reference to the type I want. Therefore, a call to that function acts like an object of the type I want.
For instance, put_tree's parameter t has type "pointer to const tree_node." In this context, the call expression t->word() invokes the const member function word, which returns a "reference to const pointer to const char." Therefore, the expression t->word() acts just like an object of type "const pointer to const char." Bingo.
The definition for the non-const member function is simply:

    char *&
    tree_node::word()
    {
        return _word;
    }

Since this is a non-const member function, its implied this parameter has type "const pointer to tree_node" and so _word has type "pointer to char." The function returns a "reference to pointer to char," and so an expression that calls this function acts just like the pointer itself.
For example, add_tree's parameter t has type "pointer to tree_node." In this context, t->word() invokes the non-const member function, which returns a "reference to pointer to char." Once again, t->word() acts just like an object of the desired type, which in this case is "pointer to char."
Why did I bother writing a non-const version of this member function? Why not just make _word public? The problem is that, for functions such as put_tree that deal with const tree_node objects, the data member _word acts like a shallow const pointer. I want const tree_nodes to be deeply const. Therefore, I don't want put_tree to accidentally use _word instead of calling the const member function word, and therefore I declared _word private.
For functions such as add_tree that deal with non-const tree_node objects, _word has the appropriate type. There's no real harm in letting add_tree access _word directly, except that _word is private for the reason just given. Therefore, I provided the non-const member function word so that functions such as add_tree can access the pointer. This non-const member presents no danger to functions such as put_tree, because you cannot apply a non-const member to a const object.
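The pair of overloads can be exercised with a small self-contained sketch. It is a variation on the declarations above, not a copy of Listing 3: the const overload here returns the pointer by value, as "pointer to const char," rather than by reference. That is a deliberate deviation, made on the assumption that some compilers bind a "reference to const pointer to const char" to a temporary copy of a "pointer to char" rather than to _word itself, which would leave the returned reference dangling. The constructor and the functions word_length, capitalize, and demo_deep_const are hypothetical additions for the illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

struct tree_node
{
public:
    explicit tree_node(char *w) : _word(w) {}   // hypothetical constructor
    char const *word() const { return _word; }  // const node: read-only view
    char *&word() { return _word; }             // non-const node: full access
private:
    char *_word;
};

// Deals with a const tree_node, as put_tree does.
std::size_t word_length(tree_node const *t)
{
    // *(t->word()) = '\0';             // error if uncommented: the char is const
    // std::strcpy(t->word(), "x");     // error if uncommented: no conversion to char *
    return std::strlen(t->word());
}

// Deals with a non-const tree_node, as add_tree does.
void capitalize(tree_node *t)
{
    char *w = t->word();    // calls the non-const overload
    *w = static_cast<char>(std::toupper(static_cast<unsigned char>(*w)));
}

bool demo_deep_const()
{
    char text[] = "alpha";
    tree_node node(text);
    capitalize(&node);
    assert(word_length(&node) == 5);
    return std::strcmp(node.word(), "Alpha") == 0;
}
```

With the data member private, a function that receives a "pointer to const tree_node" has no route to the characters except through the const overload, so the two commented-out lines are rejected at compile time.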
Connecting All the Dots
The tree_node struct in Listing 2 has five pointers altogether: word, first, last, left, and right. Since I want const tree_node objects to be deeply const, I really must rewrite all of these pointers as pairs of overloaded functions. Then I must rewrite every reference to the data member word as the function call word(). first becomes first(), last becomes last(), and so on.
first and last point to list_node objects, which in turn, have pointer members named next. I think const list_node objects should also be deeply const. Therefore I must rewrite list_node::next as a pair of overloaded functions, and change every reference to next into a call to next().
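Here is a rough sketch of how constness then carries from a tree_node down into its list. It is not Listing 3. As a deliberate simplification, the const overloads return by value rather than by reference. The constructors, the list_node member named number, and the functions count_refs and demo_count are assumptions made for the illustration.

```cpp
#include <cassert>

struct list_node
{
public:
    list_node(unsigned n, list_node *nx) : number(n), _next(nx) {}
    list_node const *next() const { return _next; } // const node: const successor
    list_node *&next() { return _next; }            // non-const node: full access
    unsigned number;                                // assumed member, for illustration
private:
    list_node *_next;
};

struct tree_node
{
public:
    explicit tree_node(list_node *f) : _first(f) {}
    list_node const *first() const { return _first; }
    list_node *&first() { return _first; }
private:
    list_node *_first;
};

// Counts the list_nodes reachable from a const tree_node.
unsigned count_refs(tree_node const *t)
{
    unsigned count = 0;
    // list_node *q = t->first();   // error if uncommented: would discard const
    for (list_node const *p = t->first(); p != 0; p = p->next())
        ++count;
    return count;
}

unsigned demo_count()
{
    list_node second(20, 0);
    list_node first(10, &second);
    tree_node node(&first);
    assert(node.first()->number == 10);
    return count_refs(&node);
}
```

Unlike the earlier hand-written precaution of copying t->first into a "pointer to const list_node," the compiler now insists on it: each step through first() and next() from a const node yields another pointer to const.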
The result of all this rewriting appears in a new version of table.cpp in Listing 3. This coding style works in that it correctly enforces the deep constness of const tree_node objects, but boy oh boy is it tedious and obtrusive. It was an interesting first attempt, but I can't imagine doing this on any scale.
Well, C++ has a pretty good cure for tedium: templates. Next time around, I'll show you how you can dramatically simplify the definition of deeply const objects by using a template for deeply const pointers.
Notes
[1] Elsewhere in this issue, Bobby Schmidt recommends against use of a single leading underscore with variable names. That's because in C and C++, names that begin with an underscore and have external linkage are reserved for use by the translator. However, class data members have no linkage, so the use of a single leading underscore for class data member names is valid. mb
Dan Saks is the president of Saks & Associates, which offers training and consulting in C++ and C. He is active in C++ standards, having served nearly seven years as secretary of the ANSI and ISO C++ standards committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906 USA, by phone at +1-937-324-3601, or electronically at dsaks@wittenberg.edu.