Making a deeply const pointer leads Dan deep into the subtleties of overload resolution and temporary references.
Copyright © 1999 by Dan Saks
This is the third and final article in my series on the shallowness of the const qualifier and how you can use templates to give const some depth.
Recapping Our Story
When I say that "const is shallow," I'm referring to the way const behaves when applied to any object of a class type with members of pointer or reference types. As before, I'll use code from my cross-reference generator program, xr, to illustrate this behavior. (See "C++ Theory and Practice: Partitioning with Classes," CUJ, February 1999 and "C++ Theory and Practice: Trimming Excess Fat," CUJ, March 1999 for background on xr.)
xr represents a cross-reference table as an object of class cross_reference_table, which in turn, implements the table as a binary tree. Each node in the tree is an object of the nested type tree_node. Each cross_reference_table object keeps the address of its tree's root node in a private data member named root, as sketched here:
    class cross_reference_table {
    public:
        ...
    private:
        struct tree_node;
        ...
        tree_node *root;
    };
tree_node's definition appears outside the cross_reference_table class as:
    struct cross_reference_table::tree_node {
        char *word;
        list_node *first;
        list_node *last;
        tree_node *left;
        tree_node *right;
    };
Class cross_reference_table also has a static member function named put_tree, which writes the contents of a tree to standard output. put_tree's declaration is:
    void put_tree(tree_node const *t);
Parameter t points to the root node of a tree that put_tree shouldn't alter. That's why const is where it is.
put_tree should treat every node in the tree, not just the root node, as if it were const. If you're careful, you can write put_tree so that it treats every node as if it were const. Unfortunately, the compiler can't alert you if put_tree ever treats any part of the tree as non-const.
The problem is that an object declared as:
    tree_node const *t
behaves as if its members were declared as:
    char *const word;
    list_node *const first;
    list_node *const last;
    tree_node *const left;
    tree_node *const right;
The effect is that, although every pointer member is const, they point to objects that are not const. Thus, t points to a tree in which only the root node is const. const is "shallow." (See "C++ Theory and Practice: Thinking Deeply," CUJ, April 1999 for a more thorough explanation of this behavior.)
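The following is a minimal sketch of that shallowness, assuming a simplified node type. The names node, sneak_write, and example_usage are hypothetical and not part of xr.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for xr's tree_node.
struct node {
    int value;
    node *left;
};

// t is a pointer to const node, so the node it points to is read-only here...
inline void sneak_write(node const *t) {
    // t->value = 1;       // error: value is const when accessed through t
    // t->left = 0;        // error: left behaves as if declared node *const
    t->left->value = 99;   // ...but this compiles: left points to a non-const node
}

inline void example_usage() {
    node child = { 1, 0 };
    node root = { 0, &child };
    sneak_write(&root);
    assert(child.value == 99);   // the "const" tree was modified one level down
}
```

The compiler accepts the write through t->left without complaint, which is exactly the hole that a deep const would close.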
In this situation and many others like it, const should be "deep." put_tree's parameter t should behave as if it points to a struct whose members were declared as:
    char const *const word;
    list_node const *const first;
    list_node const *const last;
    tree_node const *const left;
    tree_node const *const right;
If the members were declared this way, then the members of each node would be const, and each pointer member would point to other objects that are also const. Any tree composed of these nodes would be const from top to bottom. const would be "deep."
A "Deep Pointer" Template
In my last installment, I introduced a template for a class called deep_pointer that provides the desired "deep" behavior for const pointers. (See "C++ Theory and Practice: Thinking Deeper," CUJ, May 1999.) Now I'm going to add a number of refinements to that template.
Ideally, a deep_pointer<T> should behave exactly like a genuine "pointer to T", except that a deep_pointer<T> preserves deep constness. In other words, an object declared as:
    deep_pointer<T> p;
should behave just like an object declared as:
    T *p;
However, an object declared as:
    deep_pointer<T> const p;
should behave exactly like an object declared as:
    T const *const p;
That's the hope, anyway. We'll see how close we can get.
Last time around, I worked through several versions of the deep_pointer class template. I left off with the version that appears as the header deep.h in Listing 1.
deep_pointer<T> has just a single private data member of type T * called actual_pointer. It also has a half dozen public member functions:
- deep_pointer<T>() is the default constructor. It does nothing, leaving the deep_pointer with an unspecified value.
- deep_pointer<T>(T *p) is a converting constructor. It initializes the deep_pointer so that it has the same value as p.
- operator T *&() is a conversion operator that converts a "deep pointer to T" into a "reference to a (genuine) pointer to T".
- operator T const *const &() const is a conversion operator that converts a "const deep pointer to T" into a "reference to a const pointer to const T".
- T *operator->() provides the -> operator for objects of type "deep pointer to T".
- T const *operator->() const provides the -> operator for objects of type "const deep pointer to T".
To be honest, the deep_pointer template in Listing 1 is not quite the same as the one I presented previously. In the previous article, I inadvertently declared the conversion operators as:
    operator T *();
    operator T const *() const;
Each of these operators yields a pointer. Each of the conversion operators in Listing 1 yields a reference to a pointer.
Where did the &s come from? They're left over from earlier versions of the deep_pointer template. They're also part of an experiment to see if conversion operators alone can make deep pointers act like built-in pointers. Writing a pair of conversion operators is easier than defining the complete set of operators, such as += and -=, needed to make deep pointers act like built-in pointers.
As you'll see, the experiment is a failure. Not only are the &s unnecessary, but they actually get in the way. However, I don't want to dismiss them until I've given them their due.
Operator Ambiguities
In general, you should be very cautious about writing classes with both converting constructors and conversion operators that convert types in opposite directions. Opposing conversions can lead to ambiguities when the compiler tries to resolve calls to overloaded operators.
For example, the deep_pointer class template (in Listing 1) has both a converting constructor:
    deep_pointer(T *p);
which converts a built-in pointer to a deep pointer, and a conversion operator:
    operator T *&();
which converts a deep pointer to (a reference to) a built-in pointer. These opposing conversions cause ambiguities as follows.
Listing 2 contains the latest version of table.h, which defines the cross_reference_table class and its inline member functions. This version of the class declares its private member root as a deep_pointer<tree_node> rather than as a tree_node *. Listing 3 contains the corresponding version of table.cpp, which defines cross_reference_table's non-inline member functions.
The cross_reference_table member function definitions contain expressions that some compilers regard as ambiguous. One such expression is:
    root = NULL;
appearing in the body of cross_reference_table's default constructor (in Listing 2).
When a C++ compiler encounters this expression, it must determine whether that assignment operator is a built-in one or a user-defined one. In this case, the user-defined assignment operator is the copy assignment that the compiler generates:
    deep_pointer<tree_node> &operator=(deep_pointer<tree_node> const &)
The compiler can interpret the assignment in either of two ways:
1. It can apply the converting constructor to the right operand, NULL, to obtain a deep pointer, and then use the generated copy assignment to copy the null deep pointer to root. The resulting code behaves as if you had written it as:
    root.operator=(deep_pointer<tree_node>(NULL));
2. It can apply the conversion operator to the left operand, root, to obtain a reference to a built-in pointer, and then use the built-in assignment operator to store NULL through the reference and into the built-in pointer. The resulting code behaves as if you had written it as:
    root.operator tree_node *&() = NULL;
When faced with these choices, the compiler can't decide. Rather than just pick one, it complains that the expression is ambiguous and rejects the code.
The Special Case of Assignment
What I just told you about the choices a compiler must make is almost true, but not quite. It would have been true had I used other binary operators, such as ==, !=, <, or >. For example, if deep_pointer<T> had declared a non-member friend operator such as:
    friend bool operator!=(deep_pointer, deep_pointer);
(a non-template function, so that overload resolution can apply the converting constructor to either operand; template argument deduction would not consider that conversion for NULL), then the compiler would be unable to decide whether:
    root != NULL
means either:
    operator!=(root, deep_pointer<tree_node>(NULL))
or:
    root.operator tree_node *&() != NULL
However, assignment operators behave a little differently from the other binary operators. The C++ Standard (clause 13.3.1.2) says that user-defined conversions cannot be applied to the left operand of a built-in assignment. All assignment operators, including += and *= as well as plain =, have this restriction. No other operators do.
Thus, for assignment, the choice between:
    root.operator=(deep_pointer<tree_node>(NULL));
and:
    root.operator tree_node *&() = NULL;
isn't really a choice after all. The second alternative applies a user-defined conversion to the left operand of a built-in assignment, and the C++ Standard says that's not an option.
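To make the distinction concrete, here is a small sketch using a hypothetical class, ptr_box, that has the same pair of opposing conversions as deep_pointer. The class, the helper functions differs and differs_builtin, and example_usage are illustrative assumptions, not code from xr or deep.h.

```cpp
#include <cassert>

// Hypothetical class with conversions in both directions, like deep_pointer.
class ptr_box {
public:
    ptr_box(int *p) : ptr(p) {}          // converts int * to ptr_box
    operator int *&() { return ptr; }    // converts ptr_box to int *&
    friend bool operator!=(ptr_box a, ptr_box b);
private:
    int *ptr;
};

inline bool operator!=(ptr_box a, ptr_box b) { return a.ptr != b.ptr; }

inline bool differs(ptr_box w, int *p) {
    // return w != p;          // error: ambiguous (convert p to ptr_box, or w to int *?)
    return w != ptr_box(p);    // converting the right operand explicitly selects
                               // the user-defined operator!=
}

inline bool differs_builtin(ptr_box w, int *p) {
    return static_cast<int *>(w) != p;   // converting the left operand explicitly
                                         // selects the built-in operator!=
}

inline void example_usage() {
    int a = 1, b = 2;
    ptr_box w(&a);
    assert(differs(w, &b) && !differs(w, &a));
    assert(differs_builtin(w, &b));
    w = &b;   // not ambiguous: the built-in = may not convert its left operand,
              // so only the generated assignment operators are candidates
    assert(!differs(w, &b));
}
```

A compiler that follows the Standard on this point should reject only the commented-out comparison and accept the assignment.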
Despite what the Standard says, Microsoft's Visual C++ version 12 (as distributed with Visual Studio 6.0) apparently applies conversions to the left operand of a built-in assignment. I'm inferring this from the fact that the compiler complains that the assignment:
    root = NULL;
in table.h is ambiguous. It complains about similar assignments elsewhere in table.h (Listing 2) and table.cpp (Listing 3). The complaints persist even if you compile with language extensions disabled (using the -Za compile option).
Borland C++ version 5.3 (as distributed with C++Builder 3) also complains about ambiguities in the same places that Visual C++ complains about. However, it complains only with language extensions disabled (using the -A compile option). With extensions enabled, Borland C++ compiles the code without complaint. This strikes me as backwards. I interpret the Standard to mean that the assignment:
    root = NULL;
should compile, yet Borland C++ compiles it only when operating in non-Standard (extensions enabled) mode.
My confidence got a small boost when I found a compiler that seems to agree with me. Metrowerks C++ version 4 (as distributed with CodeWarrior Pro 4) compiles the code without complaint, whether or not it applies its strict interpretation of the C++ Standard (using the -ansi strict compile option).
I managed to placate the Borland and Microsoft compilers by defining a public assignment operator that assigns a built-in pointer to a deep pointer:
    deep_pointer<T> &operator=(T *);
The details of this function appear in a new version of deep.h in Listing 4.
With this assignment operator, a compiler now has three ways to interpret the assignment:
    root = NULL;
The original two choices:
    root.operator=(deep_pointer<tree_node>(NULL));
    root.operator tree_node *&() = NULL;
each involve a user-defined conversion of either the left or right operand. The new third choice:
    root.operator=(NULL);
requires no user-defined conversion of either operand. The only conversion needed is the standard conversion of the null pointer constant NULL to tree_node *, and an argument that is already a tree_node * matches the parameter exactly. Overload resolution always favors a standard conversion over a user-defined one, so the third choice prevails.
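Since Listing 4 is not reproduced here, the following is only a sketch of how such an assignment operator might look and how to confirm that it is the one selected. The counter pointer_assignments, the accessor count, and example_usage are hypothetical instrumentation added for this illustration. The const conversion operator and operator-> are omitted for brevity, and the default constructor sets the pointer to null, which is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class deep_pointer {
public:
    deep_pointer() : actual_pointer(0), pointer_assignments(0) {}
    deep_pointer(T *p) : actual_pointer(p), pointer_assignments(0) {}

    // Takes a built-in pointer directly, so neither operand needs a
    // user-defined conversion.
    deep_pointer &operator=(T *p) {
        actual_pointer = p;
        ++pointer_assignments;   // instrumentation for this sketch only
        return *this;
    }

    operator T *&() { return actual_pointer; }

    int count() const { return pointer_assignments; }

private:
    T *actual_pointer;
    int pointer_assignments;   // hypothetical: counts calls to operator=(T *)
};

inline void example_usage() {
    int n = 5;
    deep_pointer<int> root;
    root = NULL;                   // selects operator=(T *)
    assert(root.count() == 1);
    root = &n;                     // exact match for operator=(T *)
    assert(root.count() == 2);
    assert(static_cast<int *>(root) == &n);
}
```

Had the compiler instead built a temporary deep pointer and used the generated assignment, the counter would have been overwritten with the temporary's zero.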
Adding this assignment operator to the deep_pointer template is enough to make the Borland compiler stop complaining, yet still keep the Metrowerks compiler happy. The Microsoft compiler also stops complaining as long as you stay away from the -Za (disable extensions) option. The -Za option prompts the compiler to complain about the return type of:
    cross_reference_table::tree_node *
    cross_reference_table::add_tree(tree_node *t, char const *w, unsigned n)
in table.cpp (Listing 3). The specific complaint is that tree_node is inaccessible here. For whatever good it does to say so, the compiler is wrong about this. It shouldn't issue an error here. This just means you can't use the -Za option with Visual C++ to compile this code.
References and Temporary Objects
I also compiled the code using the EDG (Edison Design Group) C++ compiler front end version 2.41 coupled with Visual C++ as the back end.
The EDG compiler doesn't complain about my code when compiling with its default options, but it issues a very interesting warning when applying its "strict" interpretation of the C++ Standard (using the --strict compiler option).
The nature of the warning is that the return statement in:
    template <typename T>
    inline deep_pointer<T>::operator T const *const &() const {
        return actual_pointer;
    }
is returning a reference to a local temporary object. Warnings in my code always make me sit up and take notice, this one more so than most. Here's why.
Normally, a reference binds only to an lvalue. For example,
    int n;
    int &ri = n;
is okay because n is an lvalue. That is, n designates an addressable object. On the other hand,
    int &ri = 3;
is an error because 3 is an rvalue. 3 does not designate an object that a reference can refer to or that a pointer can point to.
Not only must a reference bind to an lvalue, it must also bind to an lvalue with the same type as what the reference refers to. More precisely, for any type T, a "reference to T" must bind to an lvalue of type T. For example,
    double &rd = n;
is an error because rd must refer to a double; yet n, though an lvalue, has type int.
The exception to the rules just stated is that, for any type T, a "reference to const T" can bind to any expression (lvalue or rvalue) of type X, provided there's a conversion from X to T. In that case, the compiler generates code to convert the expression to T and then store the result in a temporary T object so that the reference has something to bind to.
For example, given:
    double const &rd = 1;
the compiler generates code to do the following:
1. convert 1 from int to double,
2. create a temporary double object to hold the result of the conversion, and
3. bind rd to the temporary.
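One way to observe the temporary is to change the original object after binding the reference. In this sketch, the function names are hypothetical. A "reference to const int" tracks the int it was bound to, whereas a "reference to const double" initialized from the same int keeps the value stored in the temporary.

```cpp
#include <cassert>

inline int read_after_change_same_type() {
    int n = 1;
    int const &ri = n;      // same type: ri binds directly to n
    n = 2;
    return ri;              // sees the new value of n
}

inline double read_after_change_converted() {
    int n = 1;
    double const &rd = n;   // int to double: rd binds to a temporary holding 1.0
    n = 2;
    return rd;              // still the temporary's value, not n's
}

inline void example_usage() {
    assert(read_after_change_same_type() == 2);
    assert(read_after_change_converted() == 1.0);
}
```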
The compiler also generates code to destroy the temporary object at the end of the reference's lifetime. The timing of the destruction can lead to dangling references, as in this example:
    double const &f(int &n) {
        ...
        return n;
    }
This function returns a "reference to a const double". The return statement binds the return value to the return expression using essentially the same rules as if it were declaring a reference variable:
    double const &return_value = n;
Since n is an int rather than double, this declaration generates a temporary double object and binds the reference to that temporary. So far, so good, but not for much longer.
Trouble occurs because the compiler generates the temporary in the context of the function body, not the context of the function call. Therefore, the compiled program will destroy the temporary as it returns from the function to the call site. The function will return a reference referring to a dead object. That'll be nasty.
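Two ways around the problem can be sketched as follows. The function names f_by_value and f_same_type are hypothetical. The first returns the converted value itself rather than a reference to it. The second keeps the reference but makes the referenced type match the object exactly, so that no temporary is needed.

```cpp
#include <cassert>

// The troublesome form, left as a comment because using its result is undefined:
// double const &f(int &n) { return n; }   // refers to a temporary that is already dead

// Returning by value: the caller receives its own double.
inline double f_by_value(int &n) { return n; }

// Returning a reference whose type matches n exactly: no temporary is created.
inline int const &f_same_type(int &n) { return n; }

inline void example_usage() {
    int n = 3;
    double const &r = f_by_value(n);     // this temporary lives as long as r does
    assert(r == 3.0);
    int const &alias = f_same_type(n);   // refers to n itself
    n = 4;
    assert(alias == 4);
}
```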
Fortunately, most contemporary C++ compilers issue a diagnostic (either a warning or error) when this could happen. At least they try. Sometimes, the problem is so subtle that it eludes detection. In the case of:
    template <typename T>
    inline deep_pointer<T>::operator T const *const &() const {
        return actual_pointer;
    }
only the EDG compiler operating in its "strict" C++ Standard mode noticed that the return expression's type differed just enough from the return type to require the use of a temporary object.
An obvious way to eliminate the warnings about returning a reference to a local temporary is to remove the reference operators from the return types in the conversion operators. That is, change the declarations of deep_pointer<T>'s conversion operators from:
    operator T *&();
    operator T const *const &() const;
to:
    operator T *();
    operator T const *() const;
respectively. Indeed, with this change, all four compilers accept the code without complaint.
When I dropped the & from the declaration of:
    operator T const *const &() const;
I also dropped the const qualifier to the immediate left of the &. Suppose I had kept that const qualifier, as in:
    operator T const *const() const;
               ^1     ^2      ^3
This function returns an rvalue of a pointer type. In this or any other function that returns a non-class type, the compiler ignores any cv-qualifiers applied to the top level of the type. In the declaration above, it's the second occurrence of const that's ignored.
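For reference, here is a minimal sketch of a deep_pointer template whose conversion operators return pointers rather than references, reconstructed from the description of the member functions given earlier. It is not the code from the listings. The node type, left_value, and example_usage are hypothetical, and the default constructor sets the pointer to null rather than leaving it unspecified, which is an assumption made so the example has well-defined behavior.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename T>
class deep_pointer {
public:
    deep_pointer() : actual_pointer(0) {}     // assumption: null, not unspecified
    deep_pointer(T *p) : actual_pointer(p) {}
    operator T *() { return actual_pointer; }
    operator T const *() const { return actual_pointer; }
    T *operator->() { return actual_pointer; }
    T const *operator->() const { return actual_pointer; }
private:
    T *actual_pointer;
};

// Hypothetical node type whose link is a deep pointer.
struct node {
    int value;
    deep_pointer<node> left;
};

// A const deep pointer yields a pointer to const; a non-const one does not.
static_assert(std::is_same<
    decltype(std::declval<deep_pointer<node> const &>().operator->()),
    node const *>::value, "const deep pointer should yield node const *");
static_assert(std::is_same<
    decltype(std::declval<deep_pointer<node> &>().operator->()),
    node *>::value, "non-const deep pointer should yield node *");

inline int left_value(deep_pointer<node> const &p) {
    // p->left->value = 0;   // error: constness now reaches the child node
    return p->left->value;
}

inline void example_usage() {
    node leaf = { 7, deep_pointer<node>() };
    node root = { 1, deep_pointer<node>(&leaf) };
    deep_pointer<node> p(&root);
    assert(left_value(p) == 7);
    p->left->value = 8;      // fine through a non-const deep pointer
    assert(leaf.value == 8);
}
```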
Rvalues to the Left
A function call expression is an rvalue unless the function returns a reference, in which case it's an lvalue. When the conversion functions returned references, they were returning lvalues. Now they return rvalues. Since the left operand of an assignment must be an lvalue, the compiler can't apply these conversions to the left operand and therefore can't consider the built-in assignment during overload resolution. This, in turn, changes the way some compilers resolve assignment expressions involving deep pointers.
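The difference can be checked at compile time. The sketch below uses the standard library trait std::is_assignable, which postdates this column, along with two hypothetical functions, by_reference and by_value, that stand in for the two styles of conversion operator.

```cpp
#include <cassert>
#include <type_traits>

// Returns a reference, so the call expression is an lvalue.
inline int *&by_reference() {
    static int *slot = 0;   // hypothetical storage standing in for actual_pointer
    return slot;
}

// Returns a pointer by value, so the call expression is an rvalue.
inline int *by_value() { return by_reference(); }

static_assert(std::is_assignable<decltype(by_reference()), int *>::value,
              "an lvalue of pointer type can be the left operand of =");
static_assert(!std::is_assignable<decltype(by_value()), int *>::value,
              "an rvalue of pointer type cannot be the left operand of =");

inline void example_usage() {
    int x = 0;
    by_reference() = &x;     // stores through the reference
    assert(by_value() == &x);
    // by_value() = &x;      // error: the left operand must be an lvalue
    by_reference() = 0;      // don't leave a pointer to a dead local behind
}
```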
Let's look again at the assignment:
    root = NULL;
Before adding the explicitly-defined assignment operator to deep_pointer<T>, the compiler had to choose from two possible interpretations:
1. using a built-in assignment after converting the left operand:
    root.operator tree_node *&() = NULL;
2. using the generated copy assignment after converting the right operand:
    root.operator=(deep_pointer<tree_node>(NULL));
As explained earlier, the C++ Standard says that user-defined conversions cannot be applied to the left operand of a built-in assignment, so (1) above isn't viable. Nonetheless, some compilers consider it anyway and then stumble on an ambiguity. That ambiguity goes away when you provide another assignment operator (the one defined in Listing 4) that's a better match than either of the previous two.
When you change the conversion operators so that they no longer return references, the interpretation that uses built-in assignment changes from:
    root.operator tree_node *&() = NULL;
to:
    root.operator tree_node *() = NULL;
The latter form has a left operand that's an rvalue, which is not allowed: the left operand of a built-in assignment must be an lvalue. Even those compilers that thought the conversion was viable when it yielded an lvalue apparently reject the conversion when it yields an rvalue.
So I guess we really don't need the explicitly-defined assignment operator. For the statement:
    root = NULL;
compilers no longer consider the built-in assignment (because the conversion on the left, which shouldn't have been done anyway, now yields an rvalue). If we discard:
    deep_pointer<T> &operator=(T *);
then the only interpretation left is the one using the generated copy assignment, namely,
    root.operator=(deep_pointer<tree_node>(NULL));
There's no longer an ambiguity.
Visual C++ seems to go along with this and it stops griping about ambiguities. So does Borland C++, except that it starts complaining about generating some other temporaries. I think those generated temporaries are harmless, but I hate to ignore warnings. Thus, at Borland's behest, I've elected not to remove the explicitly-defined assignment operator from deep_pointer<T> after all.
That's All for Now
The deep_pointer class template in Listing 4 is still only a start. It needs definitions for many more operators to fully reproduce the behavior of built-in pointers. Syntactically, the volatile qualifier can appear anywhere that const can appear. It might be that deep pointers should preserve deep volatility as well as deep constness. I believe a really complete implementation also needs to use member templates, which few compilers support yet.
I examined the code generated for the xr program using deep_pointers and compared it with code for an earlier version of the program using built-in pointers. I tested the code on a couple of compilers with different levels of optimization. deep_pointers appear to introduce no speed or space penalties, probably because all the member functions are extremely terse inline functions. deep_pointers just provide better compile-time type checking than built-in pointers.
Although deep_pointer<T> needs more work, I think I'll stop here for now. I want to get back to other issues regarding class design and programming technique. I will continue to mine the cross-reference generator for examples. However, I will use deep pointers as appropriate, and augment the deep_pointer template as needed to keep deep pointers looking like built-in pointers.
For those of you who'd like to try using deep pointers in practical situations, I have posted a more complete header on CUJ's ftp site. (See p. 3 for downloading instructions.) I invite you to download it and experiment with it, but I caution you that parts of it haven't been tested yet. I admit that I'm hoping some of you will do that testing for me. Maybe together we can develop an industrial-strength deep_pointer template.
Dan Saks is the president of Saks & Associates, which offers training and consulting in C++ and C. He is active in C++ standards, having served nearly seven years as secretary of the ANSI and ISO C++ standards committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906 USA, by phone at +1-937-324-3601, or electronically at dsaks@wittenberg.edu.