The Learning C/C++urve

Bobby Schmidt

Me And My Arrow

Lots of people want an auto-pointer template class, but agreeing on what it should do is not so easy.


Copyright © 1997 Robert H. Schmidt

I continue with variations on auto_ptr, wrought by my dissatisfaction with pointers in general and MAPI in particular. Last month I referred you to the July 1996 CUJ for auto_ptr's class definition. Upon reflection, I have decided to list the class this month, better enabling comparisons between this "base" auto_ptr version and the alternatives I'll be suggesting. The base auto_ptr appears in Listing 1.

This listing comes from the Draft Standard [1] . If you are not used to reading STL code you may find the definition a bit daunting. I certainly find a lot there that is at best tangential, and at worst distracting, to this discussion. I will therefore do us all a simplifying favor by moving the template from namespace std to global scope, and removing the throw clauses, STLish typedef, and explicit support for inheritance.

I will also add two private members that don't show up in the Standard, members representing a typical auto_ptr implementation: an ownership flag (is_owner_) and the underlying pointer (pointer_).

I'm making the destructor the first declared function member. This is a recent stylistic change for me, one that I'm still evaluating.

Each class must have exactly one destructor, while it may have any number of constructors. In the presence of inheritance, that destructor should probably be declared virtual, while constructors can never be declared virtual. By always putting the destructor at the top, I am reminded to declare it correctly.

Also, I often find parallel structure between my constructors and assignment operators, so that declaring them contiguously, or even intermingled, gives a clearer view of a class's object copying semantics.

Since the class is no longer the real STL auto_ptr, I've changed the name to the related auto_pointer. With a few other minor tweaks, the combined effect is

template<class t>
class auto_pointer
    {
public:
    ~auto_pointer();
    explicit auto_pointer(t * = NULL);
    auto_pointer
        (const auto_pointer &);
    auto_pointer &operator=
        (const auto_pointer &);
    t& operator*() const;
    t* operator->() const;
    t* get() const;
    t* release() const;
private:
    mutable bool is_owner_;
    t *pointer_;
    };

You should find most of the public members fairly intuitive:

The last two public members — get and release — bear more explanation.
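As a rough sketch of how these declarations might be fleshed out, assume the ownership-flag semantics described here: copying or assigning hands ownership to the destination, while the source keeps its pointer. The member bodies below are an assumption for illustration, not the Draft Standard's wording and not the implementation promised for a later column. The tracked type and the live_after_copy function are hypothetical helpers that exist only to count live objects.

```cpp
#include <cassert>
#include <cstddef>

// Sketch only: one plausible set of bodies for the declarations above.
template<class t>
class auto_pointer
    {
public:
    ~auto_pointer()
        { if (is_owner_) delete pointer_; }
    explicit auto_pointer(t *p = NULL)
        : is_owner_(p != NULL), pointer_(p) {}
    // Copying hands ownership to the new object; the source keeps its pointer.
    auto_pointer(const auto_pointer &that)
        : is_owner_(that.is_owner_), pointer_(that.release()) {}
    auto_pointer &operator=(const auto_pointer &that)
        {
        if (this != &that)
            {
            if (pointer_ != that.pointer_)
                {
                if (is_owner_)
                    delete pointer_;
                is_owner_ = that.is_owner_;
                }
            else if (that.is_owner_)
                is_owner_ = true;
            pointer_ = that.release();
            }
        return *this;
        }
    t &operator*() const { return *pointer_; }
    t *operator->() const { return pointer_; }
    t *get() const { return pointer_; }
    // const, yet it clears the mutable ownership flag.
    t *release() const { is_owner_ = false; return pointer_; }
private:
    mutable bool is_owner_;
    t *pointer_;
    };

// Hypothetical helper type that counts how many instances are alive.
struct tracked
    {
    static int live;
    tracked() { ++live; }
    ~tracked() { --live; }
    };
int tracked::live = 0;

// Returns the number of tracked objects still alive after the scope ends.
int live_after_copy()
    {
        {
        auto_pointer<tracked> p1(new tracked);
        auto_pointer<tracked> p2(p1);   // ownership moves to p2
        assert(p1.get() == p2.get());   // both still point at the object
        }                               // only p2's destructor deletes it
    return tracked::live;
    }
```

Calling live_after_copy() exercises the copy constructor and both destructors; a result of zero means the object was deleted exactly once.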

get

get returns the same thing as operator->, offering a more convenient notation for places where infix operator syntax is awkward or impossible. Consider the following statements using real pointers:

char *p = "Rock Me, Amadeus";
char *q = p;

Change the code so that p is an auto_pointer. What you'd like to write is

auto_pointer<char> p = "Rock Me, Amadeus";
char *q = p;

Unfortunately, with our current definition of auto_pointer, the line

char *q = p; // error

doesn't compile — an auto_pointer does not implicitly convert to a real pointer. You could use operator-> to solve this problem, since that function does return a real pointer. However, you must use the function form

char *q = p.operator->(); // OK

since the infix equivalent

char *q = p->; // error

won't compile. Because the functional notation is rather unaesthetic, auto_pointer provides the more normal looking

char *q = p.get(); // OK

get is a const member function, which may surprise you. Remember, a real pointer can stay const while the thing it points to can change:

char s[] = "abc";
char * const p = s;
*p = 'x'; // OK, what p points to can change
p = "x";  // error, p itself cannot change

Similarly, by yielding a non-const pointer, an auto_pointer can stay the same while the pointed-to object can change. This same reasoning explains the const-ness of operator->.

Speaking of operator->, I should give you an extra word of warning here. A real C++ pointer allows -> only when that pointer well and truly points to a class, struct, or union. When you call auto_pointer's operator-> with traditional infix notation in

auto_pointer<t> p;
p->m = 123;

p must obey these same rules; that is, it must point to something that has members. If you use the more explicit operator syntax

p.operator->();

then p doesn't have to point to something with members. Note that you can't do this with real C++ pointers:

t *p;
p.operator->(); // error

since they are not class objects, and thus don't have operator class members.

Because the compiler can't force auto_pointer to always reference something with members, it may give you a warning like "you are defining operator-> for pointers that may not be able to call it infix." I remember getting such a warning with Microsoft Visual C++ 4.x; my current Macintosh compiler gives no such warning. You may be able to use template specialization to deflect the warning, but that's a topic well beyond this series.
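To make the last two points concrete, here is a small sketch using a pared-down stand-in for auto_pointer. The name tiny_pointer and its lack of any ownership handling are assumptions made for brevity. It shows a const object still permitting writes to the pointed-to char, and the function-call form of operator-> working for a type that has no members.

```cpp
#include <cassert>

// Hypothetical stand-in with only the members under discussion;
// it deliberately owns nothing, to keep the example short.
template<class t>
class tiny_pointer
    {
public:
    explicit tiny_pointer(t *p) : pointer_(p) {}
    t &operator*() const { return *pointer_; }
    t *operator->() const { return pointer_; }
    t *get() const { return pointer_; }
private:
    t *pointer_;
    };

char through_const_pointer()
    {
    char c = 'a';
    const tiny_pointer<char> p(&c); // p itself cannot change...
    *p = 'x';                       // ...but what it points to can
    char *q = p.operator->();       // OK even though char has no members
    assert(q == p.get());
    return c;
    }
```

Because member functions of a class template are instantiated only when used, tiny_pointer<char> is legal as long as nobody writes the infix form p->m.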

release

release is equivalent to a get that also gives up ownership of the pointed-to object. Such releasing action is implicit when you copy or assign an auto_pointer. By calling release, you make the ownership transfer explicit. This technique offers a solution to one of the problems I cited last month:

auto_pointer<char> p1(new char);
if (true)
    auto_pointer<char> p2 = p1;
*p1 = 'a'; // run-time error

where p2's destructor deletes the pointed-to object out from under p1. If you change the code to

auto_pointer<char> p1(new char);
if (true)
    {
    auto_pointer<char> p2 = p1;
    p2.release(); // p2 is no longer owner
    }
*p1 = 'a'; // OK

p2's destructor no longer deletes the pointed-to object. Inconveniently, neither does p1's. In fact, the object is an orphan: nobody owns it, so nobody deletes it.

As a work-around, you could explicitly delete the pointed-to object:

// ... same as before
delete p1.get();

You could also hand off ownership to another auto_pointer:

// ... same as before
auto_pointer<char> p3 = p1;
// p3's destructor will delete object

although this is an admittedly over-engineered solution. I leave it as an exercise for the student to surmise the effect of

// ... same as before
p1 = p1;

In practice, I suspect you'll most often use release to hand object ownership from an auto_pointer to a real pointer:

// p1 is owner
auto_pointer<char> p1(new char);
// p2 is now owner
char *p2 = p1.release();
// ...
// no longer deleted automatically
delete p2;

Pointers vs. auto_pointers

In their current incarnation, auto_pointers can't always mix with real pointers. As I discussed above, declarations like

auto_pointer<char> p;
char *q = p;

won't compile. The moral opposite, written as a direct initialization,

char *q;
auto_pointer<char> p(q);

does compile, but the variations

auto_pointer<char> p;
char *q;
p = q;

and

auto_pointer<char> p;
char *q;
q = p;

do not.

As is typically the case with any type conversion involving classes, the place to look for answers is in the single-argument constructors (a.k.a. conversion constructors) and the conversion operators. In general, if you are turning type A into type B, it must be true that either A can turn itself into a B, or B can turn an A into a B [2] . More precisely, there typically must be either a conversion operator of the general form A::operator B(), or a conversion constructor of the general form B::B(A) [3] .
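A minimal sketch of the two routes follows. The types A, B1, and B2 and the take_ functions are hypothetical names chosen to mirror the general forms just described; they come from no library.

```cpp
#include <cassert>

struct B1 { int value; };

// Route 1: A knows how to turn itself into a B1 (form A::operator B()).
struct A
    {
    int value;
    operator B1() const { B1 b = { value }; return b; }
    };

// Route 2: B2 knows how to build itself from an A (form B::B(A)).
struct B2
    {
    int value;
    B2(A a) : value(a.value) {}
    };

int take_b1(B1 b) { return b.value; }
int take_b2(B2 b) { return b.value; }

int convert_both_ways()
    {
    A a = { 21 };
    // Each call converts a implicitly, by a different route.
    return take_b1(a) + take_b2(a);
    }
```

In both calls the compiler finds the conversion without being asked, which is exactly the behavior the keyword explicit switches off for constructors.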

In the case of moving a char * to an auto_pointer<char>, this means that either char * must have the equivalent of a conversion operator returning auto_pointer<char>, or that auto_pointer<char> must have a conversion constructor accepting a char *. The first suggestion is clearly invalid — pointers have no innate knowledge of classes and cannot turn themselves into class types. On the other hand, the second suggestion seems quite plausible, given the existence of

explicit auto_pointer(t * = NULL);

in the auto_pointer type definition. This constructor allows the sequence

char *q;
auto_pointer<char> p(q);

But why then does the related sequence

char *q;
auto_pointer<char> p;
p = q;

fail?

This Program Contains Explicit Language

The answer lies in the keyword explicit. Were you to change the constructor declaration to

auto_pointer(t * = 0);

the previous example would compile. As its name suggests, explicit means that the constructor must be called explicitly, not implicitly. In the statements

char *q;
auto_pointer<char> p(q);

the second line is an explicit call to the single-argument constructor, so it compiles with or without the keyword. (The copy-initialization spelling auto_pointer<char> p = q; is a different matter: it asks for an implicit conversion, and a conforming compiler rejects it when the constructor is explicit.) In the apparently similar construct

char *q;
auto_pointer<char> p;
p = q;

the compiler wants to turn q into an auto_pointer<char> by implicitly changing the code to

char *q;
auto_pointer<char> p;
p = auto_pointer<char>(q);

(Note how much this implicit constructor call looks like the explicit call from the previous example.) However, because of the keyword explicit attached to the constructor's declaration, the compiler is not permitted such implicit constructor calls. Were the keyword explicit removed, the above would work, since the restriction against implicit conversions would be gone.
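The difference can be sketched with two stand-in classes that differ only in the keyword explicit. This assumes a C++11 compiler for static_assert and <type_traits>, both of which postdate this column. The class names are hypothetical, and neither class manages ownership. std::is_constructible reports whether direct initialization from char * works; std::is_convertible reports whether an implicit conversion does.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins differing only in the keyword explicit.
template<class t>
struct strict_pointer
    {
    explicit strict_pointer(t *p = 0) : pointer_(p) {}
    t *pointer_;
    };

template<class t>
struct loose_pointer
    {
    loose_pointer(t *p = 0) : pointer_(p) {}
    t *pointer_;
    };

// Direct initialization, as in strict_pointer<char> p(q), works either way.
static_assert(std::is_constructible<strict_pointer<char>, char *>::value, "direct");
static_assert(std::is_constructible<loose_pointer<char>, char *>::value, "direct");

// Implicit conversion from char * is what explicit forbids.
static_assert(!std::is_convertible<char *, strict_pointer<char> >::value, "implicit");
static_assert(std::is_convertible<char *, loose_pointer<char> >::value, "implicit");

char *assign_from_real_pointer(char *q)
    {
    loose_pointer<char> p;
    p = q;                  // compiler supplies loose_pointer<char>(q)
    return p.pointer_;
    }
```

The same assignment written with strict_pointer would be rejected at compile time, which is why it appears here only in the static_assert checks.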

As I have mentioned a few times in this series, MAPI loves pointers. It seems reasonable, then, that MAPI must perceive our pointer replacement to be a real pointer; otherwise, the interfacing problems could overwhelm other considerations. Such a strategy argues against the keyword explicit, since that prevents real pointers from seamlessly turning into auto_pointers.

Plays Well With Others

To make the illusion more complete, we need to support the other direction, allowing auto_pointers to seamlessly turn into pointers. In other words, any object of type auto_pointer<t> must have a conversion operator returning the t * equivalent of that object. This conversion function obviates the need for the get member, which we can now remove.

In addition to accepting objects via pointer, MAPI also returns objects via pointer, requiring callers to pass in pointer addresses:

SomeMAPIObject *x;
if (ERROR(SomeMAPICall(&x)))
    {
    // ...
    }

As written, we cannot pass auto_pointer this way:

auto_pointer<SomeMAPIObject> x;
if (ERROR(SomeMAPICall(&x))) // error
    {
    // ...
    }

since the type of &x is not SomeMAPIObject * (which MAPI expects) but rather auto_pointer<SomeMAPIObject> *.

As a possible solution, we could overload the & operator to return SomeMAPIObject **, but this leads to an interesting design dilemma: do we have the & operator return the real address of the object (as it would for built-in types), or do we have it return the address of the pointer we're trying to emulate? One strategy is more consistent with general C++ object usage, the other more interoperable with C++ pointer usage.

For this series I have elected to overload the built-in & operator, reasoning that I almost never will want to take the address of a genuine auto_pointer object, but often will want to get the address of its underlying pointer. This is consistent with the earlier addition of a conversion constructor and conversion operator, further enhancing the illusion that auto_pointers act like real pointers.

To pull off the illusion correctly, we need two overloads of operator&, one for const objects, and one for non-const objects. For const objects, the returned address cannot be written to; for non-const objects, it can be written to, so that the auto_pointer object can point to something new.

The latter version should make you nervous. Why? Consider the sequence

{
auto_pointer<SomeMAPIObject> x = new SomeMAPIObject;
SomeMAPICall(&x);
}

When it's constructed, x owns the new object it points to. The call to SomeMAPICall may change what x points to, so that the original object is orphaned. When x goes out of scope, instead of deleting x's original object, the x destructor deletes the object SomeMAPICall gave it.

As I suggested last month, object ownership is not fool-proof. To effect a partial remedy, I recommend you pass the address of a pointer (real or auto_pointer) only if at least one of these conditions is true:

For auto_pointers, my preference is a test in the non-const operator&: if the auto_pointer owns a non-NULL object, operator& throws an exception. Of course, this does nothing to help the more general problem of object ownership with real pointers. You could spend time engineering a more robust solution, say with garbage collection, but I leave that as an exercise for the student.
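Here is a sketch of how the conversion operator and the two operator& overloads might look, with the ownership test in the non-const version. This is not Listing 2. The class name mapi_pointer, the choice of std::logic_error, the decision to mark the object as owner before the call, and the body of the SomeMAPICall stand-in are all assumptions for illustration. Copying is omitted to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct SomeMAPIObject { int id; };

// Hypothetical stand-in for a MAPI call that returns an object via pointer.
void SomeMAPICall(SomeMAPIObject **result)
    {
    *result = new SomeMAPIObject;
    (*result)->id = 7;
    }

template<class t>
class mapi_pointer
    {
public:
    ~mapi_pointer() { if (is_owner_) delete pointer_; }
    mapi_pointer(t *p = NULL) : is_owner_(p != NULL), pointer_(p) {}
    operator t *() const { return pointer_; }   // replaces get
    t *operator->() const { return pointer_; }
    // const objects: the underlying pointer can be read, not rewritten.
    t *const *operator&() const { return &pointer_; }
    // non-const objects: refuse if an owned object would be orphaned.
    t **operator&()
        {
        if (is_owner_ && pointer_ != NULL)
            throw std::logic_error("would orphan owned object");
        is_owner_ = true;   // assume the callee hands over ownership
        return &pointer_;
        }
private:
    mapi_pointer(const mapi_pointer &);             // copying omitted
    mapi_pointer &operator=(const mapi_pointer &);  // from this sketch
    bool is_owner_;
    t *pointer_;
    };

int call_through_address()
    {
    mapi_pointer<SomeMAPIObject> x;     // owns nothing yet
    SomeMAPICall(&x);                   // &x is SomeMAPIObject **
    SomeMAPIObject *raw = x;            // implicit conversion to a real pointer
    return raw->id;
    }

bool second_call_is_refused()
    {
    mapi_pointer<SomeMAPIObject> x(new SomeMAPIObject);
    try
        {
        SomeMAPICall(&x);               // x already owns an object
        }
    catch (const std::logic_error &)
        {
        return true;                    // operator& threw before the call
        }
    return false;
    }
```

The exception fires while the argument is being evaluated, so the callee never runs and the original object is still deleted by x's destructor.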

Combining these changes leads to the class shown in Listing 2. While I normally abhor removing type safety, I feel the engineering tradeoffs justify the change. Given the choice of real pointers, or less-than-ideal but still stronger auto_pointers, I'll take the latter wherever I can.

Point Counterpoint

Moving beyond this latest version, we face a significant design challenge: how much do we want or need auto_pointers to act like, and interact with, real pointers? Since auto_pointer behavior already is not a strict superset of real pointer behavior, perhaps we need not be too slavish in preserving all pointer-like nature. At the same time, the class requires enough pointer-like nature to supplement and enhance MAPI.

When I first started adapting auto_ptr, I decided to step back and look at first principles. That real pointers have certain features doesn't mean I find those features desirable. The general behavior of C++ pointers, which are for most practical intents C pointers, goes back a generation now. Assumptions that may have seemed airtight and rational then do not necessarily appear so now, especially when mixed in the same bag as C++ features C's creators could not have anticipated.

I believe the pointer abstraction model I'm unveiling allows good interoperability with MAPI while omitting pointer features counter to either C++'s strengths or my own design prejudices. Even so, please bear this in mind: programming is ultimately the objective manifestation of one's subjective experience. Take none of this as absolute truth. Think of it as C++'s equivalent of quantum phenomena — what I show here is but one of many possible paths.

Coming Attractions

Next month I'll restrict some undesired pointer aspects, add variants that understand MAPI's memory management, and show the implementation for auto_pointer's members.

A disturbingly large number of Diligent Readers replied to my June ponderings about syntax colo(u)ring vs. Hungarian notation. I'll summarize their thoughts in the next month or two.

Dan Saks and I make our yearly pilgrimage to the Software Development East conference in Washington DC. We'll be there from 1 - 3 October with new talks, including one we may even share. Also, Miller Freeman always gives away free copies of CUJ, just in time for trick-or-treat, so mark your calendars.

Notes

[1] ISO C++ CD (Committee Draft) 2, section 20.4.5. See my June column for URLs of sites containing this draft.

[2] Unless these are blood types, in which case you need a type O.

[3] Note for the compulsively picky: To simplify discussion, I'm purposely ignoring member variations involving references, CV qualifiers, inheritance, and so on. I am also temporarily ignoring the strategy of overloading A members to accept a B, again for simplicity.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also a member of the ANSI/ISO C standards committee, an alumnus of Microsoft, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him at 14518 104th Ave NE Bothell WA 98011; by phone at +1-425-488-7696, or via Internet e-mail as rschmidt@netcom.com.