Stepping Up To C++

The Return Types of Virtual Functions

Dan Saks


Dan Saks is the president of Saks & Associates, which offers consulting and training in C++ and C. He is secretary of the ANSI and ISO C++ committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield, OH 45504-4906, by phone at (513)324-3601, or electronically at dsaks@wittenberg.edu.

WG21+X3J16, the joint ISO ANSI C++ technical committee, is now in its fifth year of work on a standard definition for the C++ programming language and its accompanying library. Over the years, the committee has added more than a dozen new features to the language. I described several of them:

in "Recent Language Extensions to C++", CUJ, June 1993, and described one other:

The feature I describe this month, relaxed rules for the return types of virtual functions, enhances the language's support for object-oriented programming. In particular, it extends the set of valid conversions within a type hierarchy that you can write without casting. I assume you are familiar with virtual functions in C++, which I described in my last three columns (see "Virtual Functions," "How Virtual Functions Work," and "Overloading and Overriding" in CUJ, December, 1993 through February, 1994).

What the ARM Says

According to The Annotated C++ Reference Manual (ARM) (Ellis and Stroustrup [1990]), a member function declared in a derived class D overrides a virtual function in its base class B only if that function in D has the same name and signature (sequence of parameter types) as the function it overrides. Clearly, if the name of a function, say f, declared in D differs from the name of every function declared in B, then f doesn't override anything. f's declaration adds a new name to the scope of D. If D declares a function with the same name, again say f, as one or more functions inherited from B, but D's f has a signature that differs from the signatures of every f function in B, then again D's f doesn't override anything. In this case, D's f hides all of B's f functions while in the scope of D. This is not an error, but as I explained last month, the resulting behavior might surprise you. Consequently, many C++ compilers issue a warning when this sort of hiding occurs.
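Here is a minimal sketch of the hiding case, just to make the surprise concrete. The classes B and D and the function f follow the names in the text, but the function bodies, the parameter types, and the helper call_through_base are my own assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical classes; only the names B, D, and f come from the text.
class B
    {
public:
    virtual ~B() { }
    virtual int f(int) { return 1; }
    };

class D : public B
    {
public:
    // Same name, different signature: this f hides B::f(int) in the
    // scope of D, but it does not override it.
    int f(double) { return 2; }
    };

// Calls f through a B *, so name lookup sees only B::f(int).
// Nothing in D overrides it, so B::f runs even when bp points to a D.
int call_through_base(B *bp)
    {
    assert(bp != 0);
    return bp->f(0);
    }
```

Given a D object d, call_through_base(&d) runs B::f, while d.f(0) converts 0 to double and runs D::f. Some compilers warn about the hiding when you compile a class like D.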

What about differing return types? According to Section 10.2 of the ARM: "It is an error for a derived class function to differ from a base class virtual function in the return type only." For example, in

class B
    {
public:
    virtual int vf(int);
    ...
    };

class D : public B
    {
    ...
    void vf(int); // error
    ...
    };
the declaration of D::vf is an error because it has the same name and signature as a function declared in B, but B::vf and D::vf have different return types.

Consider the consequences of allowing different return types in this situation. A function g that accepts a formal parameter of type B * or B & can apply vf to that B object, as in

void g(B *bp)
    {
    ...
    if (bp->vf(0) > 1)
       ...
    }
When compiling this function, the compiler considers only the definition for class B. None of the classes derived from B need to be declared when compiling g. Based on the static type of B::vf, the call bp->vf(0) in g should be just fine; it returns an int, as the if statement in g apparently expects.

Since D is publicly derived from B, you can pass a D * to g, as in

D d;
...
g(&d);
But now if D::vf has a void return, how can the call bp->vf(0) possibly return an int? It can't, which is why the ARM insists that an overriding virtual function must have the same return type as the function it overrides.

Cloning Objects

Some members of the standards committee suggested that this rule is more restrictive than it needs to be. There are, in fact, legitimate circumstances where the return types of the overridden and overriding functions need not be absolutely identical. Clone functions are one such circumstance.

Some applications need to be able to clone an object, that is, create an object that's an exact copy of another object. Typically, you implement a class X with a cloning function as something like

class X
    {
public:
    X *clone() const
        { return new X(*this); }
    ...
    };
Inside X::clone, *this is an expression of type X that designates the object being cloned. The expression new X(*this) allocates a new X object and initializes it using X's copy constructor (the X constructor that takes an argument of type X). clone is a const member function because it does not alter *this, and should thus apply to const as well as non-const objects.

new X returns an X *. Although you could write a clone function that returns an X or an X &, I suggest returning X * to emphasize that clone returns a dynamically-allocated object that should be deleted eventually. In general, for any class X, the return type of X::clone should be X *. For example, in a library of geometric shapes, circle::clone should return a circle * and rectangle::clone should return a rectangle *.
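As a rough sketch of how such a clone function gets used, consider the following. The data member value, the accessor get, and the helper cloned_value are hypothetical; I added them only so there is something to copy and inspect.

```cpp
#include <cassert>

// A hypothetical class X with one data member, for illustration only.
class X
    {
public:
    explicit X(int v) : value(v) { }
    // Returns a dynamically allocated copy; the caller must delete it.
    X *clone() const
        { return new X(*this); }
    int get() const { return value; }
private:
    int value;
    };

// Clones an object (const or not), reads the copy, and deletes it.
int cloned_value(const X &x)
    {
    X *copy = x.clone();
    assert(copy != &x);     // the clone is a distinct object
    int v = copy->get();
    delete copy;
    return v;
    }
```

Because clone is a const member function, cloned_value can accept its argument by reference to const.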

When used in a class that's the root of a polymorphic hierarchy, clone functions should be virtual, and often pure virtual. This lets you clone an object without knowing its exact type. For example, Listing 1 shows a polymorphic hierarchy of shapes similar to the one I presented three months ago (see "Virtual Functions", CUJ, December 1993). Classes circle, rectangle, and triangle are all derived from base class shape. shape declares a pure virtual clone function as:

virtual shape *clone() const = 0;
Each of the derived classes overrides the clone function with an impure definition. For instance, class circle declares

virtual shape *clone() const;
in the class definition, and later defines

shape *circle::clone() const
    {
    return new circle(*this);
    }
Not long ago I said that, in general, a clone function for a class X should have a return type of X *. But here the return type of circle::clone is shape *, not circle *. According to the ARM, circle::clone must return a shape * because that is the return type of the function it overrides. Nonetheless, this clone function is still quite useful.

circle::clone returns the result of new circle, which is indeed a circle *. Since circle is publicly derived from shape, a circle is a shape, so the circle * in the return expression converts safely and quietly to the return type shape *. A similar conversion occurs in the clone functions for rectangle and triangle shown in Listing 1. An application can clone an arbitrary shape with code such as:

shape *s;
...
shape *cs = s->clone();
which leaves cs pointing to an object that has the same dynamic type and value as *s.

Listing 2 shows a more elaborate example dealing with collections of shapes implemented as arrays of pointers. The clone_all function replicates an entire collection of shapes. First, it allocates a new array to hold the pointers to the shape clones. Then, for each shape in the original collection, it clones that shape (using its virtual clone function) and places the pointer to the copy into the new array. As is always the case with polymorphic objects, it doesn't matter to clone_all how many different shapes there are; each shape knows how to clone itself.
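The following is not Listing 2, only a small sketch of the same idea under the ARM's rules, where every clone function returns a shape *. The parameter list of clone_all, the helpers destroy_all and same_kind, and the virtual function name are my assumptions; the shapes carry no data members to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class shape
    {
public:
    virtual ~shape() { }
    virtual shape *clone() const = 0;
    virtual const char *name() const = 0;   // hypothetical, for testing
    };

class circle : public shape
    {
public:
    shape *clone() const { return new circle(*this); }
    const char *name() const { return "circle"; }
    };

class rectangle : public shape
    {
public:
    shape *clone() const { return new rectangle(*this); }
    const char *name() const { return "rectangle"; }
    };

// Allocates a new array of n pointers and fills it with clones.
shape **clone_all(shape *const src[], std::size_t n)
    {
    shape **copy = new shape *[n];
    for (std::size_t i = 0; i < n; ++i)
        {
        assert(src[i] != 0);
        copy[i] = src[i]->clone();  // each shape clones itself
        }
    return copy;
    }

// Deletes every shape in the array, then the array itself.
void destroy_all(shape **a, std::size_t n)
    {
    for (std::size_t i = 0; i < n; ++i)
        delete a[i];
    delete [] a;
    }

// Reports whether two shapes have the same dynamic type, by name.
bool same_kind(const shape *a, const shape *b)
    {
    return std::strcmp(a->name(), b->name()) == 0;
    }
```

Notice that clone_all never mentions circle or rectangle. Adding another shape to the hierarchy requires no change to it.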

Unnecessary Downcasting?

The ARM's requirement that the return type of an overriding function must be the same as the return type of the function it overrides apparently doesn't pose any problems when dealing with pointers (or references) to objects at the root of the hierarchy, as in Listing 2. However, it often necessitates using casts when dealing with pointers to objects of other types in the hierarchy. For example, you cannot write

rectangle *r;
...
rectangle *cr = r->clone();
because r->clone() returns a shape *. Even though a rectangle is a shape, a shape is not necessarily a rectangle. Therefore, you must add a cast, as in

rectangle *cr = (rectangle *)r->clone();
Casts are dangerous things. They tell the compiler to stop complaining and let you do what you want to do, or at least, what you think you want to do. But compilers are right more often than we care to admit. Casts indicate that you are doing something that is generally unsafe. Thus, you really should avoid casts in your C++ programs, probably even more so than in C programs. When casts are rare, the few casts you really do need will stand out and draw more scrutiny, which they deserve. (For more about avoiding casts, see Plum and Saks [1991].)

Of course, you don't really need a cast in the previous example, because you don't really need to call clone to replicate a shape that you know is a rectangle. Rather, you can simply clone *r with

rectangle *cr = new rectangle(*r);
But consider what happens if rectangle and triangle (and any other polygons) are derived from an abstract base class polygon derived from shape, rather than directly from shape, as outlined in Listing 3. (An abstract base class is a class with at least one pure virtual function. An abstract base class can be a derived class.)

To satisfy the ARM, all the clone functions in Listing 3 return a shape *. (The declaration of polygon's clone function is inside a comment because you don't really need it. A function declared pure virtual in a base class remains pure virtual in the derived class unless overridden with an impure declaration.) When you clone a polygon, you get a pointer of type shape *, even though you know that pointer specifically addresses a polygon. Thus, you cannot clone a polygon and copy the result to a polygon * without a cast. That is, you cannot omit the cast in

polygon *p;
...
polygon *cp = (polygon *)p->clone();
All of the casts in the previous examples cast pointers to base class objects into pointers to derived class objects. These casts are commonly called "downcasts" because most people draw class hierarchies with the base classes above their derived classes. Remember, when a class D is publicly derived from B, a D is a B, so you can safely convert a D * to a B * without a cast. But a B * is not necessarily a D *, so you can't convert a B * downward to a D * without a cast. Thus, like all other casts, downcasts are generally unsafe unless you're absolutely, positively sure that your B * actually points to a D.

But the downcast in

polygon *cp = (polygon *)p->clone();
is actually quite safe because we do know that p->clone() returns a polygon *. In fact, there's a whole family of similarly safe downcasts that occur commonly in object-oriented systems. The problem is that, on the surface, the safe casts look just like the unsafe ones. The cure is to augment the rules for virtual overriding so that you can write the safe conversions without casts.

For example, you should be able to declare each virtual clone function in a hierarchy so that it returns a pointer whose static type is the same as its dynamic type. That is, circle::clone should return a circle * and rectangle::clone should return a rectangle *, even though shape::clone returns a shape *. Then you can clone any shape, or anything derived from shape, without a cast. For instance, given

shape *s;
circle *c;
you can write

shape *cs = s->clone();
to clone a shape, and

circle *cc = c->clone();
to clone a circle.
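A minimal sketch of these declarations, assuming a compiler that implements the relaxed rules, might look like this. The data member radius, its accessor, and the two helper functions are hypothetical additions of mine.

```cpp
#include <cassert>

class shape
    {
public:
    virtual ~shape() { }
    virtual shape *clone() const = 0;
    };

class circle : public shape
    {
public:
    explicit circle(double r) : radius(r) { }
    // Overrides shape::clone even though the return type differs.
    circle *clone() const { return new circle(*this); }
    double get_radius() const { return radius; }
private:
    double radius;
    };

// Clones through a circle *: the result is a circle *, no cast needed.
double cloned_radius(const circle *c)
    {
    assert(c != 0);
    circle *cc = c->clone();
    double r = cc->get_radius();
    delete cc;
    return r;
    }

// Clones through a shape *: the same virtual function runs, and the
// circle * it returns converts implicitly to shape *.
shape *clone_via_base(const shape *s)
    {
    assert(s != 0);
    return s->clone();
    }
```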

In a sense, declaring circle::clone to return a circle * doesn't introduce any new conversions. It merely shifts the exact point where the conversions occur, or eliminates the conversions altogether. For example, when you write circle::clone as

shape *circle::clone() const
    {
    return new circle(*this);
    }
a conversion from circle * to shape * occurs as part of the return statement. Then there's no conversion at all in a calling expression like

shape *s;
...
shape *cs = s->clone();
In contrast, when you write circle::clone as

circle *circle::clone() const
    {
    return new circle(*this);
    }
no conversion occurs inside the function, but an implicit conversion from circle * (or rectangle * or triangle *) to shape * occurs in the calling expression. The net effect is the same.

The New, Relaxed Rules

The C++ standards committee agreed that the ARM's requirement on the return type of virtual functions is a bit too restrictive. Thus, the current draft of the Working Paper (the standard-to-be) relaxes the original rule. The new rule as it appears in the Working Paper is jargon-rich and seems to change with each new draft, so I'll spare you the exact words. Here's more-or-less what it says:

For all classes B and D defined as

class B
    {
    ...
    virtual BT f();
    ...
    };
    
class D : public B
    {
    ...
    DT f();
    ...
    };
types BT and DT must be identical, or they must satisfy either of the following conditions:

1. BT is BB * and DT is DD * where DD is derived from BB.

2. BT is BB & and DT is DD & where DD is derived from BB.

In either case (1) or (2),

3. class D must have the access rights to convert a DD * (or DD &) to a BB * (or BB &, respectively).

In most common applications, BB is a public base class of DD, so D can perform the conversions. But, for example, if BB is a private base class of DD then the conversions are not valid, and BT and DT will not satisfy condition (3).

The above rules apply even if D is derived indirectly from B. Or, BB might be B and DD might be D. The latter, in fact, is the case with clone functions.
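To see the rules at work across more than one level of derivation, here is a sketch of a hierarchy with an abstract polygon class between shape and the concrete polygons. It is not one of the listings; the virtual function sides and the helper sides_of_clone are hypothetical, and the sketch assumes a compiler that implements the relaxed rules.

```cpp
#include <cassert>

class shape
    {
public:
    virtual ~shape() { }
    virtual shape *clone() const = 0;
    };

// An abstract class that is also a derived class. Its clone is still
// pure virtual, but now returns polygon * rather than shape *.
class polygon : public shape
    {
public:
    virtual polygon *clone() const = 0;
    virtual int sides() const = 0;      // hypothetical, for testing
    };

class rectangle : public polygon
    {
public:
    rectangle *clone() const { return new rectangle(*this); }
    int sides() const { return 4; }
    };

class triangle : public polygon
    {
public:
    triangle *clone() const { return new triangle(*this); }
    int sides() const { return 3; }
    };

// Clones a polygon without knowing its exact type, and without a cast.
int sides_of_clone(const polygon *p)
    {
    assert(p != 0);
    polygon *cp = p->clone();
    int n = cp->sides();
    delete cp;
    return n;
    }
```

Each clone overrides every clone above it in the hierarchy, so rectangle::clone must satisfy the rules with respect to both polygon::clone and shape::clone. It does, because rectangle is publicly derived from both.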

Listing 4 shows the shape hierarchy of Listing 1 rewritten using the new relaxed rules for the return type of virtual functions. For completeness, I've included all the member function bodies so you can use them to build and execute the test code in Listing 2.

Although the committee adopted these relaxed rules in March, 1992, I believe most vendors have yet to release a C++ compiler that supports them. As of early 1994, only two of the six PC-based compilers I own (Borland 4.0 and Watcom 9.5) can compile Listing 4 without error.

cv-qualifiers in Return Types

According to the current (September 1993) Working Paper, the cv-qualifiers (const and volatile) in the return types of the overriding and overridden functions need not be identical. My understanding is that the overriding function's return type cannot have any cv-qualifiers that are not also in the overridden function's return type. Listing 5 shows some examples.

Class B in Listing 5 declares virtual function f with a return type const BB *, but class D overrides it with a function that returns DD * (where DD is publicly derived from BB). Hence, if bp is a B * that actually points to a D, then

const BB *bbp = bp->f();
invokes D::f applied to *bp. D::f returns a pointer to a non-const DD object, which the expression bp->f() quietly converts to const BB *.

Pointer conversions that add cv-qualifiers are always safe, but conversions that strip off cv-qualifiers are not. Thus, given

char *cp;
const char *ccp;
then

ccp = cp;
is safe, but

cp = ccp;
is not. Similarly, as you convert derived types to their base types, adding cv-qualifiers to pointer types should not make the conversions any less safe.

Listing 5 also shows that derived class D has a virtual function g returning a const DD & that overrides a function returning a const volatile BB &. This is also valid. However, if you omit const from the return type of B::g, then D::g is erroneous, because its return type would then have a cv-qualifier (const) that the return type of B::g lacks.
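The following sketch is based only on the description above, not on Listing 5 itself. The return types of f and g follow the text; the function bodies, the static objects they return, the virtual function id, and the helper id_through_base are my assumptions. It presumes a compiler that implements this part of the relaxed rules.

```cpp
#include <cassert>

class BB
    {
public:
    virtual ~BB() { }
    virtual int id() const { return 1; }    // hypothetical, for testing
    };

class DD : public BB
    {
public:
    int id() const { return 2; }
    };

class B
    {
public:
    virtual ~B() { }
    virtual const BB *f() { static BB bb; return &bb; }
    virtual const volatile BB &g() { static BB bb; return bb; }
    };

class D : public B
    {
public:
    // Drops const from the pointed-to type: fewer qualifiers is valid.
    DD *f() { static DD dd; return &dd; }
    // Drops volatile but keeps const: also valid.
    const DD &g() { static DD dd; return dd; }
    };

// Calls f through a B *. If bp points to a D, D::f runs and its DD *
// result converts to const BB *, adding const along the way.
int id_through_base(B *bp)
    {
    assert(bp != 0);
    const BB *bbp = bp->f();
    return bbp->id();
    }
```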

None of the compilers I own support this feature yet.

More to Come

Over the past two years, the standards committee has added several other new features to C++:

I will explain them all in upcoming columns.

Meeting Dates, Etc.

WG21+X3J16 will meet three times in 1994:

If all goes as scheduled, the draft standard should be available for public review and comment shortly after the July meeting.

If you would like to participate in the standards process as a member of X3J16, contact the vice-chair:

Josée Lajoie
IBM Canada Laboratory
844 Don Mills Rd.
North York, Ontario M3C 1V7 Canada
(416)448-2734
josee@vnet.ibm.com

References

Ellis and Stroustrup [1990]. Margaret A. Ellis and Bjarne Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley.

Plum and Saks [1991]. Thomas Plum and Dan Saks. C++ Programming Guidelines. Plum Hall.