Columns


Stepping Up To C++

Overloading and Overriding

Dan Saks


Dan Saks is the founder and principal of Saks & Associates, which offers consulting and training in C++ and C. He is secretary of the ANSI and ISO C++ committees. Dan is coauthor of C++ Programming Guidelines, and codeveloper of the Plum Hall Validation Suite for C++ (both with Thomas Plum). You can reach him at 393 Leander Dr., Springfield OH, 45504-4906, by phone at (513)324-3601, or electronically at dsaks@wittenberg.edu.

In October, my wife Nancy suggested that our family should go to Disney World for vacation in early December. I was reluctant because I thought I had too much work to do. But that's been my excuse for weaseling out of a lot of things for the past few years. OK, I let her talk me into it. But I made her promise to help me get caught up. And she did. And I almost did.

The C++ standards committee met in November and passed almost twice as many motions as ever before at a single meeting. (I'll tell you about them in the not-too-distant future.) So I had to spend an additional two or three days that I hadn't planned on writing these longer-than-ever minutes. And all my other work slipped.

So here I am now, sitting in my room at Disney's Caribbean Beach Resort in Orlando, overloaded with work, while Nancy, Ben and Jeremy are over riding on the Big Thunder Mountain Railroad at the Magic Kingdom. I've got to hurry up and write this so I can join them.

Many C++ programmers, even moderately experienced ones, confuse the terms overloading and overriding. For beginners, overriding is often a new term. Even before they write their first line of C++, most programmers hear that C++ offers function and operator overloading. (It's one of the features that lures them to C++ in the first place.) New C++ programmers don't encounter the term overriding until they start using virtual functions. Then they confuse the words simply because they sound alike. Beginning C++ programmers usually err by speaking of overloading when they mean overriding, simply because they don't understand the subtle differences.

Experienced programmers continue to confuse the terms because overloading and overriding have similar properties. And of course, the words continue to sound alike. It doesn't help when an overly exuberant lecturer uses one word when he or she means the other and no one in the audience catches the gaffe. I know that happened to me in some of my earliest presentations on C++. I don't believe I've made that mistake recently, but I can't promise I'll never do it again.

OK, so what's the difference? In short, overloading means declaring the same symbol in the same scope to represent two or more different entities. Overriding means redefining a name in a derived class to hide a name inherited from a base class.

In C++, you can overload both function identifiers and operators. For example,

void put(FILE *f, char c);
void put(FILE *f, int i);
void put(FILE *f, const char *s);
overloads the identifier put by declaring three different functions with that name. The predefined operator + is inherently overloaded because it already applies to a variety of operand types. The declaration

complex operator+(complex z1, complex z2);
overloads operator + to handle complex numbers as well. I described many of the C++ overloading rules in detail in "Function Overloading," CUJ, November, 1991, and in a four-part series on "Operator Overloading" that appeared in every other CUJ from January through July, 1992.

The following code demonstrates overriding. Overriding applies specifically to functions in class hierarchies. For example, given

class B
    {
public:
    virtual void f();
    virtual int g(int i);
    };
the definition

class D : public B
    {
public:
    virtual void f();
    };
derives class D from B and overrides B's f with D's own version of f. D does not override g, so D's g is exactly as inherited from B.

Overloading and overriding interact in some complex and very subtle ways. By itself, each feature is a powerful programming tool (maybe too powerful). Together, these features produce much richer capabilities than most programs need.

Overloading and overriding combine to produce some very puzzling diagnostic messages, and even more baffling run-time errors. That's the nature of C++. C++ provides sufficient capability for experienced programmers to write code that is more maintainable, yet no less efficient, than it would be in C. But in doing so, it gives naive programmers even more opportunities to get into trouble (as if C doesn't already offer enough).

Some of the things I'm about to show you are pretty intricate. They are probably more complex than anything you're likely to want to do in real C++ programs. My advice, as always, is to keep things as simple as you can. Then why would I show you things you probably don't want to do? Because you're likely to do them by accident. Understanding these examples should help you recognize and correct your mistakes. And once in a great while, you might even find a reason to do these things intentionally.

The following discussion of course assumes you're familiar with vtbls (virtual tables) and vptrs (pointers to virtual tables) as an implementation model for virtual functions in C++. I introduced them last month in "How Virtual Functions Work" (CUJ, January, 1994).

Virtual and Non-virtual Overriding

A class can contain both virtual and non-virtual functions. For a given class, the translator creates an entry in the vtbl only for each virtual function in the class, not for the non-virtual functions.

A derived class can override any of its inherited functions, be they virtual or not. When you override a function that's virtual in the base class, it automatically becomes virtual in the derived class. You can't turn off the dynamic binding when you override a virtual function. That is, you cannot override a virtual function with a non-virtual function. On the other hand, you can override a non-virtual function with either a virtual or a non-virtual function. When you override a function that's non-virtual in the base class, the overriding function is also non-virtual, unless you explicitly declare it virtual.
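A minimal sketch of these rules follows. The classes Base and Derived and the helper functions are hypothetical names chosen for illustration, and each member function returns its own name as a string so the binding can be checked.

```cpp
#include <cassert>
#include <string>

struct Base
    {
    virtual std::string v() { return "Base::v"; }  // virtual in the base class
    std::string n() { return "Base::n"; }          // non-virtual in the base class
    virtual ~Base() {}
    };

struct Derived : Base
    {
    std::string v() { return "Derived::v"; }  // no virtual keyword, yet still virtual
    std::string n() { return "Derived::n"; }  // stays non-virtual
    };

// Both helpers see the object only through its static type, Base.
std::string call_v(Base &b) { return b.v(); }  // dynamic binding
std::string call_n(Base &b) { return b.n(); }  // static binding

void check_override_rules()
    {
    Derived d;
    assert(call_v(d) == "Derived::v");  // the override is reached through a Base &
    assert(call_n(d) == "Base::n");     // the non-virtual call ignores the dynamic type
    assert(d.n() == "Derived::n");      // static type Derived selects Derived::n
    }
```

Calling check_override_rules() from main should run silently if the bindings are as described.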

When a C++ translator first encounters the definition for a class D derived from some base class B, it creates an image for D's vtbl by copying B's vtbl. (As always, I am describing a conceptual model for how the translation works. Any particular implementation may do it differently.) When it parses a declaration in D that overrides a virtual function f, the translator simply overwrites the entry for f in D's vtbl to point to D::f instead of B::f. Thus, overriding a function that's virtual in B doesn't increase the size of D's vtbl.

The translator resolves all non-virtual function calls during translation, so it need not store any non-virtual function addresses in vtbls. Thus, overriding a non-virtual function with another non-virtual function has no effect on the vtbls at all. But, overriding a function that's non-virtual in B with a virtual function in D increases the size of D's vtbl, adding a new entry to D's vtbl that has no corresponding entry in B's vtbl.

Listing 1 shows a simple inheritance hierarchy that mixes virtual and non-virtual overriding. The base class, B, has four member functions, but only two are virtual. Thus, B's vtbl has only two entries in it, for functions f and h, as shown in Figure 1.

Class C in Listing 1 derives from B and overrides three of the four functions it inherits. During translation, C's vtbl starts out as a copy of B's. C::f is virtual because it overrides virtual B::f, and the compiler replaces the first entry in C's vtbl (corresponding to f) with the address of C::f. C::g overrides non-virtual B::g. Since g's declaration in C doesn't include the virtual specifier, C::g is also non-virtual. C doesn't override the h it inherited from B, so the second entry in C's vtbl (corresponding to h) continues to point to B::h.

C::j overrides non-virtual B::j, but C declares j as virtual. Therefore the compiler adds a new entry at the end of C's vtbl corresponding to j, and fills it in with the address of C::j. The resulting vtbl for class C also appears in Figure 1.

Class D in Listing 1 derives in turn from C, and overrides functions h and j. Both h and j are virtual in C, so they are also virtual in D. The translator replaces the entries for h and j in D's vtbl with the addresses of D::h and D::j, respectively. D's vtbl entry for f continues to point to C::f, as it did in C's vtbl. See Figure 1 for D's completed vtbl.
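The copy-then-overwrite model for the three vtbls can be mimicked by hand. The sketch below is only a model under stated assumptions: it represents a vtbl as a std::vector of function pointers, and the names Entry, Vtbl, Slot and the make_..._vtbl functions are hypothetical. A real translator builds its tables differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One vtbl entry: a pointer to a function that reports its own name.
typedef std::string (*Entry)();
typedef std::vector<Entry> Vtbl;

std::string B_f() { return "B::f"; }
std::string B_h() { return "B::h"; }
std::string C_f() { return "C::f"; }
std::string C_j() { return "C::j"; }
std::string D_h() { return "D::h"; }
std::string D_j() { return "D::j"; }

// A slot number stays fixed once a function becomes virtual.
enum Slot { SLOT_F = 0, SLOT_H = 1, SLOT_J = 2 };

Vtbl make_B_vtbl()
    {
    Vtbl v;
    v.push_back(B_f);  // f is virtual in B
    v.push_back(B_h);  // h is virtual in B; g and j get no entries
    return v;
    }

Vtbl make_C_vtbl()
    {
    Vtbl v = make_B_vtbl();  // start as a copy of B's vtbl
    v[SLOT_F] = C_f;         // C overrides virtual f: overwrite in place
    v.push_back(C_j);        // C declares j virtual: append a new entry
    return v;
    }

Vtbl make_D_vtbl()
    {
    Vtbl v = make_C_vtbl();  // start as a copy of C's vtbl
    v[SLOT_H] = D_h;         // D overrides h
    v[SLOT_J] = D_j;         // D overrides j
    return v;
    }

void check_vtbl_model()
    {
    Vtbl b = make_B_vtbl(), c = make_C_vtbl(), d = make_D_vtbl();
    assert(b.size() == 2 && c.size() == 3 && d.size() == 3);
    assert(c[SLOT_F]() == "C::f" && c[SLOT_H]() == "B::h" && c[SLOT_J]() == "C::j");
    assert(d[SLOT_F]() == "C::f" && d[SLOT_H]() == "D::h" && d[SLOT_J]() == "D::j");
    }
```

The model never shrinks or reorders a table, which is why a call compiled against B's slot numbers still works on a C or D object.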

Listing 2 contains a test program that illustrates the behavior of the inheritance hierarchy defined in Listing 1. The statement

B *pb = &c;
assigns the address of c to pb, so that *pb has static type B but dynamic type C. Thus, a non-virtual member function call applied to *pb selects a member function from class B, but a virtual function call applied to *pb selects from class C. You can resolve the non-virtual function calls merely by looking at B's declaration in Listing 1. You resolve the virtual function calls by looking at C's vtbl in Figure 1.

pb->g() and pb->j() call B::g and B::j, respectively, because both functions are non-virtual in B. pb->g() is straightforward because C's vtbl doesn't even have an entry for g. pb->j() can be confusing because C's vtbl has an entry for j. But the compiler always determines whether a call is virtual or non-virtual based on the static type of the object. In this case, *pb has static type B and j is non-virtual in B, so pb->j() ignores the vtbl and simply calls B::j.

pb->f() calls C::f because f is virtual in B, *pb is a C object, and C overrides f. Even though h is virtual in B, pb->h() still calls B::h. The call goes through C's vtbl, but winds up at B::h anyway because C does not override h.
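The four calls through pb can be checked with a sketch reconstructed from the description of classes B and C above. It is not the original listing: as an assumption for illustration, each member function returns its own name as a string, and check_calls_through_pb is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

class B
    {
public:
    virtual std::string f() { return "B::f"; }
    std::string g() { return "B::g"; }
    virtual std::string h() { return "B::h"; }
    std::string j() { return "B::j"; }
    virtual ~B() {}
    };

class C : public B
    {
public:
    std::string f() { return "C::f"; }          // virtual, because B::f is virtual
    std::string g() { return "C::g"; }          // non-virtual, like B::g
    virtual std::string j() { return "C::j"; }  // virtual from C on down
    };

void check_calls_through_pb()
    {
    C c;
    B *pb = &c;                 // static type B, dynamic type C
    assert(pb->f() == "C::f");  // virtual in B, overridden in C
    assert(pb->g() == "B::g");  // non-virtual in B
    assert(pb->h() == "B::h");  // virtual call, but C does not override h
    assert(pb->j() == "B::j");  // non-virtual in B, despite C's vtbl entry for j

    C *pc = &c;                 // static type C
    assert(pc->g() == "C::g");  // non-virtual in C
    assert(pc->j() == "C::j");  // virtual in C
    }
```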

I won't go over every call in Listing 2, but I will call your attention to the calls, such as pc->B::f(), that explicitly qualify the function name with the name of a base class. Again, *pc has static type C and f is virtual in C. Without a qualified name, the call pc->f() behaves like a normal virtual function call, selecting the function's address from the vtbl for D, because D is the dynamic type of *pc. Looking in D's vtbl in Figure 1 you can see that the entry for f points to C::f, so that's what gets called.

On the other hand, using an explicit base class name qualifier on the function name turns off the virtual call mechanism and uses static binding. That is, even though pc->f() is a virtual call, pc->B::f() is not. The call ignores the dynamic type of *pc and invariably calls B::f. This rule exists so that a virtual function in a derived class can call the function it overrides in a base class without getting stuck in an infinite recursion. My article on "Virtual Functions" (CUJ, December, 1993) explains this behavior in greater detail, including a fairly practical example that relies on it.
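Here is a small sketch of both points: the qualified call binds statically, and an overriding function can use the same qualification to reach the function it overrides. The hierarchy is trimmed to f alone, and having C::f forward to B::f is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>

class B
    {
public:
    virtual std::string f() { return "B::f"; }
    virtual ~B() {}
    };

class C : public B
    {
public:
    // The qualified call B::f() binds statically. An unqualified f()
    // here would be a virtual call back into C::f, recursing forever.
    std::string f() { return "C::f, then " + B::f(); }
    };

class D : public C
    {
    // D does not override f.
    };

void check_qualified_calls()
    {
    D d;
    C *pc = &d;                            // static type C, dynamic type D
    assert(pc->f() == "C::f, then B::f");  // virtual call: D's vtbl entry points to C::f
    assert(pc->B::f() == "B::f");          // qualified call: always B::f
    }
```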

Overriding Overloaded Functions

You can overload virtual functions. That is, you can declare more than one virtual function with the same name in the same class, as in class stream shown in Listing 3. As with any set of overloaded functions, each function signature (the sequence of types in a formal parameter list) in an overloaded set of virtuals must be sufficiently distinct for the compiler to tell them apart. The vtbl for the class contains a distinct entry for each virtual function name and signature. The vtbl for class stream appears in Figure 2. No surprises so far.

Deriving from a base class with overloaded virtual functions behaves pretty much as you'd expect, as long as you override all of the overloaded functions, or none of them. But, if you derive from a base class with overloaded virtual functions and override some, but not all, of those overloaded virtual functions, the results may surprise you.

All the functions in a given set of overloaded functions must be declared in the same scope. Another declaration for a function with the same name in an inner scope doesn't add to the overloaded set; it starts a new set and completely hides all of the overloaded functions of the outer scope while in the inner scope. The inner scope can access the overloaded functions in the outer scope only by explicitly using a :: (the scope resolution operator).

A class defines a new scope. The members of a class are in the scope of that class. Thus, a single member function declaration in a class hides all the overloaded functions with the same name declared in any enclosing scope, as shown in Listing 4.

Listing 4 contains the definition for a class File, with a member function put that writes a null-terminated string to a file. File::put uses one of the overloaded put functions declared at file scope to actually put the string. Unfortunately, none of those put functions at file scope is visible inside the body of File::put, because the member declaration hides them. Therefore, you must precede the call with :: to force the compiler to look for a function at file scope, as shown in the body of File::put in Listing 4. Otherwise, the C++ compiler thinks the call to put(s, f) inside File::put is a (recursive) call to File::put, but with the wrong number of arguments.
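A sketch along those lines follows. It is not the original listing: to keep the example self-contained, a std::string stands in for the FILE *, and the member contents and the helper write_twice are hypothetical additions.

```cpp
#include <cassert>
#include <string>

// Overloaded put functions at file scope. A std::string stands in
// for the FILE * of the original (an assumption for illustration).
void put(std::string &out, char c) { out += c; }
void put(std::string &out, const char *s) { out += s; }

class File
    {
public:
    // This one declaration hides both file-scope put functions
    // everywhere inside the scope of File.
    void put(const char *s)
        {
        // put(buf, s);  // error: finds only File::put, which takes one argument
        ::put(buf, s);   // :: forces the lookup to start at file scope
        }
    const std::string &contents() const { return buf; }
private:
    std::string buf;
    };

std::string write_twice()
    {
    File f;
    f.put("ab");
    f.put("cd");
    return f.contents();
    }
```

Uncommenting the unqualified call should produce a diagnostic about the number of arguments, much as described above.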

Similar behavior occurs if you derive class File from a base class that contains several functions named put. The declaration for File::put hides all the overloaded put functions in the base class while in the scope of class File. A call to an inherited put function inside a File member function must attach the base class name and a :: before the function name.

Now let's see what happens when a derived class overrides some, but not all, of the overloaded virtuals in its base class. Listing 5 shows a base class B with three virtual functions f(int), f(long) and f(char *). B's vtbl appears in Figure 3.

Class C derived from B overrides only f(int). Therefore, only f(int) is visible in the scope of C; however, C's vtbl still has three entries: one for each virtual function in its base class. C's vtbl appears in Figure 3. It has the same layout and values as B's vtbl, except the entry for f(int) points to C::f(int) instead of B::f(int).

A derived class never has fewer virtual functions (i.e., a smaller vtbl) than its base class. Some inherited virtual functions may be invisible in the derived class scope, but their addresses must still be in the vtbl. Remember, an object of a derived class is an object of its base class. A derived object has everything that its base object has, and maybe more. This applies to vtbls as well as the objects themselves. The vtbl for a derived class must have at least as many entries as the vtbl for its base class. Consider the consequences if this were not so.

The virtual call mechanism relies on the assumption that a derived object is a base object. When a compiler encounters a virtual function call applied to an object, it simply translates the call into code that follows a vptr to a vtbl and selects the address of the appropriate function. The vtbl for the object involved in the call must have at least as many entries as the vtbl for the base class, or else the call might reach beyond the end of the object's vtbl and grab something that isn't a function address.

Class D in Listing 5 derives from C and overrides only f(long). Again, its vtbl has three entries, as shown in Figure 3. The values in D's vtbl are the same as in C's, except for the value corresponding to f(long).

Listing 6 is a test program that demonstrates the behavior of the function call bindings for the hierarchy in Listing 5. The first call contains no surprises. *pb has static type B but dynamic type C. An expression that occurs to the right of pb-> is in the scope of B. In the scope of B, the compiler can choose from three different functions named f. f(1) exactly matches f(int). f(int) is virtual in B, so the compiler generates a virtual call to f(int). At the time the program executes, pb actually points to a C object, and C overrides f(int), so pb->f(1) calls C::f(int). The next two calls behave similarly, except that C does not override f(long) and f(char *).

You may find the call d.f(1) surprising. It appears that f(1) matches f(int) exactly, so at first it seems that it should call D::f(int). But D doesn't override f(int), so shouldn't the expression actually call C::f(int)? Well, that's not what happens either. The expression to the right of d. is in the scope of D, where only f(long) is visible. Therefore the compiler converts the argument 1 to long and calls D::f(long).
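These bindings can be checked with a sketch reconstructed from the description of Listing 5. It is not the original listing. Two assumptions for illustration: each function returns its own name as a string, and the third overload takes const char * so that a string literal is accepted by current compilers.

```cpp
#include <cassert>
#include <string>

class B
    {
public:
    virtual std::string f(int) { return "B::f(int)"; }
    virtual std::string f(long) { return "B::f(long)"; }
    virtual std::string f(const char *) { return "B::f(char *)"; }
    virtual ~B() {}
    };

class C : public B
    {
public:
    std::string f(int) { return "C::f(int)"; }  // hides B's other two f's in C's scope
    };

class D : public C
    {
public:
    std::string f(long) { return "D::f(long)"; }  // hides C::f(int) in D's scope
    };

void check_partial_override()
    {
    C c;
    B *pb = &c;                                // static type B, dynamic type C
    assert(pb->f(1) == "C::f(int)");           // all three visible in B; C overrides f(int)
    assert(pb->f(1L) == "B::f(long)");         // C does not override f(long)
    assert(pb->f("hello") == "B::f(char *)");  // nor f(char *)

    D d;
    assert(d.f(1) == "D::f(long)");            // only f(long) is visible in D
    }
```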

You can study most of these calls on your own. I'll draw your attention to a couple of interesting cases.

The call d.C::f(1L) uses explicit qualification to access an inherited function that's otherwise hidden. It looks like it should call an f(long), but it doesn't. The expression to the right of the qualifier C:: is in the scope of C, where only f(int) is visible. The compiler converts 1L to 1 (an int) and calls C::f(int). The explicit qualifier turns off the virtual call mechanism.

The call pc->f("hello") is an error. The expression to the right of pc-> is in the scope of C, where only f(int) is visible. "hello" has type char *, and there's no standard conversion from char * to int.
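The two cases can be tried with the same kind of reconstruction, again not the original listing. The string return values and the const char * parameter are assumptions for illustration, and the erroneous call is left commented out so that the sketch compiles.

```cpp
#include <cassert>
#include <string>

class B
    {
public:
    virtual std::string f(int) { return "B::f(int)"; }
    virtual std::string f(long) { return "B::f(long)"; }
    virtual std::string f(const char *) { return "B::f(char *)"; }
    virtual ~B() {}
    };

class C : public B
    {
public:
    std::string f(int) { return "C::f(int)"; }
    };

class D : public C
    {
public:
    std::string f(long) { return "D::f(long)"; }
    };

void check_hidden_names()
    {
    D d;
    assert(d.C::f(1L) == "C::f(int)");  // lookup starts in C; 1L is converted to int

    C *pc = &d;
    assert(pc->f(1L) == "C::f(int)");   // only f(int) is visible in C; D doesn't override it
    // pc->f("hello");                  // error: no conversion from a string literal to int

    assert(pc->B::f("hello") == "B::f(char *)");  // qualification reaches the hidden function
    }
```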

Behaving Responsibly

As I mentioned earlier, I'm not suggesting that you'd ever want to write code like this. Quite to the contrary, I'm trying to shine a little light into a dark corner, and show you how you can inadvertently write some pretty confounding stuff.

I think overloaded virtual functions are a useful feature. Common classes like the istream and ostream classes in iostream.h use this feature well. When you derive from a class with a set of overloaded virtual functions, you should override all or none of the functions in that set. In fact, this is a good guideline even if the overloaded functions are non-virtual.
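A sketch of the guideline in practice, under the same illustrative assumptions as before (string return values, a const char * parameter, hypothetical helper name): when the derived class overrides the whole set, each call binds to the overload its argument suggests, whether made on the object or through a base pointer.

```cpp
#include <cassert>
#include <string>

class B
    {
public:
    virtual std::string f(int) { return "B::f(int)"; }
    virtual std::string f(long) { return "B::f(long)"; }
    virtual std::string f(const char *) { return "B::f(char *)"; }
    virtual ~B() {}
    };

class C : public B
    {
public:
    // Override the entire overloaded set, so nothing inherited is hidden.
    std::string f(int) { return "C::f(int)"; }
    std::string f(long) { return "C::f(long)"; }
    std::string f(const char *) { return "C::f(char *)"; }
    };

void check_override_all()
    {
    C c;
    assert(c.f(1) == "C::f(int)");
    assert(c.f(1L) == "C::f(long)");
    assert(c.f("hello") == "C::f(char *)");

    B *pb = &c;
    assert(pb->f(1L) == "C::f(long)");  // same answer through the base class
    }
```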

Many compilers actually warn you when you violate this guideline. For example, when you compile Listing 5 and Listing 6 together, you may get a warning to the effect that the declaration of f(int) in C hides the declarations of f(long) and f(char *) inherited from B.

As with any guideline there are exceptions, but in this case they are rare. Remember that function and operator overloading are there to help you write more intuitive code. If overloading makes it less so, then back off.

So, Dan Saks, you've just finished your CUJ article. What are you going to do next?