C/C++ Contributing Editors


Uncaught Exceptions: Eroica

Bobby Schmidt

Bobby addresses a host of C++ subtleties, which he identifies by the bite marks on reader-supplied code.


Copyright © 1999 Robert H. Schmidt

Like Beethoven in his Third Symphony, I'm giving scant introduction this month. The column is long (with ten footnotes!), and my wit and pith are used up. Those who live for these intellectual hors d'oeuvres must find cerebral snack food elsewhere 'til next month.

Oh the Pain!

Q

Greetings Sir!

I have a project I am working on that requires a fixed-precision type for some computations. I wrote such a class, the relevant portions of which are included below.

The code has been (apparently) working fine for some time, but I was surprised the other day when a coworker discovered during debugging that the addition operator in main never calls the defined addition operator on the Fixed class. Instead, it converts the Fixed to a double, does the math as double, and then converts it back to Fixed. Is this the correct behavior? If so, is it safe to assume that built-in data types have a higher precedence when performing implicit promotion? — Ethan D. Frolich

A

The results you see are correct. I've edited your example to better show what's going on:

class Fixed
   {
public:
   Fixed &operator=(double);
   operator double() const;
   friend Fixed operator+
      (Fixed const &, Fixed const &);
   };
     
// ...'Fixed' member functions...
     
int main()
   {
   double d(1);
   Fixed f;
   f = 2;
   f = f + d;
   return 0;
   }
     

You expect that the + in f + d will call your operator+. Were that expectation correct, your original statement

f = f + d;

would be equivalent to

f = operator+(f, d);

As an experiment, replace your original addition statement with this explicit equivalent. You should find that your program fails to build. This result implies that your operator+ was never called implicitly in the first place, just as your coworker discovered.

operator+ expects two Fixed arguments. But when you call

operator+(f, d);

you are passing operator+ a Fixed argument and a double argument. For the call to work, the double argument needs to convert to a Fixed value.

I've mentioned this in print several times over the years, but it bears repeating here: for any arbitrary type A to convert to another arbitrary type B, at least one of the following must be accessible:

- a standard conversion from A to B,
- a conversion operator in A yielding a B, or
- a constructor in B accepting an A (a conversion constructor).

C++ has no standard conversion from double to Fixed. And since double is not a class type, it can't have a conversion operator. The only alternative is to have Fixed support a conversion constructor that takes a double. If you add the constructor

Fixed(double = 0.);

to the Fixed class, the explicit call

operator+(f, d);

works. (I gave the constructor a default argument, so it can work in contexts where you had been calling the compiler-synthesized default constructor.)

Unfortunately, if you change the statement back to the original

f = f + d; // error

your compiler should complain about type ambiguity. Because f can turn itself into a double via Fixed::operator double, the expression

f + d

can be interpreted as

double + double

But d can also be turned into a Fixed via our new conversion constructor, meaning the expression can also be interpreted as

Fixed + Fixed

Since each candidate interpretation requires a single conversion, the compiler can't pick a "best" match.

This is a frequent problem when a class has both conversion constructors and conversion operators trafficking in the same type, especially when that type is built-in. You need to somehow remove the ambiguity, either by eliminating one of the competing interpretations, or by making your intent overt. Simple solutions:

- make your intent explicit, by calling operator+ directly or by casting one operand,
- eliminate the conversion constructor, or
- eliminate the conversion operator.

The best choice depends on your design needs and personal aesthetic, although I'd lean toward the third option (eliminating the conversion operator); otherwise you'll face similar ambiguities when you go to add other Fixed operators.

Bad Templates

Q

Suppose I write a template class:

template<class T> class Foo {
    // ...
public:
    void bar(T t);
};

and I want to provide a non-inline definition for the member function bar. Why do I have to write:

template<class T>
void Foo<T>::bar(T t) { ... }

instead of

template<class T>
void Foo::bar(T t) { ... }

In other words, by providing a T argument to the template Foo, what am I telling the compiler it shouldn't already know?

Since I cannot create both a template class Foo and a non-template class Foo in the same program (doesn't the Standard forbid it?), why do I need to qualify Foo with an argument? Is this an inconsistency in C++ template syntax?

Here's another way to ask the question: when you define a member function within the scope of a class definition, you don't have to provide template arguments. In fact, you don't even have to provide the name of the class it's a member of! The class definition itself provides no template arguments.

Why then, should the simple act of moving the member function definition outside the scope of the class definition suddenly require us to provide template arguments? What information is lost in this process that would require the addition of information via template arguments?

There can be no question which template class this function is a member of. It is the template class Foo parametrized by a single type. No information is lost, as far as I can see. — Leonard Pinth-Garnell

A

Let's consider your example:

template<class T>
class Foo
    {
    void bar(T);
    };
     
template<class T>
void Foo<T>::bar(T)
    {
    }

According to the C++ Standard, what precedes the :: scoping operator must be a namespace name or class name [1]. Foo is not a namespace name or class name — it is a template name [2], and cannot appear before ::. On the other hand, Foo<T> is a class name [3], and thus can appear before ::.

And yet as you note, that same bare name Foo can appear in the class template definition sans <T>:

template<class T>
class Foo // no <T>, OK
    {
    Foo *p1;    // no <T>, OK
    Foo<T> *p2; // <T>, also OK
    };

In fact, if you try to explicitly add the <T> to the template declaration name:

template<class T>
class Foo<T> // error

your compiler gives an error [4].

You see this as inconsistent. Since the compiler can clearly infer <T> in the class template definition, you wonder why it can't similarly infer <T> in the function template definitions associated with that class template.

For your desired construct

template<class T>
void Foo::bar(T)

to be allowed, the grammar would have to be modified. Probably the least intrusive change: interpret Foo as a true class name tantamount to Foo<T>. Assuming the grammar were so modified, the question then becomes: could the compiler unambiguously make the correct interpretation?

Dan Saks and I talked this over, and our tentative answer is "yes." Neither of us can conjure a scenario where the compiler could not properly infer <T>. For now, my conclusion is that some other reason compelled the Standard committee to require <T> here. I looked in both the ARM [5] and the D&E [6], but could find no decisive rationale in either.

I invite Diligent Readers, especially those involved with C++ standardization, to provide me some insight here. If I get what appears to be a compelling rationale, I'll publish a follow-up.

It Slices, It Dices!

Q

In my enclosed example I get a bizarre run-time error complaining about caller and callee calling convention mismatch. In the real code I get a complaint about calling a pure virtual.

Why doesn't this call to for_each work? — David X Callaway

A

I believe you are seeing correct (if non-intuitive) behavior.

I've taken the liberty of reorganizing and renaming your code pieces to show the flow more clearly:

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;
     
struct base_func
    {
    virtual void operator()(int) = 0;
    };
     
struct derived_func : base_func
    {
    void operator()(int)
        {
        cout << "derived" << endl;
        }
    };
     
int main()
    {
    vector<int> v(3);
    vector<int>::iterator it;
    derived_func df;
    base_func &bf = df;
    //
    // this runs correctly:
    //
    for (it = v.begin(); it != v.end(); ++it)
         bf(*it);
    //
    // this blows up:
    //
    for_each(v.begin(), v.end(), bf);
    //
    return 0;
    }

for_each is a function template. If you crack open <algorithm>, you'll find that for_each is implemented as something like:

template <class It, class F>
F for_each(It it, It last, F f)
    {
    for (; it != last; ++it)
        f(*it);
    return f;
    }

The two template parameters are deduced from the corresponding function arguments. In our example, the first template parameter is deduced from the static type of v.begin() and v.end(), while the second template parameter is deduced from the static type of bf. By implication, you'd expect the instantiated function to be

base_func &for_each
    (vector<int>::iterator it,
     vector<int>::iterator last,
     base_func &f)
    {
    for (; it != last; ++it)
        f(*it);
    return f;
    }

The expression f(*it) is really f.operator()(*it). Because that function is virtual, the override of operator() called depends on f's dynamic type. Since the function argument bf has dynamic type derived_func, the parameter f should have dynamic type derived_func as well.

Were that the case, the for_each code would run as you want; yet you claim it doesn't. Why the discrepancy? Because your compiler deduces the second template parameter as plain base_func, not base_func &, so the template is actually instantiated as [7]

base_func for_each
    (vector<int>::iterator it,
     vector<int>::iterator last,
     base_func f) // <<<=== NOTE!
    {
    for (; it != last; ++it)
        f(*it);
    return f;
    }

Look closely at f's declaration. In this correct instantiation, f is passed not by reference, but by value. By-value parameters don't exhibit polymorphism — their static and dynamic types are the same, regardless of their corresponding arguments' dynamic types. (This phenomenon is known among C++ literati as the dreaded Slicing Problem.)

f therefore has the dynamic type base_func, meaning base_func::operator() is called. That function is declared pure virtual and has no implementation. The result is undefined behavior, which typically manifests as a call either into "nothing" or into implementation-specific code that traps pure virtual calls.

A simple solution: have the compiler deduce from a non-reference:

for_each(v.begin(), v.end(), df);

Alternatively you can bypass deduction via explicitly specifying the template arguments:

for_each<vector<int>::iterator,
    base_func &>
    (v.begin(), v.end(), bf);

I also recommend that you provide stub bodies for your pure virtual functions, at least in the early debugging stages of your projects. Such a technique lets you readily identify calls to those functions. One possible solution is

#include <cassert>
     
#if defined DEBUG
    #define PURE \
        { \
        assert(!"pure virtual"); \
        }
#else
    #define PURE \
        = 0;
#endif
     
class C
    {
    virtual void f() PURE
    };

The assert has the side effect of writing the offending source file name and line number. You can of course replace assert with a break into a debugger, a stack trace with symbols, or some other diagnostic aid.

Heads or Tails

Q

Bobby,

I've come across an issue which I'm sure must have been brought up before but I've never seen discussed.

I'm a big proponent of interface/implementation separation. To this end I ensure that all methods of any class that composes an interface to a shared library are contained within that shared library and not defined in a header file. This ensures that if the size of the class (or the order of its components) changes, I merely need to recompile the class module and rebuild the shared library. Or so I thought.

The big problem with this design (class interface to shared lib) is that myClass *ptr = new myClass is effectively converted to:

myClass *ptr =
    (myClass *)
    malloc(sizeof(myClass));
myClass::myClass(ptr);

...and these new expressions are in the program, not the library, leaving information about the implementation (sizeof(myClass)) stored in the program!

I've only tested this with my Solaris compiler but it seems unlikely that I'd see different behavior in another compiler. Are there compilers out there that address this problem?

The only solutions I see would be:

1. Have the shared library export a creation function that performs the new on the program's behalf, so the size is evaluated inside the library.
2. Split the class into a handle class visible to clients and a hidden implementation class that the handle points to.

As an example of solution #2:

class RealMyClass {
private:
    int a;
     
public:
    RealMyClass();
    ~RealMyClass();
    void myMethod();
};
     
class myClass {
private:
    RealMyClass *ptr;
     
public:
    myClass() { // would actually reside in library, not header
        ptr = new RealMyClass;
    }
    void myMethod() {
        ptr->myMethod();
    }
};

This really doubles the amount of code that comprises the interface to each class. Is there a more elegant way to address this problem?

Thanks, — Mike Morgan

A

You've encountered a classic feature/limitation of the C++ object model: the tight coupling between interface and implementation. Both manifest as the same object instance; they are in many ways synonymous, two sides of the same thin coin. As a result, when you dynamically create a new object interface you are simultaneously creating a new object implementation attached to that interface. To create the interface/implementation pair, the compiler must know how big the implementation is.

As you point out, a new expression implicitly calls the function operator new. That function's first parameter is a size_t byte length for the new object. Even though an abstracted interface is purposely ignorant of the implementation's existence, let alone its length, this information must nonetheless be available to operator new.

Every solution I've seen to this problem involves a variation of what you've proposed: split the class in two, with a handle class visible to clients that holds nothing but a pointer to a hidden implementation class, and with the handle's member functions defined inside the library, where they forward calls to that implementation.

This solution requires the very thing you want to avoid: two classes and two actual C++ objects for each conceptual interface/implementation instance. Yet as you've discovered, once you decouple interface from implementation, you are free to modify that implementation invisibly to interface clients. In fact, you can even replace the local C++ implementation with one built from a completely different language or running on a different machine [9].

In sum, you can't create any physical C++ object via new without knowing the object's size. Yet by adopting a separated interface/implementation design, you can create a logical object from two physical C++ objects: a constant interface and a potentially-varying implementation [10].

Notes and References

[1] As given in clause 3.4.3p1 ("Qualified name lookup"). See my column in the July 1997 CUJ for more detail.

[2] Inferred from clauses 9p1 ("Classes") and 14p2 ("Templates").

[3] Clauses 9p1 and 14.2p1 ("Names of template specializations").

[4] Clause 14.5p1 ("Template declarations").

[5] Bjarne Stroustrup and Margaret Ellis. The Annotated C++ Reference Manual (Addison-Wesley, 1990).

[6] Bjarne Stroustrup. The Design and Evolution of C++ (Addison-Wesley, 1994).

[7] This allegation is based on my laborious spelunking through the C++ Standard, backed by experiments on several translators, including EDG's.

[8] Although in most circumstances the actual C++ object size will be non-zero.

[9] To see such separation taken to the extreme, I invite you to investigate Microsoft's COM (Component Object Model).

[10] If your interface calls the implementation through virtual functions or some other dispatch mechanism, you must ensure that the vtable entries (or their equivalents) don't change over time. To prevent this problem, the rules of COM forbid you to change an interface once it's published.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also an alumnus of Microsoft, a speaker at the Software Development and Embedded Systems Conferences, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via rschmidt@netcom.com.