July 1999/Uncaught Exceptions

C/C++ Contributing Editors

Uncaught Exceptions: Building Sand Castles

Bobby Schmidt

A plea for writing Standard C++ instead of (even very popular) dialects, followed by an assortment of clever Standard C++ tricks.

Copyright © 1999 Robert H. Schmidt

I'm in the midst of an e-debate with Scott Meyers over virtual base classes. Like the proverbial blind wise men describing an elephant, Scott and I each see different "truths" depending on which part of the C++ Standard elephant we grab. I hope to have a resolution next month; if so, I'll summarize the debate here.

The C++ Standard captures the collective experience and wisdom of many people over many years, yet the work is still under-specified and fallible. This is not a slam on the Standard or its authors, but rather a testament to the complexity of what's being standardized. In some ways the C++ language might be mankind's most complex logical invention yet.

I'm left wondering how much we can trust other software technologies of less deliberate pedigree. For example, I first learned Windows version 2.1 in 1989. Even then I found the program interface specification to be woefully inadequate. Now, if the popular press is to be believed, the source for Windows 2000 numbers into the tens of millions of lines. Can the collateral specification possibly be even close to complete?

Consider: in the time since The Annotated C++ Reference Manual (a.k.a. The ARM) first appeared, Microsoft has released no fewer than six major Windows versions, significantly expanding the breadth and depth of the program interface each time. Many features of Standard C++ are still largely compatible with the original ARM. In contrast, how much of today's Windows programming is still compatible with the Windows 3.0 SDK of that same era?

This is the principal reason I stopped being a Windows-only programmer years ago, and started writing for a Standards-oriented magazine like CUJ. I tired of building sand castles, spending time and energy mastering a technology, only to watch the next commercial release wash it away. The language standardization pace may rival that of continental drift, but the end result will stand long after the next Windows tsunami.

Chicken and Egg

Q

Dear Bobby,

I've been trying to write a function that returns a pointer to itself. I can't figure out how to declare it. So far, I've been able to do the following:
typedef void * (*PFPV)();
     
void *f()
    {
    /* ... */
    return f;
    }
     
int main(void)
    {
    PFPV p = f;
    while (p = (PFPV) p())
        ;
    return 0;
    }
I'd like to get rid of the cast in main so that the while loop is:
while (p = p())
    ;
I thought up this piece of code one day while experimenting with functors and wondered what it could be used for. I think I could implement a state machine with it, using functions to implement the different states.

Anyway, every time I try to figure out how to declare the return type of f, my mind seems to go into a recursive loop: f is a function that returns a pointer to a function that returns a pointer to a function that returns ...

Can you help?

Thanks. — Marlon Nelson

A

This is an intriguing problem, one I've not thought about before. To better see the dilemma, consider the C or C++ function:
T f()
    {
    return f;
    }
If T is to truly represent a pointer to f — that is, if T is to be a pointer to a function returning itself — then it needs a definition like
typedef T (*T)();
Sadly, such a self-referential definition won't compile. To get around this, you might be tempted to introduce a second definition:
typedef U (*T)();
T now references U instead of itself, apparently resolving the self-reference problem. But what is U? Well, if T is supposed to be a pointer to f, U represents what f returns. Since f returns a T, we are led to
typedef T U;
typedef U (*T)();
This is hardly an improvement.

You can try other detours — a pointer to a pointer, an out parameter instead of a return value, a reference instead of a pointer — but you'll wind up in the same dead end: T can't be a pointer to f. In fact, T can't even be another function pointer type that converts from a pointer to f [1].

Well then, can you convert a pointer to f into a non-function pointer? As you found, the apparent answer is "yes":
typedef void *T;
Regrettably, this solution has two principal flaws:

Once f converts itself to a void *, all of its type information is lost. p has no clue that it holds the address of a function, let alone that the function has f's signature.

According to the language Standards, void * can hold any object pointer [2] — but the address of f is a function pointer, not an object pointer. As a result, the above code is not conforming, although it might still work on your implementation [3].
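The second flaw, at least, can be sidestepped while keeping the cast. What follows is a minimal sketch, not a definitive answer: C++ lets reinterpret_cast convert one function pointer type to another and back again, yielding the original value, so a generic function pointer type can stand in for void *. The names AnyFn, PFAF, remaining, and countCalls are hypothetical, and the countdown exists only so the loop terminates. The first flaw remains, since AnyFn still says nothing about f's real signature.

```cpp
typedef void (*AnyFn)();        // generic carrier: a function pointer, not void *
typedef AnyFn (*PFAF)();        // the real type of a pointer to f

int remaining = 0;              // hypothetical countdown so the loop ends

AnyFn f()
    {
    --remaining;
    if (remaining > 0)
        return reinterpret_cast<AnyFn>(f);  // function pointer to function pointer
    return 0;                               // null pointer ends the loop
    }

int countCalls()
    {
    int calls = 0;
    remaining = 3;
    PFAF p = f;
    while (p != 0)
        {
        ++calls;
        p = reinterpret_cast<PFAF>(p());    // cast back before calling again
        }
    return calls;
    }
```

With remaining set to 3, the loop calls f three times before f returns a null pointer.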

If any Diligent Reader has a portable and robust solution to this problem, please let me know.
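One workaround that comes up for puzzles of this kind, offered here as a sketch under my own assumptions rather than as the last word, is to hide the function pointer inside a small class. A class name can be declared before it is defined, which breaks the recursion that defeats the typedef. The names State, StateFn, calls, and run are hypothetical, and the counter exists only so the loop terminates.

```cpp
struct State;                   // incomplete here; completed below
typedef State (*StateFn)();     // pointer to a function returning State

struct State
    {
    State(StateFn fn) : fn_(fn) {}              // converts from a function pointer
    operator StateFn() const { return fn_; }    // converts back again
    StateFn fn_;
    };

int calls = 0;                  // hypothetical counter so the loop ends

State f()
    {
    ++calls;
    StateFn next = 0;
    if (calls < 3)
        next = f;               // f "returns itself" for the first two calls
    return next;                // implicit StateFn-to-State conversion
    }

int run()
    {
    calls = 0;
    StateFn p = f;
    while ((p = p()) != 0)      // no cast: State converts to StateFn
        ;
    return calls;
    }
```

The loop in run has the cast-free shape the reader asked for, and no function pointer ever passes through void *. The price is one extra type and a pair of conversions.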

Scoped Enumerations

Q

I recently discovered a problem with the following code. I think it's valid C++ code; someone else said it isn't, but we are not sure. Perhaps you could enlighten me a bit, and I would like to hear your opinion about this:
class foo
    {
public:
    enum button
        {
        ADD = 1024
        };
    enum test
        {
        test1 = button::ADD + 1024
        };
    };
     
int main()
    {
    int x = foo::button::ADD;
    // ...
    }
The question: is the button::ADD usage in an enumerator definition correct or not? And can I use the code shown in main? I think yes; the other guy thinks no. He says that this usage is only valid as a nested-name-specifier; I think the :: scope operator can be used to qualify type-names.

Who is right? The code above is accepted by VC++ but other compilers fail. — Robert M. Muench

A

What interesting timing — this same question recently made the rounds on comp.lang.c++. I'm sorry to tell you, but "the other guy" is right. According to the C++ Standard in 3.4.3p1 ("Qualified name lookup"):

The name of a class or namespace member can be referred to after the :: scope resolution operator applied to a nested-name-specifier that nominates its class or namespace. During the lookup for a name preceding the :: scope resolution operator, object, function, and enumerator names are ignored. If the name found is not a class-name or namespace-name, the program is ill-formed.

Translation: what comes before :: must name either a class or a namespace. Since an enumeration type is neither of those things, an enumeration type name cannot come before ::. This implies that C++ enumeration types do not form their own scopes, an admittedly confusing and inconsistent rule inherited from C.
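To see what "do not form their own scopes" means in practice, here is a small sketch based on the reader's class. The enumerators land in the enclosing scope, so ADD is reachable as a member of foo itself. The helper functions buttonValue and testValue are hypothetical names added only for illustration.

```cpp
class foo
    {
public:
    enum button
        {
        ADD = 1024
        };
    enum test
        {
        test1 = ADD + 1024      // ADD is visible throughout foo
        };
    };

// The class name, not the enumeration name, comes before ::
int buttonValue() { return foo::ADD; }
int testValue()   { return foo::test1; }
```

The flip side is that two enumerations in the same class cannot both declare an enumerator named ADD, since both names would collide in foo's scope.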

As was the case last month, the fact that Microsoft's compiler accepts this does not imply that the code is well-formed [4]. Instead of relying on vendor-specific behavior, try
class foo
    {
public:
    struct button
        {
        enum
            {
            ADD = 1024
            };
        };
    // ... rest same as before
    };
     
int main()
    {
    // ... same as before
    }
Now button is a class name that can — and in fact typically must — appear before :: as you desire.
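Spelled out in full, the struct-wrapped version might look like the sketch below. I've filled in the elided parts from the reader's original code; the helper functions addValue and testValue are hypothetical names that stand in for the body of main.

```cpp
class foo
    {
public:
    struct button               // a class name, so it may precede ::
        {
        enum
            {
            ADD = 1024
            };
        };
    enum test
        {
        test1 = button::ADD + 1024
        };
    };

int addValue()  { return foo::button::ADD; }
int testValue() { return foo::test1; }
```

Because ADD now lives in button's scope rather than foo's, the bare foo::ADD no longer names anything, which is what "typically must" is getting at.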

Better Mousetrap

Q

Hi Bobby,

I just read your column in the February issue of CUJ (yeah, your monthly build 38 :-). Just as you asked for, here is a better lengthof template that yields a compile-time constant.

The basic idea is simple — if you apply sizeof to a function call, the call is resolved but the function is not called, and the result is the size of the return type as a compile-time constant. Example:
#include <iostream>
     
template< int N > struct Sized
    {
    char x[ N ] ;
    } ;
     
template< class T, int N > Sized< N >
        lengthof( T (&)[ N ] )
    {
    Sized< N > x ;
    return( x ) ;
    }
     
// use: sizeof( lengthof( a ) )
// examples:
int main()
    {
    int a[ 10 ] ;
    // in compile-time constant:
    int b[2*sizeof(lengthof( a ))];
    std::cout << sizeof(lengthof(b))
              << std::endl ;
    return( 0 ) ;
    }
You can easily turn this into a macro:
#define LENGTHOF( a ) sizeof( lengthof( a ) )
or if you are on the "no macro" side, you can try to find a name for lengthof that makes the whole expression readable, like sizeof( thearray( a ) ). I would probably use a macro.

This technique is still not good for local types, which is unfortunate since the type T is useless for the template; but as you said, local types are not the most commonly used feature in C++.

Now this could be extended to multidimensional arrays, with functions to get the rank and the dimension along each axis as compile-time constants, but hey, it's your column, not mine :-).

Have fun. — Carlo Pescio

A

Thanks for this technique. I don't know that I would have ever thought it up on my own.

Like you, I would favor the macro. With such a design, I'd change the name of the function template to lengthof_ or something similar. This would help users avoid referencing the template directly, yet would provide a meaningful name in the intermediate code emitted by the preprocessor.
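A minimal sketch of that macro-fronted design follows. It makes two assumptions beyond the reader's code: the function template is declared but never defined (sizeof does not evaluate its operand, so no definition is needed), and sizeof( Sized< N > ) equals N, which the technique relies on and which holds on mainstream compilers. The function demoLength is a hypothetical name used only to exercise the macro.

```cpp
#include <cstddef>

template< int N > struct Sized
    {
    char x[ N ] ;
    } ;

// Trailing underscore: not meant to be called directly.
// Declared only; sizeof never evaluates the call.
template< class T, int N > Sized< N >
        lengthof_( T (&)[ N ] ) ;

#define LENGTHOF( a ) sizeof( lengthof_( a ) )

std::size_t demoLength()
    {
    int a[ 10 ] ;
    int b[ 2 * LENGTHOF( a ) ] ;    // usable as an array bound
    return( LENGTHOF( b ) ) ;
    }
```

Passing a pointer instead of an array to LENGTHOF fails to compile, since no T (&)[ N ] can bind to it. That is an advantage over the familiar sizeof( a ) / sizeof( a[ 0 ] ) idiom, which quietly accepts pointers.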

As for the multidimensional variation, I leave that as the dreaded Exercise for the Student.

Two Great Tastes

Q

Dear Bobby:

I have a project that I am working on. I want to know if you can mix macros and templates. For example, is this code legal:
#include <iostream.h>
using namespace std;
     
#define _ntostr(name) #name
     
template <class T>
void printout(const T &t)
    {
    cout << _ntostr<T> << " value: "
         << t << endl;
    }
     
int main()
    {
    int x = 5;
    // Should print "int value: 5"
    // if this technique works
    printout(x);
    return 0;
    }
and will it print the desired result? Why or why not? — Paul Drees

A

No and no. And as Mama taught me, two wrongs don't make a right.

You define the macro _ntostr to accept one parameter, yet invoke it with no corresponding arguments. I'm guessing that you expect _ntostr<T> to treat T as the macro argument. Lamentably, the preprocessor doesn't know about templates, and doesn't "see" arguments between <>. You have to use () instead, and invoke the macro as _ntostr(T). Once you make this change, the code will compile and run — but the result will not be what you desire. Instead of writing
int value: 5
printout will instead write
T value: 5
Again, the preprocessor knows nothing of templates. When it scans the sequence _ntostr(T) in your original program source, the preprocessor naively substitutes the string literal "T". Accordingly, in the code stream emitted by the preprocessor, your source has become
template <class T>
void printout(const T &t)
    {
    cout << "T" << " value: "
            << t << endl;
    }
If your translator lets you capture the preprocessor's output, you can see this effect more clearly [5].
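If capturing preprocessor output is inconvenient, the same effect can be observed at run time. The sketch below is a variation on the reader's code, not a copy of it: it builds a string instead of writing to cout so the result is easy to inspect, and the names NTOSTR and describe are my own hypothetical choices.

```cpp
#include <sstream>
#include <string>

#define NTOSTR(name) #name

template <class T>
std::string describe(const T &t)
    {
    std::ostringstream out;
    // By the time the compiler sees this line, NTOSTR(T) is already "T"
    out << NTOSTR(T) << " value: " << t;
    return out.str();
    }
```

Calling describe(5) yields the string "T value: 5", whatever type T is later deduced to be.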

I want to add two more unrelated notes:

For reasons I've explained in earlier columns, I strongly recommend you avoid identifiers (like _ntostr) with leading underscores, especially when those identifiers are global-scope or macros [6].

You should use the header <iostream> instead of <iostream.h>. The latter is no longer part of the C++ Standard.
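Returning to the original goal of printing "int value: 5": macros and templates can cooperate if the stringizing happens where the real type name is spelled out. The following is only a sketch of one possible approach, and every name in it (TypeName, DEFINE_TYPE_NAME, describe) is hypothetical. A macro generates one explicit specialization per supported type, and the template looks the name up.

```cpp
#include <sstream>
#include <string>

template <class T> struct TypeName;     // deliberately left undefined

// The macro sees the actual type name, so #type is the text we want
#define DEFINE_TYPE_NAME(type) \
    template <> struct TypeName<type> \
        { \
        static const char *get() { return #type; } \
        };

DEFINE_TYPE_NAME(int)
DEFINE_TYPE_NAME(double)

template <class T>
std::string describe(const T &t)
    {
    std::ostringstream out;
    out << TypeName<T>::get() << " value: " << t;
    return out.str();
    }
```

The cost is that each type must be registered by hand. Using describe with an unregistered type fails at compile time because TypeName has no generic definition.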

No Temps Allowed

Q

Can you help prevent a common mistake? I frequently create handy little classes that save and restore current state information. I call them stack-based classes. Some people use them to prevent resource leaks, others for avoiding resource locks. The problem I have with them is that I often misuse them.

Consider the following class that saves and restores the current working directory:
class StDir
{
public:
   StDir( std::string inNewDir )
   {
      _getcwd(_PreviousDir, _MAX_DIR);
      _chdir( inNewDir.c_str() );
   }
   ~StDir()
   {
       _chdir( _PreviousDir );
   }
private:
   char _PreviousDir[_MAX_DIR];
};
along with the common use:
if( a==b )
{
    StDir theDirRestorer("\\temp");
    ...
}
and the common misuse:
if( a==b )
{
    StDir( "\\temp");
    ...
}
The problem in the misuse case is that the object is created and destroyed in a single statement. The intention is to have the object exist until the {} block exits. While I understand there may be situations where creating and destroying an object in a single statement is meaningful, these stack-based classes are not among them.

This common misuse has proven to be a stumbling block for me over and over again. Do you know of any way to prevent it?

Thanks. — Bob Boylan

A

What you really want are constructors that work for named objects (like theDirRestorer) but not for unnamed objects. If there's a way to write such constructors, I sure can't figure it out. About the closest I can come is to declare the constructor explicit, thereby preventing the related
StDir theDirRestorer = "\\temp"; // error
Fortunately for you, I was inspired by reader Drees' macro question and finally hit upon
#define StDir(x) unnamed_StDir
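where unnamed_StDir is never defined, and where the macro appears after the StDir class definition (otherwise the preprocessor would rewrite the constructor declaration too). When you compile the undesired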
StDir("");
the preprocessor snags this as a reference to the macro StDir(x), replacing your original source with
unnamed_StDir
As unnamed_StDir is undefined, you should get a reasonably useful compile-time message like
undefined identifier 'unnamed_StDir'
pointing you to the real error.

Conversely, when you compile the desired
StDir theDirRestorer("");
all works as before. A function-like macro is expanded only when its name is immediately followed by (; as this usage of StDir is followed by an identifier instead, the preprocessor leaves it alone.

Net result: "good" usage stays intact, while "bad" usage maps to meaningful compile-time errors.
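Here is a portable sketch of the trick. Since _getcwd and _chdir are not available everywhere, I've substituted a hypothetical class StLevel that saves and restores a global integer; g_level and levelInsideBlock are likewise invented for illustration. Note where the macro sits relative to the class.

```cpp
int g_level = 0;    // hypothetical global state to save and restore

class StLevel
    {
public:
    explicit StLevel(int newLevel) : previous_(g_level)
        {
        g_level = newLevel;
        }
    ~StLevel()
        {
        g_level = previous_;
        }
private:
    int previous_;
    };

// Defined after the class, so the constructor declaration above is untouched
#define StLevel(x) unnamed_StLevel

int levelInsideBlock()
    {
    StLevel restorer(5);    // name not followed by '(': no macro expansion
    // StLevel(7);          // would become "unnamed_StLevel;" and fail to compile
    return g_level;         // value is read before restorer is destroyed
    }
```

One limitation of this sketch: any later code that writes the class name directly before a parenthesis, such as an out-of-class constructor definition, is also caught by the macro, so such definitions need to appear before the #define.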

Finally, before a certain someone sends me nag mail suggesting I haven't read his book well enough, I must recommend you change
StDir( std::string inNewDir )
to
StDir( std::string const &inNewDir )
thereby avoiding an unnecessary copy of the string argument.

Notes

[1] C++ Standard section 13.4 paragraph 7 ("Address of overloaded function").

[2] Were this not true, the pointers returned by malloc and operator new could not be used for dynamic object creation.

[3] The C Standard explicitly describes conversions between object pointers and function pointers, or between void * and function pointers, as a non-portable extension (section F.5.7).

[4] Last month I speculated on reasons for MSVC's non-conformance. To read an illuminating and relevant interview with one of Microsoft's own, check out http://msdn.microsoft.com/visualc/headlines/joninterview.asp.

[5] I know that both EDG's front end and Microsoft's VC++ support -E and -P switches to generate preprocessor output. I don't know if, or how, other translators achieve the same result.

[6] See section 17.4.3.1.2 ("Global names") of the C++ Standard for supporting evidence.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also an alumnus of Microsoft, a speaker at the Software Development and Embedded Systems Conferences, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via rschmidt@netcom.com.