January 2002/Uncaught Exceptions

C/C++ Contributing Editors

Uncaught Exceptions: Retreat

Bobby Schmidt

Have you ever wondered why no one ever tells Language Lawyer jokes? It may be because we need them so badly. The mighty Barrister shows his stuff again this month.

Copyright © 2001 Robert H. Schmidt

I’ve just returned from “The C++ Seminar: Three Days With Five Experts.” I quite enjoyed meeting and chatting with several of you Diligent Readers. I also enjoyed getting drafted to play moderator for some of the panels; it reminded me a lot of my radio days, which I sometimes miss horribly. I think the seminar went quite well, especially considering that none of the presenters had ever done anything like it before.

The seminar was a three-hour drive from my home. I rode down with Andrei Alexandrescu and rode back with Scott Meyers. During our drive, Scott and I spent quite a while ruminating on my career. Along the way, he made a suggestion about my column, which I think I’ll adopt.

Scott sometimes finds me and others using a style he calls “retreating to the Standard”: relying on quotes from the C++ Standard to “prove” or justify the theoretical basis of compiler or program behavior. Scott believes that lengthy Standard citations detract editorially from writing, especially given that most readers don’t have the Standard. He suggests I simply aver that the Standard supports my allegations without quoting or even citing chapter and verse. In effect, he trusts that enough of the time, my interpretation will be right enough.

It’s an intriguing idea, one that I’ll try on for a while to see how it fits, starting with this column.

As for my career: exactly one day before I turned 40, and without any warning or consultation, MSDN canceled my C# writing/community project. Two weeks later came September 11th. The combination has put me in a place familiar to many of you: reassessing the purpose and priorities of my life.

I always knew the day would come that I stopped writing about programming. At least within the context of my Microsoft career, I believe that day is likely here. I am instead feeling pulled — indeed, almost called — to a very different direction, to find or create a position of moral leadership within the company. I’m not yet sure exactly what that means. I know only that the stirring has grown quite strong.

To help me gain clarity, I’m taking a solo road trip to visit my father in his new California home. The trip and visit will be a retreat of sorts, away from home and job. I plan to take in some stunning natural vistas along the way: Crater Lake, Mount Shasta, Lake Tahoe, the wild Pacific coast.

I’m not looking for answers on this trip, just the clarity to eventually find answers. One way or another, those answers will affect what I write here in CUJ. Stay tuned.

When Life Begins

Q

Hi Bobby.

I have a question about a previous answer you gave. Maybe you already answered a question like this before. If you did, I will appreciate it if you will tell me where I can find the answer.

You wrote in February that an object starts to exist only when the last line in the constructor finishes. If so, why does the compiler give me the option to use this inside the constructor? (I tried it with gcc and VC++, and the program worked fine!)

Regards — Tony Abramson

A

I’ll answer your question in two parts: what this means in a constructor body, and how you can use it.

What this means:

In the context of a non-static member function, the keyword this is a pointer to the object for which the function is called.

A constructor is a non-static member function.

Ergo, within a constructor, this points to the object being constructed.

How you can use it (including an apparent contradiction or ambiguity):

Before its constructor returns, an object’s storage has been allocated, but its lifetime has not yet begun.

You can hold a pointer to such a prenatal blob of storage, but you are limited in how you can use that pointer.

One such limitation: you can’t access non-static members through the pointer.

This combination suggests that within a constructor, you can’t reference non-static members of the object under construction without hitting undefined behavior. And yet we make such references all of the time via this. Consider:
class X
    {
public:
    X()
        {
        i = 7; // HERE
        }
private:
    int i;
    };
which is equivalent to:
class X
    {
public:
    X()
        {
        this->i = 7; // HERE
        }
private:
    int i;
    };
Does such usage engender undefined behavior? My literal reading of the Standard suggests yes; yet that can’t be right, lest many common idioms fail.

The prose in C++ Standard Subclause 3.8 concerning object lifetime could stand some tightening and does appear to preclude this in constructors. But I don’t think for a moment that’s the intent. My bottom line advice: this works in constructors as you probably expect, so long as you don’t engage in genuinely undefined behavior.
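As a minimal sketch of the ordinary, well-behaved usage, consider a hypothetical class Node (the class and its member names are illustrative assumptions, not taken from the reader's code). Its constructor assigns a member through this and also stores the pointer for later use:

```cpp
// Hypothetical class, for illustration only.
class Node
    {
public:
    explicit Node(int value)
        {
        this->value_ = value; // same as 'value_ = value;'
        self_ = this;         // keep the pointer for later use
        }
    int value() const
        {
        return value_;
        }
    bool pointsToSelf() const
        {
        return self_ == this;
        }
private:
    int value_;
    Node *self_;
    };
```

Once the constructor returns, both members hold what the constructor wrote through this, which is the behavior most programmers expect.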

(Note of supreme irony: I recommend one such usage of undefined behavior in my October column. I discuss this usage in the Erratica section below.)

using Blues

Q

In reference to “Namespace Madness”:

As soon as I saw the question, “How do I control the scope of using namespace within header files?” I immediately reacted, “Never do that.”

Basically, I recommend never writing a namespace using in a header file, never ever ever cross your heart and hope to die, because among other things using declarations in particular can make you sensitive to header file inclusion ordering and actually violate the ODR (making the whole program ill-formed).

I know the original reader was talking about a using directive, not a using declaration, but I recommend all of those never appear anywhere near a header file — where “anywhere near” means even in source files before the last #include.

Just a data point; discussion is welcome! — George Kaplan

A

You are talking about my September 2001 column, where reader Stewart Trickett wants to reference std::string and std::vector as string and vector in a header file, but only in the header file. Most of my proposed solutions involve using declarations. You advise that such declarations, and their using-directive brethren, are bad news in the context of header files.

You raise a valuable safety issue I hadn’t considered in my initial answer, one I’d like to briefly explore here.

A using declaration redeclares a namespace member in the scope containing the using declaration. In effect, the original declaration is cloned at a new point:
#include <string>

namespace N
    {
    using std::string;
    //
    //    From here to the end of 'N',
    //    'string' is declared in both
    //    the scope of 'std' and
    //    the scope of 'N'.
    //
    void f()
        {
        std::string s1; // OK
        N::string s2;   // OK
        // OK, finds 'N::string'
        string s3;
        }
    }
A using directive has a related but different effect:

An entire namespace is affected, not just one member.

Member names are not actually declared in a new place; they simply appear that way during lookup.

The lookup rules for unqualified names are more complicated than with using declarations. This is particularly true when multiple affected namespaces declare the same name.
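The differences listed above can be sketched in a few lines. The namespaces A and B and the function id are hypothetical names chosen for this illustration:

```cpp
// Two hypothetical namespaces that declare the same name.
namespace A
    {
    inline int id() { return 1; }
    }

namespace B
    {
    inline int id() { return 2; }
    }

// A using declaration declares exactly one name in the current scope.
int viaDeclaration()
    {
    using A::id;       // 'id' is now declared in this block
    return id();       // finds 'A::id'
    }

// A using directive declares nothing here; it only makes the
// members of 'B' visible to unqualified lookup.
int viaDirective()
    {
    using namespace B;
    return id();       // finds 'B::id'
    }

// With both directives active, a bare 'id()' would be ambiguous.
// A block-scope using declaration is found first, so lookup stops there.
int declarationWins()
    {
    using namespace A;
    using namespace B;
    // return id();    // would not compile: 'A::id' or 'B::id'?
    using B::id;
    return id();       // finds 'B::id'
    }
```

The commented-out call is the interesting one: two directives that nominate namespaces declaring the same function signature leave the compiler unable to choose between them.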

Because using clauses shift the apparent location of declarations, they can fool code not expecting such shifts. Further, the using clauses themselves are sensitive to the declaration context preceding them. Finally, using declarations can hide names introduced by using directives. These considerations can lead to surprises and even undefined behavior:
//
//    header.h
//
namespace N
    {
    typedef char *string;
    }

int abs(int)
    {
    }

//
//    source1.cpp
//
#include <cstdlib>
#include <string>
using namespace std;

#include "header.h"

void f1()
    {
    string s1; // uses 'std::string'
    using N::string;
    string s2; // uses 'N::string'
    }

void f2()
    {
    string s1;
    using namespace N;
    string s2; // error, ambiguous
    }

void f3()
    {
    abs(1); // error, ambiguous
    using std::abs;
    abs(2); // uses 'std::abs'
    }
So I second your advice: don’t put using clauses in, before, or among headers. Save them for after inclusion of all headers. One allowable deviation: create a canonical include-me-last header (such as stdafx.h in Windows programs) and append the using clauses to that header.
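A single-file sketch of that ordering follows. The namespace demo and the function split are hypothetical stand-ins for real header contents. The part that would live in a header spells out std:: everywhere, and the using declarations appear only after the last #include:

```cpp
//
//    What would live in a header: fully qualified names only.
//
#include <string>
#include <vector>

namespace demo
    {
    std::vector<std::string> split(std::string const &text, char sep);
    }

//
//    The source file proper: using declarations come
//    after every #include, so no header can be affected.
//
using std::string;
using std::vector;

vector<string> demo::split(string const &text, char sep)
    {
    vector<string> parts;
    string current;
    for (string::size_type i = 0; i < text.size(); ++i)
        {
        if (text[i] == sep)
            {
            parts.push_back(current); // finish the current piece
            current.clear();
            }
        else
            current += text[i];
        }
    parts.push_back(current);         // the last piece has no separator
    return parts;
    }
```

Because the header portion never depends on a using clause, it means the same thing regardless of what the including source file declares before or after it.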

To Serve Man

Q

Hello Bobby,

I have a question for which I hope you can help find the answer. It may elude you why I would seek an answer to such a question, and I’m not even sure I could explain. Nevertheless, here goes.

During the mid-late 80s, using the old Lattice C compiler v3.x, I was delayed for several hours on a compiler bug (actually, Lattice said it wasn’t a bug, but a definition of the language). I wrote a small C function that moved memory from src to dest.
void mov_mem(byte *src, 
             byte *dest, unsigned len)
    {
    while (len--)
        {
        *dest++ = *src;
        *src ^= *src++;
        }
    }
I can’t remember the assembler code that the compiler generated, but the result was that in the second line of the loop body, src was incremented before the XOR operation was performed. However, if dest and src were word-size objects, then the correct code was generated.

I’m trying to remember why that is correct according to the definition of the C language. Of course, as we know, K&R, as well as ANSI, left much of the language to be implementation specific, and therefore up to the compiler author.

Regards — Dan a.k.a. “TheScot”

A

You’re in a maze of twisty passages, all alike.

Oops, sorry — that’s the C++ Standard.

Here we go: you’re traveling through another dimension, a dimension not only of reason and rules but of whim — a journey into a wondrous land whose boundaries are that of your implementation. Next stop: The Unspecified Zone.

Your code manifests unspecified behavior. In the statement:
*src ^= *src++;
you don’t know when the side effect of ++ will occur relative to the assignment via ^=. Apparently on your implementation, the side effect occurs before the assignment for byte-size operands, but after the assignment for word-size operands. You deem the latter behavior “correct.”

The C Standard does not compel C implementers to document unspecified behavior, although implementers are free to do so if they choose. The Standard also does not compel implementers to guarantee the predictability of said behavior, a sad fact you discovered the hard way. The Standard does compel them to treat the program as valid C, so that the program must compile and run, albeit with possibly unpredictable results.

In the end, Lattice told the truth. The problem was not a compiler bug, and they were operating within the rights granted them by the C Standard.

Assuming this is even an issue anymore, you can work around the problem by decoupling the assignment from the increment. One possibility:
while (len--)
    {
    *dest++ = *src;
    *src ^= *src;
    ++src;
    }
Note: all of the above applies to C++ as well.
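For completeness, here is a self-contained sketch of the whole function with the increment decoupled. The typedef for byte is an assumption, since the reader's letter does not show how that type was defined:

```cpp
// Assumed definition; the original letter does not show it.
typedef unsigned char byte;

// Copies 'len' bytes from 'src' to 'dest', zeroing each
// source byte after it has been copied.
void mov_mem(byte *src, byte *dest, unsigned len)
    {
    while (len--)
        {
        *dest++ = *src; // copy one byte, then advance 'dest'
        *src ^= *src;   // a value XORed with itself is zero
        ++src;          // advance 'src' in its own statement
        }
    }
```

Each statement now changes src at most once and does not read it anywhere else in the same statement, so the result no longer depends on the order the implementation happens to choose.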

Further note: I cut my teeth on Lattice C as well. I first learned C on both Lattice and Aztec’s compilers for MS-DOS in 1986.

Erratica

Every so often I publish something embarrassingly wrong. Such is the case in my October 2001 column item “The Long And Winding Road.” What’s supremely ironic is that the item in part conflicts with advice I give this month (regarding this in constructors).

Several Diligent Readers have noted my lapse. The most complete commentary comes from a Canadian reader who wishes to remain anonymous. For editorial purposes, I shall call him Dudley Do-Right.

The code to which Dudley objects nets out to:
class X
    {
public:
    X(int i)
        {
        // ...
        }
    X()
        {
        this->X::X(0); // oops...
        }
    };
Dudley has three objections:

The code shouldn’t compile.

Calling constructors for code reuse is bad.

Reconstructing an unconstructed object is bad.

I’ll address each point in turn.

Point #1: Dudley is right — the code won’t compile on a Standard-conforming implementation. I implied the code would, or should, which is just plain wrong. My bad.

I routinely test all of my published code, even for “trivial” examples that I should be able to compile by eye. This example astonishingly compiles on Visual C++ and Metrowerks CodeWarrior. (If I can take any comfort from my mistake, it’s that I am in good company.) Happily and comfortingly, the EDG front end catches the error. That I didn’t catch the error tells me I forgot to run the code through EDG’s translator. Oops.

Point #2: In my original column, I write that “on the surface [constructors] make sense as a code reuse mechanism.” I stand by that statement. On the surface, or in the abstract, they do make sense this way. That the rules of C++ disallow such practice doesn’t negate their attraction or theoretical benefit [1].

Now Dudley may object that code reuse is best achieved through an intermediate non-constructor function. But consider:
class X
    {
public:
    X(int i) : x (i), y (-i)
        {
        }
    X() : x (1), y (-1)
        {
        }
private:
    int const x;
    int const y;
    };
Such code can’t have the common initialization pulled into a single function, since the const members must be initialized in the constructors’ member initializer lists. Other than resorting to a default argument for i — which brings its own problems — there is no easy and reliable way to avoid code duplication here.

Point #3: For non-const members, you can apparently address Point #2’s code reuse problem via:
class X
    {
public:
    X(int i) : x (i), y (-i)
        {
        }
    X()
        {
        new (this) X(1); // HERE
        }
private:
    int x;
    int y;
    };
which mimics what I show in the original column. For POD types such as X (or the original circle), this tactic may well work, especially given that the destructors are trivial and may require no actual generated code. However, the behavior is undefined according to the Standard and is not guaranteed to work reliably or at all.
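For comparison, the intermediate non-constructor function mentioned under Point #2 stays within defined behavior when the members are non-const. A minimal sketch, in which the class name Pair and the helper name init are hypothetical:

```cpp
// Hypothetical class, for illustration only.
class Pair
    {
public:
    explicit Pair(int i)
        {
        init(i);
        }
    Pair()
        {
        init(1);
        }
    int x() const
        {
        return x_;
        }
    int y() const
        {
        return y_;
        }
private:
    // Shared by both constructors. This assigns the members rather
    // than initializing them, so it cannot work for const members.
    void init(int i)
        {
        x_ = i;
        y_ = -i;
        }
    int x_;
    int y_;
    };
```

The trade-off is that the members are assigned after default initialization instead of being initialized directly, which is harmless for built-in types like these but can cost extra work for members of class type.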

Dudley’s response exposes an editorial tension between what is sanctioned in theory, and what is possible and/or predictable in practice. For a particular implementation using this particular code, placement new within a constructor may well work. In my original response, I was addressing Diligent Reader Williamson’s specific context; I wasn’t intentionally recommending a universally guaranteed practice.

Yet given my position among the scribbling class, I need to be more conscious of how far my examples and advice will be taken. In this instance, I didn’t raise the warning about undefined behavior, or mark the solution as specific to the original problem, or to a particular environment and context. Scott Meyers and I talked this over on our way home, and at least in this regard, he’s persuaded me to be a more Diligent Writer.

Note

1. Woo-hoo! As I type, I’m attending the ISO C++ standardization meeting hosted by Microsoft in Redmond. Bjarne Stroustrup is fielding proposed extensions to the core language. A couple of committee members have suggested constructor reuse as a desirable future language feature.

Although Bobby Schmidt makes most of his living as a writer and content strategist for the Microsoft Developer Network (MSDN), he runs only Apple Macintoshes at home. In previous career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via BobbySchmidt@mac.com.