August 2001/Uncaught Exceptions

C/C++ Contributing Editors

Uncaught Exceptions: Nevermind

Bobby Schmidt

Bobby is at his enthusiastic best after deleting a job that wasn’t what it seemed. Deleting an object in C++ isn’t always what it seems, either.

Copyright © 2001 Robert H. Schmidt

“Here we are now, entertain us.” [1]

Remember all that stuff I wrote in this space three months ago? About how I was poised to become content strategist and editor-in-chief for MSDN’s online .NET portal/publication? And how I wanted to jazz up the site with strong editorial personality?

Never mind.

As I wrote at the end of that opening spiel: “All of this assumes I actually get to do what I want to do.” Well, I didn’t get to do what I wanted. I quickly discovered that the job was more tangled and encumbered than I had foreseen.

MSDN and, indeed, many of Microsoft’s online faces are undergoing a dramatic transformation over the next year. These changes affect both the outward content, community, and presentation, and the inward relationships with partners and contributors. While I’m hopeful that the end result will be a demonstrable improvement, the transition involves way too many meetings, PowerPoint decks, content plans, and shifting strategic priorities for my taste.

But the trump card was my writing.

I had intended to spend roughly half my time as editor/strategist, and half as writer. But the first job quickly grew to consume all of my time; my writing languished, and my enthusiasm for the new job waned. At this year’s SD West, a dinner and subsequent lunch with Scott Meyers convinced me: my passion lies in my teaching and writing. I needed to ditch the new job and get back the old, even if that meant leaving MSDN or possibly Microsoft.

So after a six-month hiatus, and no small amount of wrangling with Microsoft, I’m back to writing full time. When you read this, the first part of my next project should appear on MSDN: a C# handbook for C++ programmers [2]. This project is an experiment for me; rather than write a column series or a traditional book, I want to create a more organically hyperlinked writing constellation — living work incrementally published and refined online over many months.

In essence, I’ll take approaches we use in software component design and try applying them to English and technical prose. I know this idea is not new to the world, but it is new for me, and I’m honestly excited by the challenge and opportunity.

Deletion Revisited, Part 1

Q

Dear Bobby,

I am confused about the “Deletion Detector” question in the May 2001 issue. In
X *p = new X;
delete p;
the issue seems to be whether all the destructor code completes and not whether the memory for the X object is freed. I say this because you set the boolean flag at the end of ~X.

My confusion is: How can the destructor code possibly not complete? The only answer I can come up with is that an exception propagates out of the destructor. Such an exception can be caught with:
try
    {
    delete p;
    }
catch (...)
    {
    // do something
    throw; // rethrow
    }
I would appreciate your comments.

Thank you — Joe Hesse

A

You are actually asking about my first proposed solution, which flags object deletion in the object’s destructor. My second solution — the one I enshrined as Listing 1 in that column — flags object deletion in a class-specific operator delete.

The original reader wanted to know when an object had been deleted. In my first solution, I made three assumptions:

“Deletion” means the object’s destructor has been called and its memory freed. This implies that the object was previously allocated dynamically, and that its constructor succeeded.

If the destructor completes, the memory deallocation will succeed — that is, operator delete will never throw. Because of this assumption, in my first solution, I equate successful deletion with successful destruction.

A user may want to test for successful deletion at a distant code point, long after the deletion may have failed.

You are right that the destructor should always succeed unless a thrown exception intervenes. Given that, the object’s state flag seems redundant — if no exception is thrown, the destruction is successful, and (by my implication) the deletion is successful.

But remember: just because an object’s destructor has been called doesn’t mean you can’t try to reference that object, or that the object doesn’t still exist in physical fact. This is true even if the object’s destructor throws:
X *p = new X;
// ...
try
    {
    delete p;
    }
catch (...)
    {
    }
// ... 100 lines later...
p->i = 1; // oops!
delete p is a logical deletion that may or may not induce an immediate physical deletion or even alteration. The flag lets users know a logical deletion has occurred even if no physical one has and lets them know it long after the logical deletion (and potential failure) occurs.
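The difference between a destructor that completes and one that exits by exception can be sketched as follows. This is an illustration only, not the May listing: the class X shown here, its static flag destructor_completed, and the helper delete_and_report are hypothetical names. The flag is static (rather than a member, as in my first solution) so that the check never touches a deleted object. The sketch also assumes a C++11 or later compiler, where a destructor must be declared noexcept(false) before it may throw.

```cpp
#include <cassert>
#include <stdexcept>

class X
    {
public:
    explicit X(bool fail) : fail_(fail)
        {
        }
    ~X() noexcept(false)    // assumption: C++11 or later
        {
        if (fail_)
            throw std::runtime_error("destructor failed");
        destructor_completed = true;    // reached only on normal exit
        }
    static bool destructor_completed;   // hypothetical flag
private:
    bool fail_;
    };

bool X::destructor_completed = false;

// Hypothetical helper: deletes an X and reports whether ~X ran to its end.
bool delete_and_report(bool fail)
    {
    X::destructor_completed = false;
    X *p = new X(fail);
    try
        {
        delete p;
        }
    catch (std::runtime_error const &)
        {
        // swallowed, as in the example above
        }
    return X::destructor_completed;
    }

// Usage: the flag distinguishes the two outcomes long after the fact.
inline void deletion_flag_demo()
    {
    assert(delete_and_report(false));
    assert(!delete_and_report(true));
    }
```

In both calls the delete expression has been evaluated, yet only the first leaves the flag set. That is the information a later test for "successful deletion" would rely on.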

Deletion Revisited, Part 2

Q

Your May 2001 Uncaught Exceptions included a “Deletion Detector” class to determine whether a pointer has been deleted. You said the code “falls within the language rules,” but I think you ran afoul of one. Clause 5.10 (Equality Operators) of the C++ Standard says:

Pointers to objects or functions of the same type [...] can be compared for equality.

Once memory for an object has been freed, a pointer to that memory is no longer a pointer to an object, and comparisons involving that pointer are undefined. I’ve read of platforms where an attempt to compare an invalid pointer to another pointer causes a crash.

I still like your idea (although I might replace your allocations_ vector with a map), and I believe it will be useful to many readers, but, sadly, it isn’t Standard C++. — James M. Stern

A

Good catch! I’m sorry to say that I hadn’t even considered that angle.

For other readers: if you wonder about Diligent Reader Stern’s observation, take a look at Listing 1 from my May column. Ponder in particular this statement:
list::iterator const location =
        std::find(allocations_.begin(),
        allocations_.end(), p);
The STL algorithm std::find evaluates the statement
*it == p
where it iterates over the range of allocations_. It makes this evaluation for each element in allocations_ until one of two conditions occurs:

*it == p evaluates to true

no more elements are left to test

*it has type void * and is the address of some currently-allocated object. p has type void *, but may or may not hold the address of some actual object; indeed, that uncertainty is what motivates this whole enterprise.

If p in fact points to an object, the expression *it == p compares two valid pointers for equality, in conformance with the subclause James cites. But if p points to an object that no longer exists, or holds some other invalid address, then it can’t reliably be compared to another (valid) pointer.

James claims that the resulting behavior is undefined; yet my interpretation of the Standard suggests that the behavior is actually unspecified [3]. If I’m right, then the rumored crashing behavior James mentions shouldn’t occur. But even if I’m right, the behavior isn’t necessarily predictable or even documented.

There is a possible solution. James describes systems that bomb if you compare deleted pointers. However, if those systems let you convert deleted pointers, you may be in luck. From Subclause 5.2.10/4:

A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is implementation-defined. (Note: it is intended to be unsurprising to those who know the addressing structure of the underlying machine.)

On many systems, casting a pointer to an integer results in no change to the underlying bit representation, and no extra generated code. Indeed, the act of casting a pointer to an integer, then back to a pointer, must yield the original value [4].

Given that constraint, I’m assuming that each unique pointer maps to a unique integer. If two such integers compare as unequal, I further assume the original (pre-conversion) pointers were unequal [5].
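A minimal sketch of those two assumptions, for valid pointers only, might look like this. The helper names to_key and round_trips are hypothetical, and the sketch assumes the implementation provides std::uintptr_t (an optional C++11 type); the column itself uses unsigned long.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: map a pointer to an integer key.
// Assumes std::uintptr_t exists on this platform.
inline std::uintptr_t to_key(void const *p)
    {
    return reinterpret_cast<std::uintptr_t>(p);
    }

// True if pointer -> integer -> pointer yields the original value.
inline bool round_trips(void *p)
    {
    void *const q = reinterpret_cast<void *>(to_key(p));
    return q == p;
    }

// Usage: two distinct live objects, checked against both assumptions.
inline void key_demo()
    {
    int a = 0;
    int b = 0;
    assert(round_trips(&a));
    assert(round_trips(&b));
    // Uniqueness is my assumption, not a Standard guarantee.
    assert(to_key(&a) != to_key(&b));
    }
```

The round trip is what the Standard promises; the inequality in key_demo is the part I am taking on faith.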

If your system allows such conversions, and if my assumptions hold, you can replace my original operator new and operator delete with
void *operator new(size_t n)
    {
    void *const p = malloc(n);
    if (p == 0)
        throw std::bad_alloc(); // operator new must not return null
    V const v = reinterpret_cast<V>(p);
    allocations_.push_back(v);
    return p;
    }

void operator delete(void *p)
    {
    V const v = reinterpret_cast<V>(p);
    allocations_.remove(v);
    free(p);
    }
where V is the integral type to which pointers map (typically unsigned or unsigned long). V becomes the new value type for the list allocations_:
private:
    typedef unsigned long V;
    typedef std::list<V> list;
which now holds integers instead of pointers. Similarly, the STL algorithm finds integers instead of pointers:
V const v = reinterpret_cast<V>(p);
list::iterator const location =
        std::find(allocations_.begin(),
        allocations_.end(), v);
This solution avoids James’s problem, by converting the possibly-deleted pointer p to an integer before making the implicit comparison in std::find.

Caveat: if you have a system so restrictive that you can’t even fetch the value of a deleted pointer, then this solution won’t work. In that case, you’re probably stuck with some nasty system-specific hack.
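Pulled together, the integer-based detector might be sketched as below. This is not the May Listing 1: the class name Tracked and the members key_of, is_allocated, and allocations are hypothetical, and std::uintptr_t (assumed available) stands in for unsigned long. The usage function takes the integer key while the pointer is still valid, which sidesteps even the caveat above at the cost of having to plan ahead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <list>
#include <new>

class Tracked
    {
public:
    typedef std::uintptr_t V;   // assumption: platform provides it

    static V key_of(void const *p)
        {
        return reinterpret_cast<V>(p);
        }

    // Compares integers only; no pointer comparison takes place.
    static bool is_allocated(V v)
        {
        return std::find(allocations().begin(),
                allocations().end(), v) != allocations().end();
        }

    void *operator new(std::size_t n)
        {
        void *const p = std::malloc(n);
        if (p == 0)
            throw std::bad_alloc();
        try
            {
            allocations().push_back(key_of(p));
            }
        catch (...)
            {
            std::free(p);   // don't leak if the list can't grow
            throw;
            }
        return p;
        }

    void operator delete(void *p)
        {
        if (p == 0)
            return;
        allocations().remove(key_of(p));
        std::free(p);
        }

private:
    static std::list<V> &allocations()
        {
        static std::list<V> list;
        return list;
        }
    };

// Usage: returns true if the detector sees the object come and go.
inline bool detector_demo()
    {
    Tracked *const p = new Tracked;
    Tracked::V const key = Tracked::key_of(p); // taken while p is valid
    assert(Tracked::is_allocated(key));
    delete p;
    return !Tracked::is_allocated(key);
    }
```

Like the original, this sketch cannot tell a deleted object from a new one that happens to reuse the same address.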

Ins and Outs, Part 1

Q

Hi Bobby,

I have a “simple” question that so far has avoided a solution: how do I forward declare a nested class?

Consider
Outer::Inner *p;
As p is only a pointer to an object, I shouldn’t have to include the relevant header files for Outer and Inner. Instead I can merely forward declare those classes (or so one would think). But how do I write the forward declaration for Outer::Inner?

Here is how not to do it:
#include "Outer.h" // No, I just want a
                   // fwd declaration

/*
None of these work with MSVC++ 6.0

class Outer::Inner;

class Outer;
class Outer::Inner;

namespace Outer
{
    class Inner;
}
*/
The nested classes that provoked this question are generated by a third-party tool and involve some pretty hefty header files. Thanks. — Billy O’Mahony

A

You can’t do what you want in Standard C++. Assuming Outer’s definition is
class Outer
    {
public:
    class Inner
        {
        // ...Inner members
        };
    // ... other Outer members
    };
the best you can do is
class Outer
    {
public:
    class Inner;
    // ... other Outer members
    };
which defines and declares Outer, but only declares Outer::Inner.

A class declaration introduces a class’s name, but not the class’s member names. To introduce a class’s member names, you need to define — and not just declare — that class. From Subclause 9.2/1:

The member-specification in a class definition declares the full set of members of the class; no member can be added elsewhere.

You want a class declaration (Outer) that somehow introduces a member name (Outer::Inner), and that can’t happen. If the language worked the way you want, the declaration of Outer::Inner would effectively add that member (or at least its name) to the class Outer before Outer is actually defined.
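For readers who do control Outer's header, the "best you can do" version can be sketched like this. The functions make_inner, inner_value, and destroy_inner are hypothetical, as are Inner's members; the point is only that clients can pass around Outer::Inner pointers while Inner is still incomplete. It does not rescue Billy, whose headers come from a third-party tool.

```cpp
#include <cassert>

// Outer is defined; Inner is only declared.
class Outer
    {
public:
    class Inner;    // member name introduced, type incomplete
    };

// Clients can traffic in pointers to the incomplete type.
Outer::Inner *make_inner(int value);
int inner_value(Outer::Inner const *p);
void destroy_inner(Outer::Inner *p);

// Elsewhere (say, in an implementation file) Inner is defined.
class Outer::Inner
    {
public:
    explicit Inner(int value) : value_(value)
        {
        }
    int value() const
        {
        return value_;
        }
private:
    int value_;
    };

Outer::Inner *make_inner(int value)
    {
    return new Outer::Inner(value);
    }

int inner_value(Outer::Inner const *p)
    {
    return p->value();
    }

void destroy_inner(Outer::Inner *p)
    {
    delete p;
    }

// Usage: everything before the definition of Inner would compile
// in a client that never sees that definition.
inline void nested_declaration_demo()
    {
    Outer::Inner *const p = make_inner(42);
    assert(inner_value(p) == 42);
    destroy_inner(p);
    }
```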

Ins and Outs, Part 2

Q

My colleagues and I have a problem with nested classes and templates. We stumbled across it when trying to compile automatically generated code.

We have a class that contains a nested class and a template-based vector of that nested class. We provide an accessor function to the vector:
template<class other>
class vct
    {
    // ...
    };

class outer
    {
public:
    class inner
        {
        // ...
        };
    vct<inner> data;
    vct<inner> &GetData();
    };
The function GetData is defined inline outside of the class declaration but within the same header file. Our code-generation tool (Rational Rose) produces this definition for GetData:
inline vct<inner> &outer::GetData()
    {
    return data;
    }
This function compiles with Sun WorkShop Compiler 4.2 patch 104631-07. However, it doesn’t compile with MS Visual C++ 6.0 SP4 and HP ANSI C++ B3910B A.01.15; they require the scope resolution of the template class parameter:
inline vct<outer::inner> &outer::GetData()
    {
    return data;
    }
If we set up Rational Rose to generate the latter version, it also includes the full scope resolution within the class declaration, i.e.:
vct<outer::inner> data;
vct<outer::inner> &GetData();
which doesn’t compile with MS Visual C++. Of course, we can edit the generated code by hand and everything works, but we are also interested in what exactly happens here and what would be correct according to the C++ Standard.

Thanks in advance. — Marco Hahn

A

Listing 1 shows the “correct” version. The parts surrounded by /* */ are optional; if you uncomment those parts, the code is still correct.

When a Standard-conforming compiler sees a template argument name (such as inner), it tries to resolve that name in a Standard-specified lookup order. The complete rules governing that order are a tangle [6], and I won’t explore them here. Put simply, the order is from the “inside out,” meaning more locally-scoped names are considered before more globally-scoped ones.

In your example, when the compiler parses the name inner as part of the GetData declaration:
class outer
    {
    // ...
    vct<inner> &GetData();
    };
the innermost scope is within class outer. The compiler is therefore allowed to look for inner as a member of outer. When the compiler later sees inner in the GetData definition:
inline vct<inner> &outer::GetData()
//         ^ parse here
the innermost scope is global. Names in outer and other non-global scopes are not considered.

To make the compiler happy, you must change the name to something it can resolve at global scope:
inline vct<outer::inner> &outer::GetData()
//         ^ parse here
Using the same lookup rules as before, the compiler searches for outer in global scope. Having found such an outer, it then looks for an accessible type member inner within the scope of outer. Mission accomplished.

Now for a couple of twists.

If you change GetData to include a function argument:
class outer
    {
    // ...
    vct<inner> &GetData(inner);
    };

inline vct<outer::inner> &outer::GetData(inner)
the program is still correct. This presents an apparent inconsistency. In the GetData definition
//         v    parse point #1
inline vct<inner> &outer::GetData(inner)
//              parse point #2    ^
the function argument inner appears successfully without qualification (parse point #2) — yet if you try the same unqualified name as a template argument (parse point #1), the compilation fails.

At point #1, the compiler doesn’t yet know that it’s parsing the declaration of an outer member and doesn’t consider the scope of outer for name lookup. But once the compiler gets to point #2, it knows that it is indeed parsing an outer member and looks up the function argument inner within outer’s scope [7].
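A compilable sketch of both parse points follows. It is not one of the column's listings; the members of vct (add, size, size_) and the parameter name item are hypothetical, added only so the example does something observable.

```cpp
#include <cassert>

template<class other>
class vct
    {
public:
    vct() : size_(0)
        {
        }
    void add(other const &)     // hypothetical: just counts elements
        {
        ++size_;
        }
    int size() const
        {
        return size_;
        }
private:
    int size_;
    };

class outer
    {
public:
    class inner
        {
        };
    vct<inner> data;
    vct<inner> &GetData(inner);
    };

// Parse point #1 (return type) needs outer::inner, because the compiler
// has not yet seen outer::GetData. Parse point #2 (parameter) comes
// after it, so plain inner is looked up in outer's scope.
inline vct<outer::inner> &outer::GetData(inner item)
    {
    data.add(item);
    return data;
    }

// Usage
inline void parse_point_demo()
    {
    outer o;
    assert(o.GetData(outer::inner()).size() == 1);
    }
```

Replacing vct<outer::inner> with vct<inner> in the out-of-class definition should bring back the error Marco saw.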

The second twist comes when you define GetData within the definition of outer, as I show in Listing 2.

When the definition physically appeared in global scope, the template argument had to be looked up in global scope. But now that the definition physically appears within outer, the template argument can be looked up in the scope of outer. The two definition forms are logically equivalent, regardless of their physical location; yet that location affects how the definition’s names are resolved.
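The in-class form can be sketched as below. This is my own illustration rather than Listing 2. The second accessor, GetDataAgain, is hypothetical and assumes a C++11 or later compiler, which the original column predates: a trailing return type appears after the function's qualified name, so its names are looked up in the scope of outer even though the definition sits at global scope.

```cpp
#include <cassert>

template<class other>
class vct
    {
    };

class outer
    {
public:
    class inner
        {
        };
    vct<inner> data;

    // Defined inside outer: unqualified inner is found in outer's scope.
    vct<inner> &GetData()
        {
        return data;
        }

    // Hypothetical second accessor, defined outside the class below.
    auto GetDataAgain() -> vct<inner> &;
    };

// Assumption: C++11 or later. The return type follows outer::GetDataAgain,
// so unqualified inner is resolved as a member of outer.
inline auto outer::GetDataAgain() -> vct<inner> &
    {
    return data;
    }

// Usage: both accessors refer to the same member.
inline void in_class_demo()
    {
    outer o;
    assert(&o.GetData() == &o.data);
    assert(&o.GetDataAgain() == &o.data);
    }
```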

Many of us grow up thinking that where we physically define functions is largely a matter of taste governed by file-organization concerns. But as this example shows, such thinking is too simplistic, especially when templates are involved [8].

Notes

[1] Nirvana. “Smells Like Teen Spirit,” Nevermind.

[2] My guiding inspiration is the astronomy classic Burnham’s Celestial Handbook: An Observer’s Guide to the Universe Beyond the Solar System (Dover, 1983).

[3] C++ Standard Subclause 5.9/2, indirectly via 5.10/1.

[4] Subclause 5.2.10/5.

[5] As far as I can tell, the Standard does not guarantee these assumptions of uniqueness. However, I have trouble imagining a system that could distill two pointers down to the same integer, yet somehow be able to rehydrate that single integer back to the original two pointers.

[6] And as Scott Meyers would remind us, the tangle is particularly thick because the name lookup rules for templates and functions differ subtly but significantly. Vive la différence!

[7] This is a consequence of Subclause 3.4.1/8 as amplified in Notes 29 and 30.

[8] This is true even for non-template types. For example, if GetData returned inner rather than vct<inner>, the lookup problem would be the same. However, I find that the problem is greatly exacerbated when templates enter the scene.

Although Bobby Schmidt makes most of his living as a writer and content strategist for the Microsoft Developer Network (MSDN), he runs only Apple Macintoshes at home. In previous career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via BobbySchmidt@mac.com.