March 2002/Uncaught Exceptions

C/C++ Contributing Editors

Uncaught Exceptions: Space-Time Discontinuum

Bobby Schmidt

“Namelessspace?” How about Lost in Space? You decide, Gentle Reader.

Copyright © 2002 Robert H. Schmidt

I recently received my most breathtakingly off-topic reader question yet. Here it is, with the lightest of editorial dusting:

I built a two-channel 12-bit A/D card for connection to a parallel port. I tried to make a communication program in DOS for 12-bit communication from the card to a PC. I used 8 bits from the DATA register and 4 bits from (ACK, BUSY, PAPER, ONLINE). For handshaking, I used ERROR on the STATUS port. Now I don’t know how to get input from the DATA register although I tried to use EPP and ECP for bi-directional transfer. Is it possible to make data transfer in this way or should I use standard 8-bit communication?

Several thoughts intrude:

Was this question really meant for me? Yes, I could write the questioner back to find out, but I’m almost afraid to.

I use Macs. What’s a “parallel port”? And is it compatible with a perpendicular port?

Does anybody really do this kind of programming anymore? I’m now thinking that the question fell from a rupture in the space-time continuum and was actually targeting Dr. Dobb’s Journal ca. 1984.

Didn’t ACK go out with Bill the Cat?

Still...

Okay, I admit it: resistance is futile. This topic is so completely off course — light years off course — that I’m now strangely charmed and intrigued. So if one of you has a genuine answer to this reader’s question, please send it to me. If I get something that sounds right, I’ll go ahead and publish it. Even if it’s wrong but entertaining, I’ll probably publish it.

(God help me, but this gives me an idea for a regular column feature: The Spatial and/or Temporal Anomaly, a real question apparently meant for another place, time, or both. Chuck, you’d better intervene now, or I’m running with this!) [Get a grip, Bobby. Consider the exception “caught.” —The Intervener]

Sweet Home Translation Unit

Q

In your last Erratica item (January 2001), you quoted George Kaplan talking about ODR. I’ve seen mention of ODR before, but I’ve never known what it means. I’m hoping you can enlighten me.

I’ve been afraid to ask on the newsgroups. People there can be harsh, and I don’t want to be dismissed as a newbie wimp.

Thanks — R.O. Thornhill

A

At the risk of appearing Kelly-Bootlesque, I shall paraphrase everyone’s favorite X Files character, FBI Assistant Deputy Director Skynyrd:

Oo-oo, that ODoR.
Can’t you smell that ODoR?
Oo-oo, that ODoR.
The smell of defs around you.

ODR is a TLA for One Definition Rule. Given that name, you might think the rule requires everything in a program to be defined exactly once. The truth, as unraveled in the C++ Standard, is somewhat more complex:

Some things can be left undefined altogether.

Some things must be defined exactly once in a given program.

Some things must be defined exactly once in a given translation unit.

Some things can be defined in multiple translation units.

Some things must be defined in multiple translation units.

Reader Kaplan brushes up against the final two points during his spiel about namespaces and headers.

When you define the same entity in multiple translation units, each definition must have the same token sequence. Further, the resulting name resolution and binding must yield the same net meaning in all definitions. If the ODR did not allow such multiple definitions, you couldn’t #include most header files.

Once you inject using clauses into headers, the meanings of other definitions in the headers can vary, in violation of the ODR. This is particularly dangerous when headers make assumptions about or requirements on their context.

Consider the header trace_max.h:
#include <iostream>

extern inline void trace_max(int a, int b)
    {
    std::cout << max(a, b) << std::endl;
    }
which traces the maximum of two arbitrary integers. As an inline function, trace_max must be included and thus defined in each translation unit that uses it. And because trace_max is non-static, all definitions refer to a common function and must all be “the same” in accordance with the ODR.

That “sameness” can hinge on the meaning of max in the context of trace_max. For example, in the sequence:
int max(int a, long b)
    {
    return a > b ? a : b;
    }

#include "trace_max.h"
the max in trace_max resolves to the genuine global function ::max. But in the alternative sequence:
#include <algorithm>
using std::max;

#include "trace_max.h"
the max in trace_max resolves to the specialization std::max<int>, which — thanks to the using declaration — can be referenced unqualified as plain max.

You probably see the punch line: the two definitions of max differ, which means the two definitions of the common inline trace_max differ across translation units [1]. This difference violates the ODR, the program is ill-formed, and much hilarity ensues. Had the second sequence not injected std::max into the global namespace, the violation would never have occurred, for the compiler would have caught the missing max definition while parsing trace_max.

If you insist on contravening the ODR, you need not conjure headers and using clauses. These two simple source files will do the trick:
//
// file1.cpp
//
class X
    {
    int i;
    };

//
// file2.cpp
//
class X
    {
private:
    int i;
    };
There are a couple of lessons you should draw from this.

First, even though the two X definitions have the same net behavior, their token sequences are different, and that’s enough to violate the ODR. (Conjecture: the committee chose this conservative criterion because it’s easier to specify and validate.)

Second, class names have external linkage. In this case, all global-scope class X definitions must be the same across translation units — even if the definitions are logically distinct and just happen to share the same name! This may well be news to you and many others. Indeed, I suspect most of us reuse pet names for “local” classes scattered across our projects, believing the names are mutually ignorant and isolated as long as they aren’t in shared headers. Oops.

If you want to avoid this whole mess and have a class act as if it’s internally linked, define it in an unnamed namespace:
//
// file1.cpp
//
class X
    {
    int i;
    };

//
// file2.cpp
//
namespace
    {
    class X
        {
    private:
        int i;
        };
    }
Now the two X definitions refer to distinct things.

(Note to future historians: I am coining a new term. From now on, I shall refer to unnamed namespaces as “namelesspaces.” You read it here first.)

ODR-related bugs can be especially difficult to ferret out. Compilers and linkers typically miss many ODR violations and often allow offending programs to (apparently) work. This is added incentive to correctly package common externally-linked definitions in headers, hide non-common definitions in namespaces, and not mingle the two worlds via using clauses.
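To make the packaging advice concrete, here is a small sketch of a source file that hides a helper class in an unnamed namespace and exposes only an ordinary function. The file name counter.cpp and the names Counter and next_id are hypothetical, invented for this illustration.

```cpp
// counter.cpp (hypothetical file name)

namespace
    {
    // Visible only within this translation unit. Another source file
    // may define its own, different Counter without conflict.
    class Counter
        {
    public:
        Counter() : value_(0) {}
        int next() { return ++value_; }
    private:
        int value_;
        };
    }

// Externally linked function: the only name other files need to see.
int next_id()
    {
    static Counter counter;   // lives for the duration of the program
    return counter.next();
    }
```

Other translation units would declare int next_id(); (ideally via a shared header) and never learn that Counter exists.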

This Month’s Obligatory Scott Meyers Item

Q

“Mrs. Meyers” is my mother, not my wife. I’m afraid your stock is down sharply after I showed Nancy that. It’ll take more than lame music videos to recover from that faux pas, and I’m looking forward to seeing how you approach the problem :-) — Mr. Meyers.

A

Scott alludes here to the preamble leading my September 2001 column:

If you overlook a shameless play for Mrs. Meyers, Bobby serves up precious pointers on, well, pointers (smart ones anyway), and on the proper use of namespaces.

In my defense, I don’t write these initial blurbs. I don’t even see them before the magazine comes out. So if Nancy ~Meyers wants a perpetrator, she must look to the collective editorship of CUJ [2]. [I was unaware of the naming convention in Persephone’s household. —cda]

In reference to “lame music videos”: both Nancy and I share a fondness for early ’80s music videos, especially those involving European synth-pop [3]. In one of my last visits to Nancy, I brought her a gift of many such videos, thereby earning beaucoup points. Scott’s simply jealous, as his point count at home hovers near zero.

rotareti_tsnoc

Q

Hi Bobby,

The following code doesn’t compile with either VC++ 6.0 or gcc 2.95:
#include <vector>

void f(std::vector<int> &v)
    {
    std::vector<int>::
            const_reverse_iterator it
            = v.rbegin();
    while (it != v.rend())
        ++it;
    }
VC++ complains about assigning reverse_iterator to const_reverse_iterator. gcc complains about comparison.

My understanding of the C++ Standard is that, in this case, the versions of rbegin and rend that return non-const iterators should be called. But as iterators are convertible to const iterators, assignment and comparison should be OK. The same code with iterator, begin, and end substituted for reverse_iterator, rbegin, and rend compiles fine with both compilers. What am I missing? Isn’t it legal C++ code?

Thanks in advance — Ilya Levinson

A

I can hear Mr. Meyers in the back of the room going “Ooo, ooo, pick me, Mr. Kotter!”: Item 26 of his Effective STL involves this topic. I invite you to peruse that item for an explanation more complete than what I’ll give here.

We are accustomed to pointers to non-const things converting to pointers to const things:
X *p;
X const *cp;
...
cp = p;      // OK
if (p == cp) // OK
Such conversions are “safe” in that pointers to const cause less damage than pointers to non-const.

Since STL iterators are generally modeled on pointers, we should expect that iterators will follow the pointer pattern of safety and implicit conversion:
X::iterator p;
X::const_iterator cp;
...
cp = p;      // OK?
if (p == cp) // OK?
Yea verily, the C++ Standard requires that an STL container’s iterator type be convertible to that container’s const_iterator type. As Scott discusses, not all libraries honor this requirement, although apparently yours do.
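For a concrete check of the forward-iterator case, here is a small sketch using std::vector<int>. The function name iterator_converts is made up for illustration, and whether the code compiles on a given library depends on that library honoring the conversion requirement.

```cpp
#include <vector>

// Returns true when the conversion and the mixed comparison both
// behave as the pointer analogy suggests.
bool iterator_converts(std::vector<int> &v)
    {
    std::vector<int>::iterator p = v.begin();
    std::vector<int>::const_iterator cp = p;   // iterator to const_iterator
    return p == cp;                            // non-const iterator first
    }
```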

Your problem is specifically about reverse iterators. As there are no reverse pointers (such that ++ moves backward), we can’t appeal to the behavior of pointers for guidance. At the same time, it seems reasonable to expect the conversion relationship between regular iterators to also exist between reverse iterators.

Reasonable, but apparently not required, for I can find no passage in the C++ Standard compelling a conversion from reverse_iterator to const_reverse_iterator. I have to believe the intent is there; I just can’t find the wording. So based on my strict reading of the Standard, the behavior you are getting is Standard-conformant, if inconvenient. (If any Diligent Reader knows a place in the Standard requiring such a conversion, please write me.)

Scott suggests one possible fix: always compare mixed const/non-const iterators so that the iterator to non-const comes first. In your case, that means you’d rewrite the comparison as:
while (v.rend() != it)
While I can’t guarantee this will fix your problem, it’s worth a shot.

Another possibility: initialize a second const_reverse_iterator and then compare the two const_reverse_iterators:
std::vector<int>::const_reverse_iterator it
        = v.rbegin();
std::vector<int>::const_reverse_iterator rend
        = v.rend();
while (it != rend)
    ++it;
Otherwise, you may have to either go with reverse_iterators instead of const_reverse_iterators or give up on reverse iterators altogether. (In his Item 26, Scott recommends you use plain old non-const non-reverse iterators anyway, to avoid these and other problems.)
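One more variation, offered as a sketch rather than as anything from Scott’s book: bind a const reference to the vector, so that rbegin and rend themselves return const_reverse_iterator and neither a conversion nor a mixed comparison arises. The function name reversed_copy is hypothetical.

```cpp
#include <vector>

// Copies v back to front using only const_reverse_iterators.
std::vector<int> reversed_copy(std::vector<int> &v)
    {
    // Through a const reference, rbegin() and rend() select the
    // overloads that return const_reverse_iterator.
    std::vector<int> const &cv = v;

    std::vector<int> result;
    for (std::vector<int>::const_reverse_iterator it = cv.rbegin();
            it != cv.rend(); ++it)
        result.push_back(*it);
    return result;
    }
```

Because both iterators in the comparison have the same type, this form sidesteps the question of which conversions a particular library supports.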

Parent(hetic)al Rights

Q

In a piece of code I recently wrote:
a = b - c - d;
I intended left-to-right ordering, so that the code would act as if I wrote:
a = (b - c) - d;
However, the compiler produced code equivalent to:
a = b - (c - d);
My compiler support person at Green Hills says I can’t rely on the order. (It mildly surprised me that the spec would leave this open.) He then astounded me by claiming that even:
a = (b - c) - d;
does not guarantee evaluation order. Instead, I have to do
tmp = b - c;
a = tmp - d;
Is he right? The only point in having parentheses is to define the way things should be combined. I know that the time evaluation order is undefined; I think he is mixed up with this issue.

Your expert advice will be helpful — and if you can quote the appropriate K&R I will pass it on to Green Hills. (Unfortunately I don’t have a copy — isn’t there an “online” definition of the language?)

Thanks — Yonatan Lehman

A

You really have three questions here:

What is the associativity of a = b - c - d?

What is the evaluation order of a = b - c - d?

Where can you get the language standards online?

You don’t say in your mail, but I’m assuming you are using built-in types (in either C or C++) and no macros.

Question 1: The associativity of:
a = b - c - d;
is fixed by each language’s standard [4]. As the additive operators binary + and binary - have left-to-right associativity, your original expression should behave as if written:
a = (b - c) - d;
I’d be fairly surprised if the Green Hills compiler gets this wrong. Are you sure you’re interpreting the code and results correctly?
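If you want to test the compiler’s grouping yourself, a sketch like the following makes the difference visible. The function names are made up for illustration, and the sample values are chosen so that the two groupings give different answers.

```cpp
// With b = 10, c = 4, d = 3:
//   (b - c) - d  is  6 - 3  = 3
//   b - (c - d)  is  10 - 1 = 9
int grouped_by_default(int b, int c, int d)
    {
    return b - c - d;      // should match grouped_left
    }

int grouped_left(int b, int c, int d)
    {
    return (b - c) - d;
    }

int grouped_right(int b, int c, int d)
    {
    return b - (c - d);
    }
```

A conforming compiler should give the same result for grouped_by_default and grouped_left on any inputs that don’t overflow.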

Question 2: The Green Hills support person is correct. In the expression:
a = b - c - d;
you don’t know which of the subexpressions b - c or d will evaluate first — the actual ordering is unspecified.

If you aren’t generating side effects in b - c - d, and if you avoid overflow and underflow, the evaluation order shouldn’t be significant. If the order does matter, the support person’s suggestion will work, by interposing a sequence point between the subexpressions:
tmp = b - c;
//         ^
// sequence point here ensures
// b - c evaluates before d
//
a = tmp - d;
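The ordering only becomes observable when the operands have side effects. Here is a sketch under that assumption: the operands are replaced with hypothetical functions that record when they run, and call_log is an illustrative global, not anything from the reader’s code.

```cpp
#include <string>

std::string call_log;   // records the order in which operands run

int b_minus_c() { call_log += "L"; return 6; }
int d()         { call_log += "R"; return 3; }

int unordered()
    {
    // The two calls may run in either order; only the result is fixed.
    return b_minus_c() - d();
    }

int ordered()
    {
    int tmp = b_minus_c();   // finishes before the next statement starts
    return tmp - d();
    }
```

Both functions return 3. After ordered, call_log holds the left operand’s mark followed by the right’s; after unordered, the two marks may appear in either order.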
Question 3: The C and C++ Standards are not legitimately available for free; you need to pay ISO for copies. There was a time you could find the drafts online. For all I know, you could still stumble into them via a web search.

While I’m largely sticking to my new policy of not quoting or attributing the Standard, I’ll make an exception here, since you specifically asked:

The C99 Standard covers expressions generally in Subclause 6.5 and additive operators specifically in Subclause 6.5.6. There is also a related example in Subclause 5.1.2.3/14.

The C++ Standard covers expressions generally in Clause 5 and additive operators specifically in Subclause 5.7.

If you insist on referencing K&R, know that I have only the early (1978) pre-Standard version. There you can find what you want on page 48, in the section numbered 2.12 and titled “Precedence and Order of Evaluation.”

Notes

[1] There’s an excellent chance the implementation of std::max<int> is the same as that of the stand-alone ::max — after all, there are only so many decent ways to implement a max function. But even then, the parameter lists won’t match: the specialization has two ints, while the stand-alone function has an int and a long.

[2] For those keeping score, Scott’s bride goes by her pre-marriage surname Urbano. So does their dog Persephone. This is actually high praise for Nancy, since Persephone is fairly deified in their home. Come to think of it, both Urbanos are. Maybe Scott should change his name, for at the moment he’s really (get ready for it) Suburbano.

[3] I was a DJ at my college’s FM radio station from 1979 to 1985. European synth-pop was a staple of our new-music rotation and thus infiltrated my brain, as well as my subsequent music collection.

[4] At least implicitly. As neither Standard explicitly lists operator precedence and associativity, you must infer them from the language grammars.

Although Bobby Schmidt makes most of his living as a writer and content strategist for the Microsoft Developer Network (MSDN), he runs only Apple Macintoshes at home. In previous career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via BobbySchmidt@mac.com.