February 1999/Uncaught Exceptions

C/C++ Contributing Editors

Uncaught Exceptions: September Song

Bobby Schmidt

Bobby deals with time warps, from some news that's old hat to old songs that are news to some.

Copyright © 1999 Robert H. Schmidt

To ask Bobby a question about C or C++, send email to cujqa@mfi.com with the subject line "Questions and Answers," or write to Bobby Schmidt, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

When I worked at Microsoft, what was brand new to the customers was old to technical support (since they had endured the beta), and older still to the developers, who were already coding up the next version. This made for some cognitive dissonance: when asked about a "new" bug in the (customers') "current" version, developers would often say it was an old bug already fixed in the (developers') "current" version. Besides, the developers often had long forgotten how the customers' code worked, since to them it was ancient history.

Users think in terms of discrete product versions like 2.0, which to them is very distinct from last year's version 1.32 and stays current until the next version comes out. But to the product's developers, version 2.0 is really a marketing label for internal version 512, which is one point in the continuum of daily builds. By the time a customer asks a question about version 2.0 (a.k.a. daily build 512), the developer is working on daily build 784.

I'm just like those Microsoft developers. Most of this month's questions arise from last September's column — which for me is really last June's column, since that's when I first wrote it. Thus, I'm responding to a column written about seven months before you are reading this. As I've said before, time is not linear in the column-writing world.

(If you're counting, this column is monthly build 38 for me, and version 2.0 release 3 for you.)

Horseshoes and Hand Grenades

Q

I've got a question about a topic covered in your September 1998 column in C/C++ Users Journal. I apologize if this seems late, but the office copy just now was routed to me...

Okay, so the template function LengthOf<> is pretty cool, but I've run into some situations where it just plain doesn't work like the macro version.

The template function implies a run-time evaluation, so I can't use it to declare arrays of a given size on the stack. For example:
// Example #1
//
void FooBar ()
{
    int array1[10];
    bool array2[LengthOf(array1)];
    // Do stuff...
}
The second situation where the template function fails is when I try to use it on a locally-declared (e.g., inside a function) type. I can work around this problem by moving the declaration of the type outside the function, but that's not very satisfying:
// Example #2
//
void MoreFooBar ()
{
    struct MyType
    {
        int one;
        bool two;
    };
 
    MyType array[10];
 
    for (int i = 0;
         i < LengthOf(array); ++i)
        {
        // do stuff...
        }
}
I tried to devise a workable solution that would always allow compile-time evaluation, but I couldn't quite come up with one. The idea that I have is to somehow convert the dimension to an enumerated value, which would allow me to access the value directly at compile time (I wouldn't have to evaluate a function). I tried things along the following lines:
template <class T, size_t dim>
class DimensionOf
{
    enum { dimension = dim };
};
But I could never get anything along these lines to work, because I have to explicitly name the template arguments to access the enumeration. So, are there any variations that also work under the above scenarios? (Apologies if this has been covered in a recent issue; I just haven't seen it yet.) — Matt Hainje

A

What a delicious irony: after a column prologue discussing Microsoft developers, my first question comes from... a Microsoft developer!

For those who tuned in late, the template in question is
template<class T, size_t N>
inline size_t lengthof(T (&)[N])
    {
    return N;
    }
As an example,
int main()
    {
    int a[10];
    cout << lengthof(a) << endl;
    }
when run should yield
10
I offered lengthof as an alternative to the familiar C idiom
#define LENGTHOF(a) (sizeof(a) / sizeof(*a))
in large part because LENGTHOF can't tell real arrays from pointers:
int a[10];
int *p = a;
cout << LENGTHOF(a); // yields 10
cout << LENGTHOF(p); // typically yields 1 or 2
As you have unfortunately discovered, there are places where lengthof does not work:

Example #1 fails because an array dimension requires a constant expression. Even though the value yielded by lengthof might be resolved at compile time, it is still not considered a constant expression.

Example #2 fails because local types (like MyType) cannot be template parameters. In practice, this should be a minor limitation — in my experience, almost nobody uses local struct/class types.

You can fix Example #1 with the variation
int *a2 = new int[lengthof(a1)]; //OK
You must of course remember to use
delete [] a2;
once you are done with a2. To eliminate this bookkeeping chore, write an auto_ptr-like class that calls delete [] in its destructor, allowing
AutoArrayPtr<int>
    a2(new int[lengthof(a1)]);
Now the allocated array will self-delete when a2 goes out of scope.
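A minimal sketch of such a class follows. Only the name AutoArrayPtr comes from the text above. The members, the non-copyable design, and the helper function sum_of_copy are assumptions made for illustration, not a definitive implementation.

```cpp
#include <cassert>
#include <cstddef>

// The lengthof template from the column, repeated so this sketch stands alone.
template<class T, std::size_t N>
inline std::size_t lengthof(T (&)[N])
    {
    return N;
    }

// Hypothetical sketch: owns an array allocated with new [] and
// releases it with delete [] when the owner goes out of scope.
template<class T>
class AutoArrayPtr
    {
public:
    explicit AutoArrayPtr(T *p = 0) : p_(p)
        {
        }
    ~AutoArrayPtr()
        {
        delete [] p_;
        }
    T &operator[](std::size_t i) const
        {
        assert(p_ != 0);
        return p_[i];
        }
private:
    // Declared but not defined: copying would lead to a double delete.
    AutoArrayPtr(AutoArrayPtr const &);
    AutoArrayPtr &operator=(AutoArrayPtr const &);
    T *p_;
    };

// Hypothetical usage: copies a1 into a heap array of the same length
// and returns the sum of the copied elements.
int sum_of_copy()
    {
    int a1[] = {1, 2, 3, 4};
    AutoArrayPtr<int> a2(new int[lengthof(a1)]);
    int sum = 0;
    for (std::size_t i = 0; i < lengthof(a1); ++i)
        {
        a2[i] = a1[i];
        sum += a2[i];
        }
    return sum; // a2's destructor calls delete [] on the way out
    }
```

Unlike auto_ptr, this sketch simply forbids copying rather than transferring ownership, which keeps it short at the cost of some flexibility.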

To fix both examples, you need the macro solution:
MyType a1[10];
MyType a2[LENGTHOF(a1)]; // OK
To make this solution more palatable, give the macro a little intelligence:
#define LENGTHOF(a) (sizeof(a) == sizeof(&*(a)) ? 0 : sizeof(a) / sizeof(*(a)))
If the argument a has the same size as a pointer to *a, a is probably itself a pointer and not an array. I say "probably" because, in the rare event that a is an array that happens to be the same size as a pointer, the macro will yield a false positive. This won't lead to run-time bugs; rather, it means the macro is slightly conservative, flagging code that is actually okay.

In any event, if the macro believes a is a pointer, it yields zero — an expression the compiler diagnoses as an invalid array size. I pick zero here arbitrarily; you can use any expression that will fail to compile.

Otherwise, if a and &*a have different sizes, a is an array and the macro yields a's length as desired [1].

Examples:
int a1[100];
int a2[1];
int *p;
int b1[LENGTHOF(a1)]; // OK
int b2[LENGTHOF(a2)]; // maybe error
int b3[LENGTHOF(p)]; // error, but...
int n = LENGTHOF(p); // ...OK, sadly
I have not found a simple way to make LENGTHOF reject all pointers in all contexts. If any Diligent Reader has a superior solution — or can fix the original lengthof template to better meet Matt's requirements — please let me know.
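One candidate answer, offered as a sketch and not as the column's own solution, combines the template with sizeof. A function template that is declared but never defined returns a reference to an array of N chars, and a macro takes sizeof that return type. Because sizeof yields a constant expression, the result can serve as an array dimension, and because template argument deduction fails for pointers, a pointer argument is rejected in every context. The names lengthof_helper, LENGTHOF_CT, and second_array_length are hypothetical. Note the assumption about local types: C++98 still rejects them as template arguments, so Example #2 compiles with this approach only under later language revisions (C++11 onward).

```cpp
#include <cassert>
#include <cstddef>

// Declared but never defined: the function is named only inside sizeof,
// so it is never called. Its return type is "reference to array of
// N char", whose size is exactly N.
template<class T, std::size_t N>
char (&lengthof_helper(T (&)[N]))[N];

// Hypothetical name, chosen to avoid clashing with the column's LENGTHOF.
#define LENGTHOF_CT(a) (sizeof(lengthof_helper(a)))

// Hypothetical demonstration: dimensions one array from another.
std::size_t second_array_length()
    {
    int a1[10];
    bool a2[LENGTHOF_CT(a1)]; // OK: the dimension is a constant expression
    // int *p = a1;
    // int n = LENGTHOF_CT(p); // would not compile: p is not an array
    assert(LENGTHOF_CT(a2) == LENGTHOF_CT(a1));
    return LENGTHOF_CT(a2);
    }
```

The macro wrapper is still needed because the caller must write the sizeof somewhere. The template only supplies a type whose size encodes the array length.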

The Truth is Out There

Q

In the latest C/C++ Users Journal I just received you had a question as to why the this pointer is an rvalue. I think one reason you can't do
T *const *p = &this;
or similar is that it's up to the compiler where the this pointer actually resides.

For instance if the this pointer was in a register during a member function call then there would be no address available (although I think some processors map the register set onto the first few addresses in memory). — Scot Shinderman

A

While yours is certainly an argument for making this an rvalue, I'm not convinced it's a compelling argument.

Remember, this had been an lvalue for many years. Presumably implementations that kept this in a register got around the conflict by copying this to addressable storage when needed. Because this was considered a const pointer, the addressable copy couldn't be changed, and the two versions — one register original, one addressable copy — would stay in sync.

Until I see a rationale for the C++ Standard, or otherwise hear from committee members who know the truth, I'll remain agnostic about the real reason behind this becoming an rvalue.

What's the Vector, Victor?

Q

I have always enjoyed your column and find the insights it brings to me on C++ are often remarkable. However, I did have trouble understanding the part of your column in September under the heading "What If..." and even noted a few mistakes.

First, v1.insert(v2.begin(), 1) is invalid since the first parameter to insert must be an iterator for the vector you are inserting into. I presume you meant v1.insert(v1.begin(), 1).

Second, you cannot use an iterator after it has been possibly invalidated, for example by an intervening call to insert. In your example, p1a and p2a will be the same and will compare as equal if p1a is valid; you cannot compare them at all if p1a is invalid. You can ensure that p1a is valid by using function reserve — see below.

Third, the whole idea of freezing or taking a snapshot is complicated and unnecessary. You can make sure that an iterator remains valid by reserving enough storage before saving the iterator and performing the insertion (or any operation that may force the array to be reallocated). For example:
vector<int> v1;
// ...
 
// Ensure there is space for 3 more
// elements without reallocation
v1.reserve(v1.size() + 3);
vector<int>::iterator p1a = v1.begin();
 
// Insert 3 elements (with value 1)
v1.insert(v1.begin(), 3, 1);
 
// You can still safely use p1a
// (e.g., compare p1a and p1b)
vector<int>::iterator p1b = v1.begin();
Alternatively, and probably much better, don't use any iterator values from before an operation that possibly caused a reallocation. You can usually reevaluate the iterator or use an index into the vector (operator[]) rather than an iterator.

Fourth, I find the idea of limiting vector sizes to 65,535 bytes for portability rather poor. There is a lot of software written that is ported to many implementations, none of which have a 64K limit on array sizes.

Even if you do need to port to such an environment I do not see that the software for all environments should be encumbered by the restrictions of the one. This limitation could be handled at run time at least by catching bad_alloc. A better way would be to check the maximum number of elements by calling vector::max_size before any operation that may cause a reallocation.

I hope this makes sense and I apologise if I have made a major blunder or misunderstood your article. As I said before, I find most of your columns very informative and useful, but felt I should mention these points.

Yours sincerely — Andrew Phillips

A

I'll answer your four points with four of my own:

Point 1: I meant v1.insert(v1.begin(), 1). In fact, I don't know why I even declared v2 at all here.

Point 2: Upon reflection, I think we are saying the same thing in different ways. In my example I say that p1a (the pre-insert iterator) and p2a (the post-insert iterator) "conceptually" reference the same place. That is, in the programmer's mind, p1a subjectively references the container's start — even though, after the insertion, p1a may objectively reference outer space.

I was trying to draw a distinction between the user's interpretation of p1a and the language's implementation of same. I think the gist of your argument is, regardless of what the programmer is thinking, vector semantics require that programmer to consider p1a a dead iterator, Q.E.D.

As a programmer I adopt your perspective. If an interface spec says an iterator becomes invalid under certain conditions, and those conditions arise, it doesn't matter what my brain tells me ought to be happening, or whether I think the restriction is ridiculously conservative. I code to the spec, not to my wishful thinking.

But as a writer I have to take my readers' mindset into account. Less-experienced programmers often create bugs by coding to their mental model. (Whether they ought to do this or not is a different matter.) I can easily imagine an STL newbie holding on to p1a for the vector's entire lifetime, as if the vector acted just like an array.

Further, if I were in my readers' place, I'd want to know why the spec had such a restriction, and why my mental model was in dissonance with the implemented programming model. Rather than just write "don't do this because the Standard says not to," I prefer to also give some rationale for why the Standard says not to. That way, the mental model has a chance to align with the Standard's reality.

Point 3: For what it's worth, I knew the freezing example was horrid when I published it. I actually got my inspiration from MFC a few years ago. I believe the container was a CString or some similar string analogue; to get an iterator (char const *) to the CString, I had to freeze the contents so they couldn't change. As you rightly show, vectors support much saner alternatives [2].
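The index-based alternative Andrew mentions can be sketched as follows. The function name value_at_saved_position and the sample values are assumptions for illustration. The sketch remembers a position as an index, then derives a fresh iterator after the insertion, so it does not depend on whether the insertion reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical demonstration: remembers a position by index, inserts
// at the front, and then re-derives an iterator to the same element.
int value_at_saved_position()
    {
    std::vector<int> v1;
    v1.push_back(10);
    v1.push_back(20);

    // Remember the element holding 20 by index, not by iterator.
    std::size_t const saved = 1;
    assert(saved < v1.size());

    // Insert 3 elements (with value 1) at the front.
    // Any iterator obtained before this call is treated as dead.
    std::size_t const count = 3;
    v1.insert(v1.begin(), count, 1);

    // Derive a fresh iterator after the insertion. The remembered
    // element moved back by the number of elements inserted ahead of it.
    std::vector<int>::iterator p = v1.begin() + (saved + count);
    return *p;
    }
```

Here the function returns 20, the value originally at index 1, now found at index 4.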

Point 4: As Diligent Readers know, I try to bring up portability concerns where I can. The C and C++ Standards committees have done a lot of work promoting portability, and I'd feel remiss assuming all my readers have implementations exceeding the Standard-specified minima.

The 64K portable limit for a single C++ array is my extrapolation from the C Standard. I expect that some conforming C or C++ implementations — especially for embedded systems — are limited to single arrays/vectors of size 64K. How many of my readers have systems so-constrained I can't know.

Yes, you can ask a vector for its maximum size, and can catch exceptions. But these are run-time checks that may cause a behavior fork: the same source code can have one limit on System A, another limit on System B. If the code is to behave consistently and have the same published properties on all systems, I believe such run-time checks are insufficient.

I would rather say that, if you reliably want a container > 64K and care about portable behavior across systems, use something other than an array or vector. Otherwise, exploit your system's particular implementation limits. Regardless, I'll continue to favor portability considerations when writing my columns.
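For reference, the run-time checks under discussion (max_size and catching bad_alloc) might look like the sketch below. The function name try_grow is hypothetical. Whether a given request succeeds depends on the system, which is exactly the behavior fork described above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical helper: tries to make room for extra more elements.
// Returns false, leaving v unchanged, if the request exceeds
// max_size or if the allocation fails.
bool try_grow(std::vector<int> &v, std::size_t extra)
    {
    // Check the implementation's limit before attempting to reallocate.
    if (extra > v.max_size() - v.size())
        return false;
    try
        {
        v.reserve(v.size() + extra);
        }
    catch (std::bad_alloc const &)
        {
        return false; // the limit allowed it, but memory did not
        }
    assert(v.capacity() >= v.size() + extra);
    return true;
    }
```

A caller would test the result before inserting, for example `if (try_grow(v, 3)) v.insert(v.begin(), 3, 1);`.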

White Rabbit

Q

When I saw "Life is change" at the beginning of your September 1998 column, I immediately went to the end-note to read the source credit, thinking, "Wow, here's a kinsman who quotes arcane song lyrics to make universal points!" I was a bit disappointed to find only a reference to the current husband of an ex-wife.

Anyway, I recall that truism from the following:

Life is change
How it differs
From the rocks.
I've seen their ways
Too often for my liking:
New worlds to gain.
My life is to survive
And be alive
For you.

From: "Crown of Creation" (postlude), Jefferson Airplane, September, 1968

(I've not seen the stanzas in writing; the line breaks above suggest the meter present in the music.)

Thanks for a useful and interesting column. — S.L. Sander

A

I had just turned seven when this song came out. At that age I was listening to what my parents listened to — mainly jazz, "old" country, and folk songs. To this day, I doubt that either of my parents has heard (or heard of) Jefferson Airplane.

I probably didn't hear them until I was in high school. I certainly don't remember the song you cite. I do sometimes quote arcane song lyrics, but they tend to come from my college DJ years ('79-'85).

Sorry my reference to "life is change" was so prosaic — next time I'll conjure up something both witty and obscure!

And This One Belongs to the Reds

Q

I have a function definition:
void
Func(long lErrorCode, const char * szFormat, ...);
In this function I want to call sprintf with szFormat and the ellipsis parameters from Func. Now I want a short way to write the function call. I found a way, but then I have to code all possible numbers of arguments. I think there must be a better way.

Any suggestions?

Thanks — Volker Grimm

A

Woo-hoo, I get to end the game fielding an easy grounder.

Volker, the secret is contained in the Standard C header files <stdarg.h> and <stdio.h>. There you'll find macros, types, and functions that let you access the unnamed (ellipsis) arguments and feed them to printf-type functions:
#include <stdarg.h>
#include <stdio.h>
 
void
Func(long lErrorCode,
    char const *szFormat, ...)
    {
    if (lErrorCode != 0)
        {
        char buffer[1000]; // magic #
        va_list va;
        va_start(va, szFormat);
        vsprintf(buffer, szFormat, va);
        va_end(va);
        }
    }
The va_list object holds unspecified context information used by subsequent macros and routines. In effect, the va_list "captures" the values contained in the unnamed arguments.

va_start fills in this context, based on the position of the last named parameter (szFormat in this example). You must match va_start with va_end, using the same va_list object.

vsprintf behaves just like sprintf, except it picks its values from the va_list rather than from an explicit list of arguments. You can also use vfprintf and vprintf, which are analogous to "normal" fprintf and printf.

va_list, va_start, and va_end are declared in <stdarg.h>, while the printf-style functions come from <stdio.h>. All of these declarations are part of both Standard C and Standard C++.
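The same pattern works with a caller-supplied buffer. The sketch below is a variation, not the column's code: the name FormatInto is hypothetical, and it assumes a library that provides vsnprintf, which was standardized in C99 and C++11, after this column. vsnprintf writes at most size - 1 characters plus a terminating null, which sidesteps the magic-number buffer.

```cpp
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: formats into buffer, never writing more than
// size bytes. Returns the length the full message would need,
// not counting the terminating null (the value vsnprintf reports).
int FormatInto(char *buffer, size_t size,
    char const *szFormat, ...)
    {
    assert(buffer != 0 && size > 0);
    va_list va;
    va_start(va, szFormat); // szFormat is the last named parameter
    int const n = vsnprintf(buffer, size, szFormat, va);
    va_end(va);             // matches va_start on the same va_list
    return n;
    }
```

A return value greater than or equal to size tells the caller that the message was truncated.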

Notes

1. This all assumes you are using built-in * and &. If you feed LENGTHOF a class object with those operators trumped, all bets are off.

2. If you are an MSJ junkie, or have @microsoft.com in your email address, please please please resist your primal urge to write me about alternative CString strategies. The project was long ago, and now serves only as an inspiration.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also an alumnus of Microsoft, a speaker at the Software Development and Embedded Systems Conferences, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via rschmidt@netcom.com.