C/C++ Contributing Editors


Uncaught Exceptions: Phantom MenaC++

Bobby Schmidt

The keyword static has its notorious foibles. So too does const. Mix in some Microsoft-specific behavior and Bobby has much to explain.


Copyright © 1999 Robert H. Schmidt

By the time you read this, Star Wars Episode I will be upon us. To get in the mood, I just downloaded the 25 MB trailer from Apple's web site last night. Hard to believe that when the last episode opened I was what, 21? Harder still to believe that I thought the original Star Wars looked dumb in the 1977 TV ads; the only thing that caught my attention was the Vivaldi music in the background.

During my month off playing house move, I received a disturbance in the Force from a reader who shall remain quite anonymous. His letter began:

"I have been trying to write the below listed program for two weeks and can't seem to get on the right track. I am new to C++ programing. I got this program out of a magazine and I need your help. If you could please write out this program for [me] to see where I am going wrong I would appreciate the help."

I guess Usenet was busy that day, so he mailed me instead. Such a pity that I somehow never found time to write back. Had I answered him, imagine the irony — not to mention the opportunity for infinite publishing recursion — if the magazine he got the original program from was ours.

Deep Blue

Q

If a computer blue-screens in a forest, but no one is around to witness it, is the behavior a bug? Thanks. — J. Reno

A

While philosophers and product support people are somewhat divided on the answer, I think most would consider the behavior an undocumented feature, not a bug. Left undecided: if the computer blue-screens after being hit by a falling tree, does it make a sound?

Weak Link

Q

I hope you can answer this for me. Using VC++ 5 and 6, I've created a .cpp module that contains the definition

const unsigned char DTMF_9[] =
    {
    // ...
    };

In the header file for this module the prototype is

extern const unsigned char DTMF_9[];

The program compiles but the linker complains that

ModemIO.obj : error LNK2001: unresolved
    external symbol "unsigned char const
     * const  DTMF_9" (?DTMF_9@@3QBEB)

The linker appears to be looking for the wrong thing. If I remove const from both the definition and prototype the link is successful. Is this a problem in VC++ or is there something about const I don't understand?

Obviously I can work around this, but I'd like to understand what is going on. — Stan Burton

A

I believe you've stumbled into an ambiguity in the C++ Standard. If it's any consolation, every compiler I've tried shows similar symptoms. First I'll tell you what I think is going on, then I'll explain the ambiguity.

Global C++ objects declared const default to internal linkage, as if they'd been declared explicitly static [1]. Your array is global and declared const. This suggests the array should have internal linkage, as if it were really declared

static const unsigned char DTMF_9[]

In that scenario, DTMF_9 is invisible outside its defining translation unit. Other source files that include the header and its declaration

extern const unsigned char DTMF_9[]

can't see the "real" DTMF_9, but will instead reference some other (non-existent) DTMF_9 with external linkage. When the linker tries to resolve these references to an external DTMF_9, it can't find one, leading to the error message you see.

Once you remove const from both the source file's definition and the header's declaration, the problem changes: just as global const objects default to internal linkage, global non-const objects default to external linkage [2]. In your scenario, the definition

unsigned char DTMF_9[]

is equivalent to

extern unsigned char DTMF_9[]

Now the header and the defining .cpp file match. Other translation units can reference this version of DTMF_9; the linker, in turn, can resolve those references, making the error messages go away.

The above explanation certainly describes your symptoms. Problem is, it relies on a conflicted interpretation of the C++ Standard. In your declaration

const unsigned char DTMF_9[]

which is really the same as

unsigned char const DTMF_9[]

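the const applies not to the DTMF_9 object, but to the unsigned char elements it contains. We learn this from two passages in the Standard. Section 3.9.3p2 ("CV-qualifiers") says that cv-qualifiers applied to an array type affect the array element type, not the array type. Section 8.3.4p1 ("Arrays") adds that a type of the form "cv-qualified array of T" is adjusted to "array of cv-qualified T."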

Because the DTMF_9 object itself is non-const, I'm led to a tentative conclusion: DTMF_9 should not have internal linkage.

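However, this conclusion conflicts with a note from that same section 8.3.4p1, which says that an array of cv-qualified elements itself has cv-qualified type, and that such an array has internal linkage unless explicitly declared extern.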

Notes in the Standard are not "normative" or authoritative; like coding examples, they are illustrative only. But I believe this note makes more clear the Standard's intent [3], which I summarize like this:

I don't know if this summary is technically correct, but it codifies my thinking [4]. In any event, the solution to your problem is simple: explicitly declare DTMF_9 as extern in both the .cpp and header files, thereby removing any ambiguity.
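Here is a minimal sketch of that fix, squeezed into one translation unit so it stands alone. In Stan's project the first declaration would live in the header and the definition in the .cpp file. The three byte values and the helper dtmf9_length are my own placeholders, not anything from his code.

```cpp
#include <cstddef>

// What the header would contain: a declaration visible to every
// translation unit that includes it.
extern const unsigned char DTMF_9[];

// What the .cpp file would contain: the definition, also marked extern,
// so the array has external linkage however the const rule is read.
// The three values are placeholders, not real tone data.
extern const unsigned char DTMF_9[] = { 0x10, 0x20, 0x30 };

// Hypothetical helper: number of elements in the table.
std::size_t dtmf9_length()
{
    return sizeof DTMF_9 / sizeof DTMF_9[0];
}
```

An initialized declaration is still a definition even with extern in front of it, so the linker now has exactly one external DTMF_9 to find.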

I end with an exercise for the student: what would have happened in Stan's original scenario had he translated his code as C?

Killer BSTRs

Q

I'm trying to learn how to write Microsoft COM servers, and have found what I think is a problem with one of Microsoft's type definitions. I know you don't like Microsoft-specific questions, but I hope this one is general enough to be worth your time.

My program (DLL) is implemented with VC++. But I want it to be called by a wide variety of languages, including both C++ and scripting languages like JavaScript. Microsoft has defined a set of common types supposedly available to all these different calling environments. If my DLL's COM interfaces use only these types, they can work right with all the different client languages.

Now my problem: one of these "safe" types is something Microsoft calls BSTR, which stands for "Basic STRing." BSTR is the universal string format for COM. It's really an array of 16-bit characters, along with an encoded string length.

Inside the VC++ headers, BSTR is defined as wchar_t *. This means I can't declare a real const BSTR, since const BSTR x just makes the pointer const — it does nothing to protect the actual string elements. So I'm stuck writing interfaces that can't promise to leave the pointed-to string alone, something I don't like at all. Do you know why BSTR is defined this way, or what I can easily do to get around this?

Sorry for the long email. Thanks in advance. — Hans Zarkov

A

I looked around in the Microsoft header files, and found the typedef you mention. Once all the Microsoft-specific macros expand, the definition is as you say:

typedef wchar_t *BSTR;

But wait, it gets even better, for Microsoft's C++ compiler (or at least the last one I used) does not treat wchar_t as a C++ keyword. It is actually typedefed as unsigned short, meaning BSTR is really

typedef unsigned short *BSTR;

Given this definition, you have no way to declare a BSTR such that the pointed-to elements are const. (There's also no way to overload on BSTR separate from the logically unrelated unsigned short *, but that's another matter.)

For inspiration I turn to another Microsoft string type, the lovely Hungarianized LPSTR. This typedef, which is aliased to char *, suffers the same problem as BSTR: you can't declare an object that protects the pointed-to elements. Microsoft apparently foresaw this, since they also define LPCSTR, which is actually char const *.

Unfortunately, if Microsoft also defines an LPCSTR-like BSTR, I sure can't find it in any of their headers. The simple solution to your problem, then, is to create your own

typedef wchar_t const *const_BSTR;

or the more Microsoft-sounding

typedef wchar_t const *CBSTR;

Caveat: I freely admit that the presence of const in a COM interface may cause some incompatibility with the COM model, or require extra translation-time and run-time overhead when the interface is called across different processes. I welcome Diligent Readers who really know this stuff to give me some insight into both the lack of a true const BSTR and the general implications of const in COM interfaces.
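To show what the const_BSTR typedef buys you, here is a small sketch that compiles without any Microsoft headers. The BSTR typedef below is my stand-in for the real one, and both functions are hypothetical. Counting up to a terminating null is a simplification, since a real BSTR carries its own length.

```cpp
#include <cstddef>

// Stand-in for the Microsoft typedef, so this compiles anywhere.
typedef wchar_t *BSTR;
// The proposed alias: pointer to const elements.
typedef wchar_t const *const_BSTR;

// Hypothetical reader: the parameter type promises the elements stay intact.
std::size_t count_chars(const_BSTR s)
{
    std::size_t n = 0;
    while (s[n] != L'\0')
        ++n;
    // s[0] = L'x';   // would not compile: elements are const
    return n;
}

// 'const BSTR' makes only the pointer const; the elements stay writable.
void overwrite_first(const BSTR s, wchar_t c)
{
    s[0] = c;          // compiles without complaint
}
```

The second function is the whole problem in miniature: the const in its signature looks protective but protects nothing the caller cares about.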

Static Cling

Q

According to the C++ Standard, what is the proper output in the following code:

#include <iostream>
using namespace std;

class A
    {
public:
    int f()
        {
        static int firstVisit = 1;
        int returnVal = firstVisit;
        firstVisit = 0;
        return returnVal;
        }
    };

int main()
    {
    A  a1, a2;
    cout << a1.f() << " ";
    cout << a1.f() << " ";
    cout << a2.f() << endl;
    }

I would expect the output to be

1 0 1

but my compiler (Sparcworks CC3.0.1) gives

1 0 0

I would have thought that a static local variable of a non-static method would not be shared by all instances of the class, but that each instance would have its own copy of the variable.

Am I off base? Thanks, — Keith Hawkins

A

Sad to say, you've been picked off base and sent back to the dugout.

Just as static data members are shared among class object instances, static objects within function members are also shared. For the context of your question, the main difference is scope: static data members are visible to all function members, while the static local object is visible only to f.

Here's a way to think about this: if your conjecture about local static objects were true, the compiler would have to associate a unique instance of firstVisit with each instance of an A object. Where would that unique firstVisit instance live? Tucked away with A's data members, thereby increasing the size of an A object by sizeof(firstVisit)? Or in some external storage requiring a vtable-like lookup at run time?

Regardless of its implementation, firstVisit would effectively be an instanced data member visible only within f. This would violate the model for data scope and lifetime within C++, and deviate sharply from the established meaning of static inherited from C.
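If you really want per-object behavior, the usual answer is an ordinary data member. The sketch below puts both versions side by side. The member firstVisit_ and the function g are my additions for contrast, not part of Keith's class.

```cpp
class A
{
public:
    A() : firstVisit_(1)
    {
    }

    // Keith's original: one firstVisit shared by every A object.
    int f()
    {
        static int firstVisit = 1;
        int returnVal = firstVisit;
        firstVisit = 0;
        return returnVal;
    }

    // Per-object variant: each A carries its own flag.
    int g()
    {
        int returnVal = firstVisit_;
        firstVisit_ = 0;
        return returnVal;
    }

private:
    int firstVisit_;
};
```

Calling f on a1, a1, and a2 in that order yields 1, 0, 0, as Keith's compiler reported. The same sequence of calls to g yields 1, 0, 1, at the cost of one int per object.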

Conjunction Junction

Q

I have written a Win32 application that needs to make use of explicit DLL loads via functions LoadLibrary and GetProcAddress to access the Remote Access Services APIs. I need to define a "pointer to function" to receive the address returned by GetProcAddress. I'd like to maintain strong function prototyping, so I want to define my function pointer to match the calling characteristics of the DLL function whose address I am retrieving.

Since the function in question is a standard Win32 API there is a supplied header file with the function prototype. Thus, I can easily look at the header and duplicate the calling sequence in my definition. This is dangerous, though, since the API might change without my code knowing it, and worse, subsequent compiles would not reveal the mismatch.

I've been trying to figure out how to use the function prototype from the header in my pointer definition, but to no avail. The closest I have come is the code that follows:

#include <windows.h>
#include <ras.h>

DWORD (*pREE)(RasEnumEntries);

Under Borland C++ v5.02 the compile fails with the following error:

Error: Cannot convert 'unsigned long
    (__stdcall *)(char *,char *,
    tagRASENTRYNAMEA *,unsigned long *,
    unsigned long *)' to 'unsigned long *'

From the looks of this message I am close to what I want but not quite there. Do you know of a way to do this? Thanks for any help. — Tom Strickland

A

Get thee behind me Windows! Oh wait, this is another Standard wolf in Microsoft lamb's clothing. You're in luck.

Based on the compiler diagnostics you sent, I'm guessing RasEnumEntries is a function with this signature:

unsigned long __stdcall RasEnumEntries
    (char *, char *, tagRASENTRYNAMEA *,
    unsigned long *, unsigned long *);

If I'm right, the statement

DWORD (*pREE)(RasEnumEntries);

would then be equivalent to

DWORD *pREE = RasEnumEntries;

That is, you are attempting to convert a pointer-to-function into a DWORD *, which is Microsoftese for unsigned long *. This matches the error message you are getting.
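The same parse shows up with plain ints, free of any Windows baggage. In this sketch every name is hypothetical. The point is only that a parenthesized object name after the declarator is an initializer, while a parenthesized list of types makes a function pointer.

```cpp
int value = 5;
int *address_of_value = &value;

// Looks a bit like a function-pointer declaration, but address_of_value
// is an object, not a type, so the parenthesized name is an initializer.
int (*p)(address_of_value);      // same as: int *p = address_of_value;

// A genuine pointer-to-function declaration lists parameter types instead.
int twice(int n)
{
    return 2 * n;
}

int (*q)(int) = twice;
```

Here p ends up as an ordinary int * pointing at value, while q can actually be called.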

You can easily define a type to point to this function, using your original compiler diagnostic for guidance:

typedef unsigned long (__stdcall *Pudentain)
    (char *, char *, tagRASENTRYNAMEA *,
    unsigned long *, unsigned long *);

With Pudentain as the correct pointer-to-function type name, you can now declare

Pudentain my_RasEnumEntries(RasEnumEntries);

As for your fears about the API unexpectedly changing out from under you, relax. Microsoft is unlikely to change the signature of an existing API; about the worst they'd probably do is publish RasEnumEntriesEx or some such, with extra parameters.

Even if they did change the API, the data type Pudentain can reference only functions that have the existing function's signature. If some other nefarious agent tries to change the signature of RasEnumEntries,

Pudentain my_RasEnumEntries(RasEnumEntries);

will fail at compile time — you'll get a message about a type conversion error, much like the diagnostic in your original email.
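Here is the same safety net in portable form, as a sketch only. The function enum_entries is a hypothetical stand-in for an API declared in a vendor header, with a shortened parameter list and no __stdcall so it compiles anywhere. EnumEntriesPtr plays the part of Pudentain.

```cpp
// Hypothetical stand-in for an API function declared in a vendor header.
unsigned long enum_entries(char *, char *, unsigned long *count)
{
    if (count)
        *count = 0;
    return 0;
}

// Pointer type written to match the signature above.
typedef unsigned long (*EnumEntriesPtr)(char *, char *, unsigned long *);

// Initializing from the real function ties the typedef to the header:
// if the two ever disagree, this line stops compiling.
EnumEntriesPtr checked_enum_entries(enum_entries);

// A function with a different signature will not convert.
unsigned long other_api(char *, unsigned long *);
// EnumEntriesPtr broken(other_api);   // error: cannot convert
```

In Tom's program the pointer that actually gets called would come from GetProcAddress and be cast to the typedef. A pointer initialized from the header's function, like the one above, exists purely as a compile-time check on that typedef.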

Decl Heckled

Q

Hi! When answering Harald Nowak [5] you didn't mention his misconception that in

for (int i; ...; ...)

the variable i is only living inside the for block. I believe — and MSVC 6.0 agrees — that int i is still available after the for block. This means that the following code would be invalid:

for(int i = 0; i < 10; ++i)
    ;
for(int i = 0; i < 10; ++i) // i redefined
    ;

Because of this, the for statement can be written as

int i;
for(i = 0; i < 10; ++i)
    ;

and there would be no change in any object lifetime.

With regards, — Werner Henze

A

Another one? This must be Microsoft Mania Month. Or maybe I'm auditioning for Microsoft Systems Journal and don't know it.

Werner, tempting as it is, we cannot define Standard-conforming language behavior by what translators actually do. The truth is the other way around: the behavior comes first as expressed in the Standard, followed by language vendors writing their translators to conform to that expression.

Or so the theory goes. While I feel safe saying major translator vendors get most conforming features right, no one of them gets everything right. I've found that Microsoft's compilers in particular sometimes fail to support basic language features years after their adoption into the Standard.

In the case you cite, MSVC's behavior does not conform to the C++ Standard, as shown by section 6.5.3p3 ("The for statement"):

If the for-init-statement is a declaration, the scope of the name(s) declared extends to the end of the for-statement.

int i = 42;
int a[10];
for (int i = 0; i < 10; i++)
    a[i] = i;
int j = i; // j = 42

As both the text and the code example reinforce, the i declared within the initialization part of the for statement is local to that for statement. So what I told Harald was correct.
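Here is Werner's two-loop example wrapped in functions, as a quick conformance check you can feed to your own compiler. The function names are mine. A translator that follows 6.5.3p3 accepts both functions, while one that leaks i into the enclosing scope should reject the first with a redefinition error.

```cpp
// Two loops in a row, each declaring its own i.
int sum_two_loops()
{
    int total = 0;
    for (int i = 0; i < 10; ++i)
        total += i;
    for (int i = 0; i < 10; ++i)   // a second, unrelated i
        total += i;
    return total;
}

// The loop's i hides the outer i only for the duration of the loop.
int outer_i_survives()
{
    int i = 42;
    for (int i = 0; i < 10; ++i)
        ;
    return i;                      // the outer i, untouched
}
```

Code that must build under both behaviors can fall back on Werner's form, declaring i once ahead of the loops.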

Several readers pointed out my "error" on this one; all of them were using Microsoft's compiler as "proof." I don't know why Microsoft doesn't support statement-local scoping like this. I surmise it would break too much of their existing code base, but that's just an educated guess [6].

Notes

[1] C++ Standard, section 3.5p3 ("Program and linkage").

[2] Same section, paragraph 4.

[3] I find this note overly broad. It implies, for example, that volatile global arrays have internal linkage, which doesn't make sense to me. Also, since the note makes no mention of scope, it implies that a local const array has internal linkage. I simply cannot believe this is the Committee's intent.

[4] Special thanks to Dan Saks for helping me work through my thinking here.

[5] "Uncaught Exceptions" item "Heckle and Decl" in the March 1999 CUJ.

[6] While I haven't heard them use this phrase for a while, Microsoft denizens used to say that they "ate their own dog food," meaning they used their own tools and software internally. On every project I've ever done for them, I've used Microsoft's own compilers. Put another way, Microsoft itself is one of Microsoft's biggest compiler customers. If too much of their own code relies on non-conforming compiler behavior, I can well imagine their reluctance to change the compiler.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also an alumnus of Microsoft, a speaker at the Software Development and Embedded Systems Conferences, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via rschmidt@netcom.com.