March 1999/Uncaught Exceptions

C/C++ Contributing Editors

Uncaught Exceptions: Singletons, Five-Liners, and Multiple Inclusion

Bobby Schmidt

Bobby handles a series of small problems, including reentry at Microsoft after lobbing a few brickbats their way.

Copyright © 1999 Robert H. Schmidt

To ask Bobby a question about C or C++, send email to cujqa@mfi.com, use subject line: Questions and Answers, or write to Bobby Schmidt, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

Last month I ragged a bit on Microsoft and ran a question from a Microsoft reader. A couple of weeks later, small world syndrome struck: after a two-year hiatus, I've once again taken on Microsoft as a client. Part of my mission this time is recommending improvements in documentation and coding style for one particular project team — and as Diligent Readers know, I suffer no shortage of opinions and ideas about the Mighty M's methods.

I am somewhat dazzled by the cornucopia of new technology — and new technology acronyms. Microsoft's success seems to be built on constant reinvention and rampant user consumption; but I have to wonder how much is new, and how much is really old stuff taken to Earl Scheib. First-week impressions:

ATL is to COM what MFC is to the Windows API, an attempt to wrap the complex within the inscrutable.

The bulk of file and registry detritus deposited by MSVC and Office remains astonishing.

Windows 98 drives like the Windows 95 Brougham edition.

Windows NT 4 drives like a Windows 95 Humvee.

I still have to install everything twice on a dual-boot system, just to get the registry entries right. Will we ever have a harmonic registry convergence?

Most significantly, I can finally watch MSNBC. In a supreme irony, the TCI cable TV service where I live (just a few miles from Microsoft's main campus) doesn't carry the Microsoft cable channel. The only way I see it is in the lobbies of select Microsoft buildings, where the TV plays it all day like some Orwellian telescreen.

There Can Be Only One!

Q

How can I have a class which allows only one object to be created? Thanks — Kirankumar Goli

A

Several techniques leap to mind. If you want to allow multiple objects to be created and destroyed — so long as no more than one object exists at a time — consider a reference counting scheme like
class X
    {
public:
    X()
        {
        if (count_ > 0)
            throw someException;
        ++count_;
        }
    ~X()
        {
        --count_;
        }
private:
    static int count_;
    };
     
int X::count_(0);
Once the X constructor is called for the first object, the count is set to one. Subsequent constructions fail, unless and until the count drops back to zero via the class destructor. With this method, at most one object can exist at a time. (You could use a bool instead of an int here, since you need to track only two different states.)
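For readers who want something to compile, here is a minimal sketch of the same idea. The listing above leaves someException unspecified, so this version assumes std::logic_error; the class name OneAtATime, the count accessor, and the usage function are all invented for illustration. It also declares the copy operations private, on the assumption that you don't want a copy slipping past the counter.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class: at most one object may exist at any moment.
class OneAtATime
    {
public:
    OneAtATime()
        {
        if (count_ > 0)
            throw std::logic_error("an object already exists");
        ++count_;
        }
    ~OneAtATime()
        {
        --count_;
        }
    static int count()
        {
        return count_;
        }
private:
    OneAtATime(const OneAtATime &);            // declared, never defined
    OneAtATime &operator=(const OneAtATime &); // declared, never defined
    static int count_;
    };

int OneAtATime::count_ = 0;

// Returns true if a second construction is refused while the first
// object is still alive.
bool second_construction_throws()
    {
    OneAtATime first;
    try
        {
        OneAtATime second;
        }
    catch (const std::logic_error &)
        {
        return true;
        }
    return false;
    }

void one_at_a_time_usage()
    {
    assert(second_construction_throws());
    assert(OneAtATime::count() == 0);   // first was destroyed on return
    OneAtATime again;                   // allowed: nothing else exists
    assert(OneAtATime::count() == 1);
    }
```

Because the constructor throws before incrementing, a failed construction leaves the count untouched, and the destructor never runs for the object that was refused.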

Instead of constructing and destroying multiple X objects (one at a time), you might instead want to ensure that only one X object can ever be constructed:
class X
    {
public:
    X()
        {
        static int count(0);
        if (count > 0)
            throw someException;
        ++count;
        }
    ~X()
        {
        }
    };
With this scheme, the reference count can never decrease, so a second object will never be constructed.

While simple, these techniques are deficient:

The constructor can throw an exception which the caller must be prepared to catch.

Even if you intend to create only a single X object, nothing prevents you from accidentally trying to construct others.

To avoid these problems, hide the constructor and use a simple object broker:
class X
    {
public:
    static X &object()
        {
        static X x;
        return x;
        }
private:
    X();
    ~X();
    X(const X &); // declared but never defined
    };

X a; // error, can't access default ctor
X *b = new X; // error, same reason
X c = X::object(); // error, can't access copy ctor
X &d = X::object(); // OK
X &e = X::object(); // OK, yields same object
Now there can be only one instance of X, controlled by the function X::object. (I'm taking my inspiration here from both the so-called Singleton pattern and Microsoft's COM [1].)
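To see the broker doing real work, consider the sketch below. The class Counter, its bump member, and the usage function are hypothetical names made up for this example; the only point is that every caller reaches the same object and therefore the same state.

```cpp
#include <cassert>

// Hypothetical singleton that remembers how often it has been bumped.
class Counter
    {
public:
    static Counter &object()
        {
        static Counter c;   // constructed on the first call only
        return c;
        }
    int bump()
        {
        return ++value_;
        }
private:
    Counter() : value_(0)
        {
        }
    ~Counter()
        {
        }
    Counter(const Counter &);            // declared, never defined
    Counter &operator=(const Counter &); // declared, never defined
    int value_;
    };

// Both references name the same object, so they share state.
void counter_usage()
    {
    Counter &d = Counter::object();
    Counter &e = Counter::object();
    assert(&d == &e);
    int before = d.bump();
    assert(e.bump() == before + 1);
    }
```

The private destructor is still reachable from inside object, which is where the static object is declared, so the program can clean up at exit while outside code can neither create nor destroy a Counter.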

Five Easy Pieces

Q

Regarding the small functions you present in your September 1998 article: I think that small, generic functions/templates (I call them "five-liners" because their average size is five lines) are a potentially very useful aid in the C++ programmer's toolchest.

You should build incrementally in your columns a "five-liners library." Function lengthof can be a good starting point. Here is one that is useful in creating error messages and nontrivial strings:
template <class T>
std::string to_string(const T &a)
{
    char Buffer[2048];
    {
        std::ostrstream Stream
            (
            Buffer,
            sizeof Buffer
            );
        Stream << a << '\0';
    }
    return Buffer;
}
This transforms objects quickly into strings. It requires the operator << to be defined in the classic way for ostreams and objects. The good part is that << is already implemented for primitive types, so you get something for nothing.

I should have templatized the function for dealing with wide strings, but my compiler is quite awkward.

Yours — Andrei Alexandrescu

A

I like the idea of collecting such five liners. Now that you've seeded the process, perhaps other Diligent Readers will contribute more.

Regarding to_string, the only problems I see are

Reliance on a magic-sized buffer — we can't guarantee that 2,048 will always be big enough.

The limitation that T must be a type for which operator<< is defined.

Construction and destruction of the local ostrstream object, especially if to_string is called a lot.

In practice, these are nits. I've used similar to-string functions to great effect in my own projects for years.
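If the magic-sized buffer bothers you, one possible variation swaps ostrstream for the Standard library's std::ostringstream, which manages its own storage. This is a sketch of that substitution, not Andrei's function; the names to_string_dynamic and to_string_usage are invented here, and it assumes your library ships the <sstream> header.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Variant of to_string that lets the stream grow as needed.
// T must still be a type for which operator<< is defined.
template <class T>
std::string to_string_dynamic(const T &a)
    {
    std::ostringstream stream;
    stream << a;
    return stream.str();
    }

void to_string_usage()
    {
    assert(to_string_dynamic(42) == "42");
    assert(to_string_dynamic(1.5) == "1.5");
    assert(to_string_dynamic('x') == "x");
    }
```

This removes the first nit only. The operator<< requirement and the cost of building a stream on every call remain.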

Heckle and Decl

Q

I recently found a subtle C++ syntax problem in declaring variables in control structures (according to minimal scoping rules).

Assume I have two variables of different types, which I want to scope only within the block of the for control structures. I tried to write:
for(
   int i = 0, double val = 0;
   i<iMax;
   i++, val += 1
   )
{
   // Whatever
}
which of course won't work, since the comma is interpreted not as a sequence operator, but as a delimiter in the declarator portion of the expression. Another declaration specifier like double cannot follow.

A naive change of mine led to the following code, that would compile with Borland C++ 5.0, but nonetheless was wrong:
double val;
for(
   int i = 0, val=0;
   i < iMax;
   i++, val += 1
   )
{
   // Whatever
}
Now, val is interpreted to be of type int and hides the outer val for obvious reasons. Other compilers like Visual Age C++ complain about the redeclaration of val; but in any event, this code is wrong and is not what I intended. So the only working solution seems to be the following:
double val = 0.0;
for(
   int i = 0;
   i < iMax;
   i++, val += 1
   )
{
   // Whatever
}
With this solution, the code works as expected, but I have to give up the minimal scoping principle — val lives much longer than necessary.

So now my question goes: is there any reasonable way of declaring both i and val, so that their scope is only the block of the control structure?!

I find that this is quite a serious hole in the language, since constructs which need more than one loop variable are quite often necessary. This is especially true with STL containers, where there is often the need to have both an iterator and an index.

Regards — Harald Nowak

A

Yours is a most interesting problem, one I've not come up against before.

For those wondering what Harald's on about, remove the declaration from the for statement, place it at global scope:
int i = 0, double val = 0;
and compile as either C or C++. Your compiler should balk at double, complaining that it expects an identifier (double is a keyword). As Harald writes, removing the "offending" keyword:
int i = 0, val = 0;
has the result of defining both i and val as int.

The real solution is to break the declaration into two statements:
int i = 0; double val = 0;
Unfortunately you can't apply this solution to the original for statement:
for (
    int i = 0; double val = 0;
    i < iMax;
    i++, val += 1
    )
since the compiler wants to see a ) after i < iMax.

Why does the compiler behave this way? Sifting through the C++ Standard, I find that the grammar of a for statement is
  for ( for-init-statement condition_opt ;
        expression_opt ) statement
where for-init-statement is defined as either an expression-statement or a simple-declaration. Thus, we can have a for statement that reduces down to
  for ( simple-declaration ; ) statement
Finally, simple-declaration is defined as
  decl-specifier-seq_opt init-declarator-list_opt ;
(Fans of Mr. Saks will recognize our old friends the Decl twins, Spec and Arator.) The bottom line: only one declaration statement can fit within the () of a for statement.

My recommendation is to wrap the for statement in an extra block layer:
{
int i = 0;
double val = 0;
for (; i < iMax; i++, val += 1)
    {
    /* ... */
    }
}
/* i and val are now gone */
As long as you sneak no extra statements before or after the for loop, i and val are effectively constrained to the scope you want. In addition to fixing your multiple-declaration problem, this solution has the virtue of working in C, and with C++ compilers not supporting for-scope declarations.
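The same wrapper handles the iterator-plus-index case Harald mentions. The following is only a sketch: the function first_negative and its usage function are hypothetical, chosen to give the loop something to do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first negative
// element, or v.size() if there is none.
std::size_t first_negative(const std::vector<int> &v)
    {
    std::size_t result = v.size();
        {   // extra block: it and index live only inside these braces
        std::vector<int>::const_iterator it = v.begin();
        std::size_t index = 0;
        for (; it != v.end(); ++it, ++index)
            {
            if (*it < 0)
                {
                result = index;
                break;
                }
            }
        }
    // it and index are out of scope here
    return result;
    }

void first_negative_usage()
    {
    int data[] = { 3, 1, -4, 1, -5 };
    std::vector<int> v(data, data + 5);
    assert(first_negative(v) == 2);
    assert(first_negative(std::vector<int>()) == 0);
    }
```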

One Small STSTEP

Q

Hello Bobby,

I am still confused by the way programmers define macros that prevent multiple header-file inclusion:
// #1
//
#ifndef _STSTEP_H_
#define _STSTEP_H_
// ...
#endif // !_STSTEP_H_
     
// #2
//
#ifndef __STSTEP_H
#define __STSTEP_H
// ...
#endif // !__STSTEP_H
     
// #3
//
#ifndef STSTEPH
#define STSTEPH
// ...
#endif // !STSTEPH
Which is the best style? Why do most programmers use single/double underscores in their header-file macro names?

Thanks — Brian Ho

A

I don't know that there is a "best" style. I can tell you the style I use (and why), then discuss an extension of that style.

Of your three examples, only #3 is safe. #1 and #2 use identifier forms reserved for the translator implementation [2]. If indeed "most" programmers use these two forms, I'm guessing they don't know that the Standard reserves such names for other uses — or wouldn't care if they did know.

I'm assuming the header file being protected here is called ststep.h. In that case, my current style is
#if !defined INC_STSTEP_
    #define  INC_STSTEP_
// ...
#endif // !defined INC_STSTEP_
Rationale:

Following the lead of C++ Standard library headers, I don't assume all header names end in .h. I therefore tend to leave the H or _H suffix off of these macro names.

The INC_ prefix implies the protection is based on file inclusion (as opposed to other kinds of protection mentioned below).

#if !defined is more flexible than #ifndef, so I promote its use wherever I can. I also like the symmetry between the opening #if !defined and the closing #endif // !defined. About the closest you can come to similarly match #ifndef is #endif // ndef.

I indent the #define statement and add a space after, so INC_STSTEP_ lines up beneath the same name in the #if statement. This lets me easily ensure the macro names match.
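As a small illustration of the flexibility point: #ifndef tests exactly one name, while #if !defined can be combined with other conditions and has a matching form in #elif. The macro names below (HAVE_FAST_PATH, HAVE_SLOW_PATH, PATH_ID) and the two functions are made up for this sketch.

```cpp
#include <cassert>

// Hypothetical feature macro, normally set by the build.
#define HAVE_FAST_PATH

#if !defined HAVE_FAST_PATH && !defined HAVE_SLOW_PATH
    #error "no implementation selected"
#elif defined HAVE_FAST_PATH
    #define  PATH_ID 1
#else
    #define  PATH_ID 2
#endif

int selected_path()
    {
    return PATH_ID;
    }

void path_usage()
    {
    assert(selected_path() == 1);   // HAVE_FAST_PATH is defined above
    }
```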

INC_STSTEP_ is only a partial solution. To see why, assume ststep.h contains
typedef int HANDLE;
This is the sort of definition you're likely to use throughout a project. In fact, other project-standard header files may want to make use of this same definition — but at the same time, you may have good reasons not to want large headers including one another. In such an environment, you may decide to replicate this definition into multiple headers.

In that case, assume HANDLE's definition appears in both ststep.h and some other massive project header called, oh, let's say windows.h. If you compile code including both headers:
#include "ststep.h"
#include "windows.h"
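you'll get an error about HANDLE being multiply defined, at least when compiling as C. (C++ tolerates a repeated typedef that names the same type, but repeat a class or enum definition and it complains too.) Clearly header-file wrappers are not enough.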

One solution: instead of just wrapping the headers, also wrap the individual definitions contained in those headers:
#if !defined INC_STSTEP_
    #define  INC_STSTEP_
     
#if !defined DEF_HANDLE_
    #define  DEF_HANDLE_
    typedef int HANDLE;
#endif
     
// ...
     
#endif // !defined INC_STSTEP_
(I use the prefix DEF_ to imply I'm defining a single entity instead of including an entire header file.)

Use this same DEF_HANDLE_ wrapper in windows.h. Now HANDLE's definition appears only once, regardless of how many headers you include, or in what order you include them.
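You can try the effect in a single source file. In the sketch below, the two marked regions stand in for the contents of the two headers; the macro name INC_WINDOWS_ and the functions make_handle and handle_usage are assumptions added for the demonstration.

```cpp
#include <cassert>

// ---- stand-in for the contents of ststep.h ----
#if !defined INC_STSTEP_
    #define  INC_STSTEP_

#if !defined DEF_HANDLE_
    #define  DEF_HANDLE_
    typedef int HANDLE;
#endif

#endif // !defined INC_STSTEP_

// ---- stand-in for the contents of windows.h ----
#if !defined INC_WINDOWS_
    #define  INC_WINDOWS_

#if !defined DEF_HANDLE_
    #define  DEF_HANDLE_
    typedef int HANDLE;   // skipped: DEF_HANDLE_ is already defined
#endif

#endif // !defined INC_WINDOWS_

HANDLE make_handle()
    {
    return 42;
    }

void handle_usage()
    {
    assert(make_handle() == 42);
    assert(sizeof(HANDLE) == sizeof(int));
    }
```

Swap the two regions and the result is the same: whichever one the compiler sees first supplies the definition, and the other is skipped.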

Another solution: define an auxiliary header handle.h that contains the HANDLE definition, then include that file in ststep.h and windows.h. This makes for more file juggling and file-name collision, but also makes maintenance easier, since HANDLE is defined in exactly one place.

Lest you think this problem is far-fetched, consider the Standard C and C++ type size_t, which is defined in multiple headers. Among those headers are the Standard C library headers stddef.h and stdlib.h, yet you can compile
#include <stddef.h>
#include <stdlib.h>
without error. And like the hypothetical example above, the Standard C headers don't include one another.

If you crack open your compiler's stddef.h and stdlib.h, you will probably see a variation of these solutions in use. As an example, Microsoft Visual C++ uses the former method while Metrowerks CodeWarrior Pro uses the latter.

Erratica

Last month I wrote (in response to Andrew Phillips' critique):

The 64K portable limit for a single C++ array is my extrapolation from the C Standard. I expect that some conforming C or C++ implementations — especially for embedded systems — are limited to single arrays/vectors of size 64K

That response is misleading on two counts:

The 64K limit is for C9X; C89's limit is 32K. When I wrote about extrapolating from "the C Standard," I really meant "the Draft C9X Standard."

In either C89 or C9X, the limit applies to hosted environments only. Since embedded systems are typically not considered hosted environments, the Standards apparently make no guarantees about object sizes on those systems.

The morbidly curious can find these limits described in section 2.2.4.1 of the C89 Standard and section 5.2.4.1 of the Draft C9X Standard.

Our final thought is courtesy of Mr. Ubiquity down in Oregon:

As for how my CD is coming along, you and your readers can see for yourselves how I (and several others) blew 1998: developing the Effective C++ CD. A complete demo is available at http://meyerscd.awl.com/. Though the bleeding edge of HTML technology is a messy place (we never will get those stains out of the carpet), I hope you and your readers will agree that the CD breaks new ground in several areas of electronic publication. I encourage everybody to give it a look-see and let me know what they think: smeyers@aristeia.com.

Notes

[1] After writing this answer, I discovered that Scott Meyers covers similar territory in Item 26 of his More Effective C++.

[2] Briefly, C++ identifiers starting with an underscore and an upper-case letter, or containing double underscores anywhere, are reserved. The rules are slightly different in C. Check out my October 1998 CUJ column for more information.

Bobby Schmidt is a freelance writer, teacher, consultant, and programmer. He is also an alumnus of Microsoft, a speaker at the Software Development and Embedded Systems Conferences, and an original "associate" of (Dan) Saks & Associates. In other career incarnations, Bobby has been a pool hall operator, radio DJ, private investigator, and astronomer. You may summon him on the Internet via rschmidt@netcom.com.