March 1997/Questions & Answers

Columns

Questions & Answers

Pete Becker

Wrapping the Prickly Pragma

Pete answers questions on subtle issues from pragmas to sharing inline functions to the one definition rule in C++.

To ask Pete a question about C or C++, send e-mail to pbecker@wpo.borland.com, use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.

Q
My question concerns the use of the preprocessor in C/C++. How can I use #pragma and other preprocessor directives (#asm, #if, and so on) inside a macro definition? I would like to define a macro like the following:
#define MYMACRO do {  \
#pragma option -A     \
    somecode();           \
    somethingelse();      \
} while (0)
but my compiler (Borland C++ 3.0) prints error messages like:
"Illegal character '#' (0x23)"
"undefined symbol 'pragma'"
I also tried to put a backslash in front of #pragma (\#pragma) but I got the same error messages. Thank you in advance for your answer.
— Carlo Amaglio

A
I've occasionally needed to do the same sort of thing. It's not as straightforward as we'd like it to be. The problem is that both the C and the C++ language definitions say that a macro cannot expand into a preprocessor directive. That makes your example code invalid as C or C++, because it tries to put a preprocessor directive inside the macro MYMACRO. I don't know of a way to get the effect of having a #pragma directive inside a macro.

On the other hand, it's fairly easy to simulate having an #if directive inside a macro. Just invert the logic: put the #if around the macro definition.

For example, suppose you want to write a macro that logs debugging checkpoints by executing a printf statement if the symbol DEBUG is defined, and does nothing if it is not defined. Your first attempt might look like this:
#define DBG(n)  \
#ifdef DEBUG    \
printf( "Debug: %d\n", n );  \
#endif
Your compiler should complain: you can't use #ifdef inside a macro. But you can do it the other way around, and put the macro definition inside an #ifdef directive:
#ifdef DEBUG
    #define DBG(n)  \
    printf( "Debug: %d\n", n );
#else
    #define DBG(n)
#endif
This form is a bit more verbose, but it has the advantage of being valid C.

If you're building more complex macros, you might consider using additional macros for some of the parts, and then using those macros to build your bigger macro.
#if DEBUG > 0
    #define ERR error_count++
#else
    #define ERR
#endif

#if DEBUG > 1
    #define DBG_SHOW(n)  \
    printf( "Debug: %d\n", n );
#else
    #define DBG_SHOW(n)
#endif

#define DBG(n)  \
ERR;            \
DBG_SHOW(n)
The above makes it easy to change the behavior of ERR and of DBG_SHOW, but comes at the cost of making less clear what DBG is actually doing. The other way of writing this macro makes DBG's functionality clear, but makes it a bit harder to maintain:
#if DEBUG > 1
    #define DBG(n)  \
    error_count++;       \
    printf( "Debug: %d\n", n )
#elif DEBUG > 0
    #define DBG(n)  \
    error_count++;
#else
    #define DBG(n)
#endif
For a macro of this size I prefer the second version. I find it very confusing to trace through multiple levels of macros, and I'm willing to do a bit more work in maintaining longer macros in order to avoid those extra levels. At some point, though, the complexity of the big macro makes the multi-level technique worthwhile.

Incidentally, the form of indentation I've used is valid ANSI C, but some older compilers won't accept it. ANSI C says that any line whose first non-blank character is a # is a preprocessor directive. Some older compilers recognize # only when it is the first character in the line. In that case you may need to change the indentation style. In most cases this form will work:
#if DEBUG > 0
# define ERR error_count++
#else
# define ERR
#endif
I prefer the former, but only slightly. If you're concerned about maximizing the portability of your code to non-ANSI compilers you should use this latter form.

Q
I'm starting to see several references to an addition to the proposed C++ standard to allow inline functions to be declared extern:
extern inline SomeFunc();
Assuming the actual function definition is in FILE1.CPP, and a call to it is made in FILE2.CPP, what will be the result? Will the compiler be required to somehow generate inline code for the function in FILE2.CPP? (I don't know how that would be possible without implementing something like the Ada dictionary.) Or will a compiler make a standard call to the function in FILE2.CPP, with FILE1.CPP supplying a standard non-inlined implementation of the function which can be called? If the latter (which I would expect) then the code for the function would actually only be inlined where it is called within FILE1.CPP. Is this correct?
— Earl Bennett

A
Good question, thanks. In fact it's always been valid in C++ to declare an inline function extern.What has actually changed in the current working paper is that now inline functions are by default extern instead of static. Now, that sounds like it will lead to major changes in how you use inline functions, as you've suggested in your question, but my guess is that it won't require any changes to code that you've already written.

In fact, you already know how to write code properly under this rule. Inline member functions have always been extern; this change simply applies the same rule to non-member inline functions, and all you need to do is apply the techniques that you've used all along with inline member functions to non-member inline functions. The simplest approach is to define all of your non-member inline functions in appropriate header files, and simply #include the right header when you need the inline functions that it defines. If that's the way you've been using inline functions then your code will continue to work just fine under the new rule.

To understand why this works, and what other techniques also work, we must understand the notion of a translation unit, the meaning of linkage, and the one definition rule.

When we talk about compiling we usually say that we compile a source file. The C++ language definition (and the C language definition, as well) uses a more precise term: the compiler compiles a "translation unit." A translation unit is the result of running a source file through the preprocessor, which pulls in information specified in #include directives [1] and expands macros. Once that's been done the compiler tries to interpret the translation unit in accordance with the syntax and semantics of the C++ language. If you've written a source file that produces valid code when it is preprocessed the translation succeeds. If not you get an error message [2] .

Every identifier used within a translation unit has a property known as "linkage." The two flavors of linkage that we're concerned with here are known as internal and external [3]. External linkage is defined negatively: any object that doesn't have internal linkage has external linkage. Internal linkage is defined by the various C++ constructs that produce it. For example, declaring a data object to be static gives it internal linkage; defining a const object that is not declared extern gives it internal linkage. It used to be the case that any inline function not declared extern had internal linkage. That's the rule that's been changed. Now the rule says that any inline function that is declared static has internal linkage. So an inline function with neither a static nor an extern qualifier used to have internal linkage, but now has external linkage. For example:
static inline int f()  // declared static, so
{ return 1; }          // has internal linkage

extern inline int g()  // declared extern, so
{ return 1; }          // has external linkage

inline int h()         // neither static nor
{ return 1; }          // extern. Old rule:
                       // internal linkage. New
                       // rule: external linkage.
Here's where you must be a bit careful. Most C and C++ programmers think of external linkage as meaning that there's one object with that name in one translation unit, and that other translation units use that name to refer to that single object. For data objects and for ordinary functions that analysis works just fine:
// test1.cpp
void f()
{
}

int n;

// test2.cpp
void f();      // refers to 'f' in test1.cpp
extern int n;  // refers to 'n' in test1.cpp
However, data objects and ordinary functions aren't the only things whose names have linkage. Classes do as well, for example.
// class1.cpp
class C
{
public:
void f();
};

// class2.cpp
class C
{
public:
void f();
};
In each of these translation units, the class name 'C' has external linkage. If this code looks a little odd to you, remember what a translation unit is: it's the source code after it's been preprocessed. In fact, I'll bet that just about every C++ program that you've ever written does this sort of thing, although not quite so directly. You've probably done it more like this:
// c.h
class C
{
public:
void f();
};

// include1.cpp
#include "c.h"

// include2.cpp
#include "c.h"
After preprocessing, these files look just like the ones in the previous example. That is, each of the translation units contains a definition of the class C. Now, we know that we can use both of these source files in a single program without the compiler complaining, so the meaning of external linkage can't be quite as simple here as it is for functions and for data objects. In fact, the true meaning of external linkage is that objects with external linkage are subject to the one definition rule.

In the case of ordinary functions and data objects, the one definition rule means simply that there must be only one definition of the object in the program. We all know how to make sure that that happens: as in the earlier example, we write definitions in one place and extern references to those definitions in other places. In the case of class definitions and inline functions, the one definition rule means that if more than one translation unit has a definition for that object the definitions must be the same [4] . In practice, as I mentioned at the beginning of this article, we write definitions in a single header file and #include that header wherever we need it. That's not technically necessary, though. It's perfectly valid to cut and paste the definition of a class or an inline function from one source file to another. If you do that you have a more difficult maintenance task, though; you have to be sure that any changes you make to one copy of the definition get copied to all the others. Otherwise your program probably violates the one definition rule.

Let's look at a few possibilities. Suppose each of these source files is part of a single program.
// inline.h
inline int f()
{ return 1; }

// unit1.cpp
#include "inline.h"

inline int g()
{ return 1; }

inline int h()
{ return 1; }

// unit2.cpp
#include "inline.h"  // Ok; same definition of
                     // 'f' as in other units

inline int g()       // Ok; same definition of
{ return 1; }        // 'g' as in other units

// unit3.cpp
inline int f()       // Ok; same definition of
{ return 1; }        // 'f' as in other units

static inline int h()  // Ok; different definition
{ return 2; }          // of 'h', but with internal
                       // linkage
Now suppose we add a fourth file:
// unit4.cpp
inline int g()  // one definition rule
{ return 2; }   // violation; different
                // definition of 'g',
                // external linkage
As I said earlier, it's perfectly alright for several source files to each contain their own definition of an inline function with external linkage, so long as the definitions are all the same. Don't do that, though. You're asking for trouble. The compiler is not required to diagnose violations of the one definition rule. If you don't keep all versions of these functions the same, the compiler is free to compile and link your code anyway, and not tell you that anything is wrong. Your program will not work the way you expect it to, and you'll probably have to resort to the debugger to figure out what's wrong. You'll save yourself a lot of aggravation if you adopt the discipline of putting all of these things into header files in the first place.

Q
I think I've stumbled on an obscure rule on class inheritance that I cannot explain. Consider the following program:
#include <stdio.h>
#include <stdarg.h>
#include <fstream.h>
class foo
{
public:
void PutLine (const char *str);
};

class bar : public foo
{
public:
void PutLine();
};

int main()
{
bar list;
list.PutLine ("Hello World");
}
I would expect this program to compile and write the familiar "Hello World" message on the user screen. However, in this instance, the compiler complains that there is an "Extra parameter in call to bar::PutLine()". Since foo::PutLine(const char *) is declared as a public member function, and the class foo is inherited publicly, I would expect the compiler to find the function. Can you shed some light?
Thanks,
Adam Clark

A
This is a problem people often run into when they are first experimenting with inheritance and overloading. The rule in C++ is that overloading takes place only among functions defined in the same scope. The declaration of foo creates a scope containing foo::PutLine and the declaration of bar creates a separate scope containing bar::PutLine. The two versions of PutLine are defined in different scopes and do not overload.

This may seem like a rather awkward rule, but it serves an important purpose: it insulates your code from changes in base classes. For example, suppose you used a library that provided a class named base, and you wrote a class named derived that was derived from base:
#include "base.h"

class derived : public base
{
public:
void f(double);
};
Then you used class derived in a program:
int main()
{
derived d;
d.f(1);
return 0;
}
The compiler would resolve the function call in d.f(1) to derived::f, converting the 1 to a double. Now suppose you've just received an updated version of the library, and they've added a function void base::f(int) to class base. If this new version of f took part in overloading in your derived class, when you recompiled your program to use the new version of the library you'd find that d.f(1) called base::f instead of derived::f. Since the writers of the library almost certainly don't know anything about your class derived, chances are very good that base::f doesn't do anything like what you need it to do for your program to continue working correctly. By extending the interface of class base, the library writers would have caused a quiet change in your program. That's dangerous, and that's why overloading does not cross scopes.

If you want to do this sort of thing, you can do it explicitly in your code. If we go back to your original example, the most general way to make foo::PutLine available in bar is to add a member function to bar that calls foo::PutLine:
class bar : public foo
{
public:
void PutLine();
void PutLine(const char *str)
    {
    foo::PutLine(str);
    }
};
Since both versions of PutLine are defined in bar, the call list.PutLine ("Hello World") will be resolved by calling bar::PutLine(const char *), which in turn will call foo::PutLine(const char *). The new version of PutLine is known as a "forwarding function"; its job is simply to forward its arguments to foo::PutLine. If there are several versions of PutLine in foo you must write a forwarding function for each one.

If your compiler fully supports namespaces you can accomplish the same thing with less work, by adding a using declaration to your derived class:
class bar : public foo
{
public:
using foo::PutLine;
void PutLine();
};
The using declaration tells the compiler to insert the name and signatures of all versions of foo::PutLine into bar, so the compiler considers all versions of foo::PutLine, along with all versions of bar::PutLine, when it decides which function to call. With this change, your original code will compile, and the call list.PutLine ("Hello World") will call foo::PutLine.

Q
I just finished reading your column in the October 1996 issue of the C/C++ User's Journal, and I have a comment regarding your response to Erik M. Lowndes' question.

To provide some context, Mr. Lowndes asks how to properly use C++ member functions as function pointers, particularly regarding registering callback routines for libraries.

As any C++ programmer should know, static member functions do not have a this pointer. By simply making my global function static instead, I was able to maintain my encapsulation and protection, use the pseudo-this pointer (if necessary — your error-handler example probably wouldn't need such hassle and could maintain its state in static variables) to access any member data I needed, and still happily use the C libraries.

Perhaps you left this method out due to space restrictions, perhaps it never occurred to you, or perhaps you have some reason why it would be inconvenient or incorrect to use. If the latter is the to blame, I would certainly like to know your reasoning, as it would be most embarrassing to see my carefully-designed programs fail due to something I overlooked!
— Robert M. Chadwick

A
Several people wrote to me about this technique. It's a common practice, but it's not technically legal. I don't know of a compiler that doesn't support it, so as a practical matter you can probably get away with it, but do so with the knowledge that your program is in a state of sin. The C++ language definition does not require that compilers make this technique work.

The essence of the problem is that you are passing a function pointer to an operating system or a framework such as X Windows that was written with C in mind. That is, the system knows how to call C functions. The only way to guarantee that a C++ function can be called from C is to declare it extern "C". In the absence of an extern "C" declaration the compiler is free to generate code for the function in a way that is completely incompatible with C. If you look back at my earlier column you'll see that the global function that I used was declared extern "C". You can't do that with a member function, even if it's static. Since you can't put extern "C" on your static member function, you can't guarantee that it can be called from C, which means that you also can't guarantee that it can be called by the operating system.

Notes

[1] It's awfully tempting to say that the preprocessor pulls in the text of a file named in a #include directive, but that's not strictly accurate. It's true when the #include directive uses quotes around the name: #include "myfile.h" acts just as if he contents of the file "myfile.h" had been part of he original source file at the point where the #include directive occurred. However, if the name is in angle brackets, such as #include <iostream>, it's up to the compiler implementer whether the compiler will look for an actual file somewhere with the appropriate text. It's perfectly valid for the compiler to magically know the contents of these headers, which can greatly speed up compilation of system headers.

[2] Note that we're talking here about how the language definition describes the compilation process. Under the "as if" rule, an actual compiler is not required to run a separate preprocessing step. A compiler is allowed to merge preprocessing, syntactic analysis, and semantic analysis into a single step if the result is the same as it would have been if the rules in the language definition had been followed.

[3] The third is "none."

[4] Trust your intuition on what "the same" means. There's a more formal definition of the one definition rule in the C++ working paper. This took a lot of work and a lot of argument, but we think we've got it right. The formal definition has to be able deal with odd cases like this:
	// odd1.cpp
	#define D C
	#define global public
	class D
	{
	global:
	D();
	};
	
	// odd2.cpp
	class C
	{
	public:
	C();
	};
Clearly, these two files really define the same class, and the formal language of the one definition rule must become a bit complicated to be able to say that. On the other hand, out in the real world we have a much easier solution to confusing code like this. We can fire the programmer who wrote it. o

Pete Becker is director of development at Borland International's Open Environment Division in Boston, MA. He has been actively involved in C++ development work at Borland for seven years, both as a developer and a manager. He is Borland's representative to the ANSI/ISO C++ standardization committee. He can be reached by e-mail at pbecker@oec.com.