Dan Saks is the owner of Saks & Associates, which offers consulting and training in C, C++ and Pascal. He is also a contributing editor of TECH Specialist. He serves as secretary of the ANSI C++ committee and is a member of the ANSI C committee. Readers can write to him at 287 W. McCreight Ave., Springfield, OH 45504 or by email at dsaks@wittenberg.edu.
There are many good reasons to switch from C to C++. In addition to the full functionality of C, C++ offers stricter compile-time checking, type-safe linkage, inline expansion of functions, reference types, overloaded function names, and overloaded operators. And of course, C++ provides language-level support for encapsulation (data abstraction) and object-oriented programming (OOP).
An unfortunate consequence of the current hype surrounding C++ and OOP is that some people believe that using C++ for anything other than OOP (employing both inheritance and polymorphism) misuses the language. I strongly disagree. OOP is very useful in some applications, but not all. C++ adapts well to different styles of programming, and OOP is just one of those styles. The language has much to offer, even if you're not ready for OOP.
Still, there are some practical reasons for not using C++. C++ compilers are not as widely available as C compilers. Even if you can find C++ compilers for your target environment(s), you might not find the necessary support tools, such as symbolic debuggers and application libraries. A formal standard for C++ is still years away, and there isn't even a full implementation of the current de facto standard [1]. I have no doubt that the obstacles to C++ will eventually disappear, but that doesn't help you today.
Even if you're not ready to move to C++, you can pave the migration path by writing your C code so that it also compiles as C++. C++ is an improper superset of Standard C; only a short list of C constructs won't compile as C++. Fortunately, you can easily rewrite every C construct that is not C++ as C that is also C++. (Note: Unless I explicitly write "Classic C", "C" means Standard C.)
Tom Plum calls the common subset of C and C++ "C-". (Don't confuse this with "C--", Jim Gimpel's pejorative for C++ as a "strongly hyped" language [2].) Converting a program from C to C- is the first step in converting to C++. You might as well plan ahead and write all your C as C- by observing the following guidelines.
Function Declarations
Classic C has fairly relaxed rules for data type conversions. Whereas Standard C encourages you to get your data types right (and often warns you when you don't), C++ is much more insistent. C++ compilers frequently issue errors for transgressions that merit only a warning in C. C++ is particularly finicky about function declarations.
Listing 1 shows a simple Classic C program in the lax style of the first edition of K&R [3]. The program compiles as Standard C, but not as C++. C++ has at least two complaints about Listing 1: (1) greet is called before it is declared, and (2) printf is not declared at all. On most C++ compilers, both error messages say something about a missing prototype. Declaring printf is easy; just add
    #include <stdio.h>
at the top of the source file. Declaring greet is a little more complicated and requires some decisions.
If you move the definition for greet above main, most C++ compilers will accept the definition as a declaration, although they will issue a warning that the definition's style is obsolete. Rewrite the old-style heading
    greet(s)
    char *s;
    { ...
as the prototyped heading
    greet(char *s)
    { ...
to silence the warning. With a little extra thought, you might recognize opportunities to add appropriate const qualifiers to the parameters, such as
    greet(const char *s)
Once greet is declared before it's used, the compiler notices that greet has a default return type of int, but no return expression. You can add an explicit void return type to the function heading, or add a return statement with an integral expression. In this case, the void return type is more appropriate.
Notice that main also has a default return type of int, and no return expression. Curiously, most C++ compilers treat a missing return expression as an error in other functions, but don't even issue a warning for a missing return expression in main. I hope future compilers will issue at least a warning. In any event, I recommend explicitly declaring main as a function returning int with a normal return value of EXIT_SUCCESS (from <stdlib.h>).
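As a rough illustration of the advice above (not a reproduction of the article's listings), a greet function written in C- might look like the sketch below. The function body and the message text are assumptions, since the original heading elides the body, and greet_demo is a hypothetical stand-in for main that shows the explicit int return type and the EXIT_SUCCESS return value.

```c
#include <stdio.h>
#include <stdlib.h>

/* Prototype-style heading, const-qualified parameter, and an explicit
   void return type. The body is an assumption; the article elides it. */
void greet(const char *s)
{
    printf("Hello, %s\n", s);
}

/* Hypothetical stand-in for main: it is declared to return int and
   returns EXIT_SUCCESS as its normal return value. greet is already
   declared at this point because its definition appears first. */
int greet_demo(void)
{
    greet("world");
    return EXIT_SUCCESS;
}
```

In a real program the body of greet_demo would simply be the body of main.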
As an alternative to moving greet, you can insert the prototype declaration
    void greet(const char *);
above main. However, you must still rewrite the function definition in prototype form to turn off the compiler's warning for using an obsolete function heading. Converting a C program to C- by inserting prototypes is probably easier than moving entire function definitions, but I prefer to move the functions to eliminate unnecessary prototypes. (See Listing 2.)
In C, the function declaration
    int f( );
means that f's parameter list is completely unspecified; f can have any number of arguments of any type. To declare f with no arguments, write
    int f(void);
In C++, both declarations for f mean that f accepts no arguments. To avoid confusion in C-, specify empty parameter lists explicitly using void.
Function prototypes pose a real problem if you want your header and source files to compile as both Classic C and C++. Whereas Standard C supports prototypes but doesn't require them, Classic C doesn't support prototypes and C++ requires them. If you're still struggling with Classic C compilers, then writing in C- may be more trouble than it's worth. Nevertheless, you can use the preprocessor to write code that works with both Classic C and C++. See the sidebar, "Turning Prototypes On and Off" for details.
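One common way to switch prototypes on and off with the preprocessor is sketched below. It is not necessarily the technique the sidebar uses. The names USE_PROTOTYPES, PROTO, add and answer are hypothetical, and the sketch assumes that a Standard C compiler defines __STDC__ and a C++ compiler defines __cplusplus.

```c
/* Decide once whether the compiler understands prototypes.
   Plain #ifdef is used because a Classic C preprocessor may not
   support the "defined" operator. */
#ifdef __STDC__
#define USE_PROTOTYPES 1
#endif
#ifdef __cplusplus
#define USE_PROTOTYPES 1
#endif

/* PROTO keeps the parameter list when prototypes are available
   and reduces it to an empty list otherwise. */
#ifdef USE_PROTOTYPES
#define PROTO(args) args
#else
#define PROTO(args) ()
#endif

/* Declarations: note the doubled parentheses. */
int add PROTO((int a, int b));
int answer PROTO((void));

/* Definitions need the same switch on the function heading. */
#ifdef USE_PROTOTYPES
int add(int a, int b)
#else
int add(a, b)
int a, b;
#endif
{
    return a + b;
}

#ifdef USE_PROTOTYPES
int answer(void)
#else
int answer()
#endif
{
    return 42;
}
```

The doubled parentheses let the whole parameter list travel through the macro as a single argument. The cost is that every function heading is written twice, which is the kind of overhead that can make C- more trouble than it's worth on a Classic C compiler.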
Pointer Conversions
In Standard C, assignment between different (non-void) pointer types is non-portable and produces a diagnostic. Most C compilers generate a warning against assignments like
    char *pc;
    int *pi;
    ...
    pc = pi;
but compile the code nonetheless. C++ treats the assignment as an error. It is permitted only with an explicit cast:
    pc = (char *)pi;
When programming in C-, don't ignore the warnings or turn them off. Rewrite the code to use compatible pointers or insert a cast.
Standard C allows assignment of any pointer type both to and from void pointers. This allows C programmers to write functions like
    void free(void *p);
that quietly accept any pointer type. It also permits the return value of functions like
    void *malloc(size_t size);
to be copied to any pointer type without a cast. C++ considers assignment of void pointers to non-void pointers a "hole" in the type safety, and permits it only with an explicit cast. Thus, for example, you must write the assignment in
    t *p;
    ...
    p = malloc(sizeof(t));
as
    p = (t *)malloc(sizeof(t));
to be valid in C-.
Listing 3 shows an implementation of the Standard C function memcpy written in C-. C does not require the casts in the initializations of t1 and t2, but C++ does.
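The following sketch shows the kind of casts described above. It is not the article's Listing 3: copy_bytes is a hypothetical byte-copy routine in the spirit of memcpy (only the local names t1 and t2 come from the text), and make_ints is a hypothetical helper that shows the cast on the result of malloc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Copy n bytes from s2 to s1. C would accept the two initializations
   without casts; C++ requires them, so C- writes them explicitly. */
void *copy_bytes(void *s1, const void *s2, size_t n)
{
    char *t1 = (char *)s1;
    const char *t2 = (const char *)s2;

    while (n-- > 0)
        *t1++ = *t2++;
    return s1;
}

/* Allocate room for count ints. The cast converts the void * that
   malloc returns; the result is a null pointer if allocation fails. */
int *make_ints(size_t count)
{
    int *p = (int *)malloc(count * sizeof(int));
    return p;
}
```

Passing a char array or an int pointer to copy_bytes needs no cast in either language, because conversion to void * is still implicit for pointers to non-const objects.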
C++ allows assignment of a non-void pointer to a void pointer, but only if the non-void pointer points to an object that isn't const or volatile. That is, given
    const char name[ ] = "Dan";
    void *p;
then neither
    p = name;
nor
    memcpy(name, "Ben", sizeof (name));
is permitted in C++. (The first parameter of memcpy is a non-const void *.) C compilers typically compile these statements with only a warning. As always, you can force the conversion to void * with an explicit cast.
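When the code only reads through the pointer, an alternative to the cast is to make the void pointer itself const-qualified, which both languages accept without complaint. The sketch below assumes that reading is all you need. The helper names first_byte and copy_name are hypothetical.

```c
#include <string.h>

static const char name[] = "Dan";

/* A pointer to const converts to const void * without a cast
   in both C and C++. */
int first_byte(void)
{
    const void *p = name;
    return *(const char *)p;
}

/* The second parameter of memcpy is a const void *, so a const
   object can be the source of a copy without any cast. The caller
   must supply at least sizeof(name) bytes at dest. */
void copy_name(char *dest)
{
    memcpy(dest, name, sizeof(name));
}
```

A cast to plain void * remains available, but writing through the resulting pointer to an object defined as const has undefined behavior, so the cast is best reserved for interfaces that are known not to modify the object.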
Enumerations
In C, enumeration types are simply integral types. For example, given
    enum color {RED, WHITE, BLUE};
    enum color c;
    int n;
then C permits assignments like
    c = 1;
    n = BLUE;
C even permits
    enum day {YESTERDAY, TODAY, TOMORROW};
    enum day d = RED;
although some compilers issue a warning for this assignment.
C++ treats each enumeration as a distinct type. Assignments like
    c = 1;
    d = RED;
are not allowed. C++ still promotes enumeration constants to integers, so
    n = BLUE;
is legal C++.
When programming in C-, treat each enumeration as a distinct type. However, you can safely assume that enumerations promote to integers.
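When a conversion from an integer to an enumeration really is needed, an explicit cast makes it acceptable to both languages. The sketch below reuses the color enumeration from the text. The function names are hypothetical, and the cast does no range checking.

```c
enum color {RED, WHITE, BLUE};

/* int to enum: C accepts this without a cast, C++ does not,
   so C- spells the conversion out. The caller is responsible
   for passing a value that corresponds to an enumerator. */
enum color color_from_int(int n)
{
    return (enum color)n;
}

/* enum to int: the promotion is implicit in both languages. */
int color_to_int(enum color c)
{
    return c;
}
```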
Linkage
In Standard C, a global data object may be declared repeatedly without the extern specifier, in the same program or even in the same compilation. In C++, a global data declaration without extern is a definition, and each global data object must be defined exactly once in a program. (Note that a global data declaration with an initializer - with or without the extern specifier - is also a definition.)
For example, suppose an integer total is shared by several source files in a C++ program. total must be defined in exactly one of the files using any of the following:
    int total;
    int total = 0;
    extern int total = 0;
Any other source file that references total must declare
    extern int total;
When programming in C-, observe the linkage rules of C++. Make sure that the external declarations in your header files are indeed declarations (with explicit extern and no initializer) and not definitions. If you use an external definition in a header and include that header in more than one source file of a C++ program, the linker will protest against multiply defined names.
In C, the default linkage for global const objects is extern; in C++ it's static. For example, suppose file f.c contains
    const MAX = 100;
and file g.c contains
    extern const MAX;
If you compile and link f.c and g.c using C, the MAX in f.c has external linkage and provides a definition to satisfy the reference in g.c. If you compile using C++, the MAX in f.c has internal linkage, and the MAX in g.c is an unresolved reference. In C-, use explicit storage class specifiers (extern or static) on all global const declarations.
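The sketch below folds a header and its one defining source file into a single block to show how these rules fit together. The file names in the comments, STEP_LIMIT and add_to_total are hypothetical, and the sketch writes the int type explicitly rather than relying on the implicit int of the f.c example.

```c
/* What a shared header (say, totals.h) would contain:
   declarations only, with explicit extern and no initializer. */
extern int total;
extern const int MAX;

/* What exactly one source file (say, totals.c) would contain after
   including that header. Because the extern declaration of MAX is
   already visible, this definition has external linkage in C++ too. */
int total = 0;
const int MAX = 100;

/* A const that is private to this file says so explicitly. */
static const int STEP_LIMIT = 10;

/* Add n to total, at most STEP_LIMIT at a time and never past MAX. */
int add_to_total(int n)
{
    if (n > STEP_LIMIT)
        n = STEP_LIMIT;
    if (total + n <= MAX)
        total += n;
    return total;
}
```

Including the header in the defining file also lets the compiler check that each declaration agrees with its definition.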
Name Spaces
C++ has 16 more keywords than Standard C:
    asm         private
    catch       protected
    class       public
    delete      template
    friend      try
    inline      this
    new         virtual
    operator    throw
C- programs should not use these keywords as identifiers.
C++ predefines the macro __cplusplus. Many C++ compilers (especially those based on AT&T's cfront translator) use names containing __ (a double underscore) in the translation process. C- programs should avoid identifiers with double underscores anywhere in the name. [In standard C, you should avoid even a single leading underscore. -pjp]
C puts all ordinary identifiers (such as function, type and variable identifiers) in a single (scoped) name space, and puts all tags (for enumerations, structures, and unions) in a separate name space. Thus, for example, C lets you declare a structure and a function with the same name in the same scope, like
    struct tnode {
        char *word;
        struct tnode *left, *right;
    };
    int tnode(const char *s);
Tag names in C are not type names, so you can't write
    tnode *t;
You must carry the tag keyword around with the tag name, as in
    struct tnode *t;
You can simplify tag references in C by defining a type name with the same spelling as the tag, like
    typedef struct tnode tnode;
and write
    struct tnode *t;
as simply
    tnode *t;
In C++, tags are also type names. That is, you can declare
    tnode *t;
without also defining tnode as a typedef. For compatibility with C, C++ accepts
    typedef struct tnode tnode;
as well as
    struct tnode *t;
However, C++ does not allow a tag name to also be a function or variable name in the same scope, so that
    struct tnode;
    ...
    int tnode(const char *s);
is invalid.
In C-, simply define each tag name as a type name in the same scope. For example, declare tnode in C- as
    typedef struct tnode tnode;
    struct tnode {
        char *word;
        tnode *left, *right;
    };
and use tnode (rather than struct tnode) in all subsequent references.
In C, a tag name or enumeration constant defined within a struct (or union) has the same scope as that struct. In Listing 4, for example, tag names t and e and enumeration constants X and Y have the same scope as tag name s. Thus the declarations of ee and tt are perfectly valid, but
    const int X = 0;
is an error in C because X is already defined as something else.
C++ compilers interpret Listing 4 differently depending on their vintage. According to the C++ Annotated Reference Manual [1] (based on the AT&T 2.1 C++ Product Reference Manual [4]), classes introduce a new scope. Since structs are just a special case of classes, nested type names and enumeration constants are local to the enclosing struct. By this rule,
    const int X = 0;
is valid. However, the declarations of ee and tt are invalid, because identifiers e and t are out of scope.
Earlier versions of C++ (including those compatible with AT&T Release 2.0) export tag names defined in a struct to the scope of that struct, but keep enumeration constants local to the struct. By these rules, the declarations of ee, tt and const int X are allowed, but the reference to Y in the initializer of ee is undefined.
Although nested struct and enum definitions can be useful in C++, they're not much help in C. Given the variation in rules for nested definitions, your safest bet is to avoid nested definitions in C-.
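One way to follow this advice is to hoist every definition to file scope and give each tag a matching type name. The sketch below borrows the names s, t, e, X and Y from the discussion of Listing 4, but it is not that listing: the members and the helper make_s are hypothetical.

```c
/* Each type is defined at file scope, so its name and its
   enumeration constants are visible in the same way in C and
   in every vintage of C++. */
enum e {X, Y};
typedef enum e e;

typedef struct t t;
struct t {
    int value;      /* hypothetical member */
};

typedef struct s s;
struct s {
    t inner;        /* hypothetical members */
    e kind;
};

/* Build an s from its parts, using the type names throughout. */
s make_s(int value, e kind)
{
    s result;

    result.inner.value = value;
    result.kind = kind;
    return result;
}
```

The typedef for the enumeration follows its definition because Standard C does not allow a reference to an enumeration tag before the enumeration is defined.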
Odds And Ends
C discards extra characters in a string initializer. For instance, C ignores the \0 at the end of "Dan" in
    char name[3] = "Dan";
C++ refuses to turn its back on defenseless characters. You must write the initializer as
    char name[3] = {'D', 'a', 'n'};
or face the wrath of the compiler.
In C, sizeof('a') equals sizeof(int), but in C++, sizeof('a') equals sizeof(char). I have yet to encounter a situation in C- where this makes a difference. Please let me know if you find one.
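A short sketch of both points follows. The array and function names are hypothetical. The second array shows an alternative the article doesn't mention: leaving out the array bound so that the compiler sizes the array to include the terminating \0, which is an option only when the extra character is acceptable.

```c
#include <stddef.h>

/* Exactly three characters and no terminator, valid in C and C++. */
static const char exact[3] = {'D', 'a', 'n'};

/* The compiler counts the characters, including the terminating \0. */
static const char sized[] = "Dan";

size_t exact_size(void) { return sizeof exact; }
size_t sized_size(void) { return sizeof sized; }
char exact_first(void) { return exact[0]; }

/* Write sizeof(char) rather than sizeof('a') when the size of a
   character is what is meant; only the former means the same thing
   in both languages. */
size_t char_size(void) { return sizeof(char); }
```

Because exact has no terminator, it must not be passed to functions that expect a string, such as strlen.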
A Step In The Right Direction
To a large extent, writing your C code so that it also compiles as C++ imposes no restrictions on the expressive power of C. Rather, it forces you to abandon poor coding practices, most of which are already considered obsolete by the C standard. The resulting code is just as efficient, but a little safer than Standard C, thanks to C++'s more rigorous type checking. By writing in C-, you are paving the road to the future. You can step up to a better C whenever you're ready.
References
[1] Ellis, Margaret A. and Stroustrup, Bjarne, The Annotated C++ Reference Manual. Addison-Wesley, Reading, MA, 1990.
[2] Gimpel, Jim, "C-," The C Gazette, 4:4, Summer 1990.
[3] Kernighan, Brian W. and Ritchie, Dennis M., The C Programming Language. Prentice-Hall, Englewood Cliffs, NJ, 1978.
[4] AT&T C++ Language System Release 2.1 Product Reference Manual, AT&T, 1989.