October 2002/The C/C++ Programming Language

Features

The C/C++ Programming Language

P.J. Plauger

From the trenches of a seasoned developer: C/C++ compatibility in practice.

Introduction

Much of my life revolves around writing C and C++ code. Most of that code is aimed at implementing libraries for the ISO C and C++ Standards. When I’m not actually developing code, or attending standards meetings, I find myself marketing the code that I, Pete Becker, and others write for our company, Dinkumware, Ltd. (I also have a life, which includes walks in the woods and attending many plays, concerts, and operas, but that is not the focus of this essay.)

One of the subtler forms of marketing involves tracking several of the public newsgroups. Pete and I participate in different sets, but we overlap with almost everything having to do with C or C++, standard or otherwise. I try not to lose sight of the fact that an active participant in one of these newsgroups is not necessarily your average programmer in the street. Nevertheless, what the loud participants tend to repeat as fact soon becomes gospel for the many silent lurkers. And the lurkers tell all their friends. I see the influence of newsgroup lore every day, in the articles written in trade magazines like CUJ, in questions asked at software conferences, and in the sales queries we receive by phone and email.

We’ve learned never to let misinformation go unchallenged, not if it damages the reputation of our products or compromises the credibility of the standards that are important to our business. We’ve also learned to identify problems in the C and C++ Standards by the kinds of questions that get asked repeatedly. And we even do a little market research for future products, though people seldom have a clear enough notion about what they really need to ask for it directly.

I mention the newsgroups to justify the title of this essay. Many postings refer to some program written in “C/C++.” The term usually seems to refer to a project that includes both C and C++ translation units, though on rare occasions I have also seen it applied to code intended to compile as either C or C++. (Header files are frequently written in this common dialect, replete with preprocessor tests on the macro __cplusplus to keep the unshareable bits apart.) Essentially all C++ compilers for desktop systems can also compile C equally well, so the term C/C++ arises naturally enough.

Nevertheless, it is a lightning rod for the purists. Use the term C/C++ in a posting and invariably one of the regulars will observe that no such language exists. There’s a C Standard first approved in 1989 and a C++ Standard first approved in 1998. There ain’t no language definition for C/C++. Perhaps the only hot button with more of a hair trigger is the proper way to declare function main.

To paraphrase Galileo badly, C/C++ still moves.

I can understand, to some extent, why the standards purists feel threatened by the concept of a C/C++ language. It is neither fish nor fowl nor good red meat. No organization has spelled out all the rules for writing real-world code that involves two or more languages, and probably multithreading, a shared library (DLL) or two, and several other APIs. Neither the C nor the C++ camp can claim dominion over such a hybrid dialect.

Nor would it do any good if they tried. Programmers who write in C/C++ seldom worry about complete conformance to either standard. Rather, they see the mix of languages as an expedient way to get many jobs done. They still appreciate language standards, because they minimize arbitrary differences between compilers, or between releases of a compiler. But very few programmers, in my experience, see any commercial advantage to writing code that’s completely conforming Standard C or Standard C++. It’s the difference between attending church on Sundays and joining the priesthood.

As a member of that priesthood myself, I have a little trouble making this confession. But I must admit that, in a real sense, the C and C++ communities sometimes suffer from too much standardization work. Nearly 20 years ago, I first observed that a programming-language standard is a treaty between implementers and programmers. That’s generally a good thing, particularly when the relationship gets adversarial. But even better is a healthy business relationship, where implementers eagerly produce software-development products that programmers eagerly make use of. Nations have worked for millennia to find the best balance between regulation and a free business climate; it ain’t easy. Despite my liberal tendencies, I now find myself favoring a more laissez-faire application of standards to real-world programming projects.

Here is an example. Several issues back, Bjarne Stroustrup began a three-part series on the need to better reconcile the C and C++ Standards. He gave this famous introductory example from Kernighan and Ritchie:
    main()
    {
        printf("Hello, world\n");
    }
Stroustrup then lamented, “As a C89 program, Kernighan and Ritchie’s classic 'Hello World' has one error. As a C++98 program, it has two errors. As a C99 program, it has the same two errors, and if those were fixed, the meaning would be subtly different from the identical C++ program.” [1]

To be on the safe side of all current standards today, and as a matter of good style, you really should rewrite the code as:
    #include <stdio.h>

    int main()
    {
        printf("Hello, world\n");
        return 0;
    }
But not all these additions are needed in all cases.

Stroustrup’s observation stimulated a protracted discussion on a C++ newsgroup about the nature of the unspecified errors. There were almost as many different answers as there were posters. Not all agreed on the number or nature of the errors in each of the different dialects. And I don’t recall ever seeing a clear description of the subtle difference between the corrected C99 and C++98 forms.

If you see a half-empty glass here, you will join the lamentation. Worse, you will fret that quite a few putative language experts were unsure about which changes were really necessary for which dialects. But you can also complain that those standards people have gone and drunk half your water. This compellingly simple program got twice as big, thanks to a succession of standards. (I share part of the blame for two of the three changes, and I sympathize with the third.) It’s not clear how big a payback we’ve all gotten in return.

The fact is that the original example above works as is on the vast majority of C compilers in use today, if only to ensure backward compatibility for all that legacy code out there. Maybe you get a warning or two. Where you get an outright error, as you should in C99 or C++98, the fix is pretty clear. And unless you’re just trying to shut up the compiler, good style dictates that you make the code as robust as possible with each change. You don’t exploit the gray areas of a given dialect; you avoid them all. More to the point, once the corrected program compiles, it behaves the same in all cases. Any subtle differences are way too subtle for most of us to care about.

Now you know my bias. That should prepare you for the sequence of pronunciamentos that follow. I present each as a statement about real-world programming that I believe to be true. Space does not permit me to defend each statement properly. If you share most or all of my bias, you may tend to agree. If you don’t, you won’t. In all cases, however, I suggest you add a grain of salt. Your mileage may indeed vary.

Our Ever Changing Heritage

C/C++ has existed in a continuum of dialects for over 30 years.

I was one of the first C programmers at Bell Labs back in the early 1970s. I watched the language grow and change for over a decade before ANSI X3J11 began standardizing it. “Standardization” brought more change, as a matter of course. Just about the time Standard C began to settle down, C++ grew in popularity. It grew and changed even more radically throughout most of the 1990s as it too got “standardized.” Meanwhile, Standard C got corrected (1994), amended (1995), corrected again (1996), and then thoroughly revised (1999). As a result of all this churn, the first fully conforming implementations of C++98 and C99 are only now coming to market.

Amendment 1 has been the biggest source of C dialects over the past 10 years.

I hate to say this, having served as Convener of the C committee while this amendment was hammered out. I’ve also championed the technology by implementing it in a commercial product and explicating it in numerous essays. But the fact is that not even its original proponents care enough about it to make a real market for fully conforming products. Amendment 1 is even an official part of the C++ Standard, included by reference, yet almost nobody complains when great chunks of it are omitted.

I just spent several weeks packaging our C++ library for use with a variety of popular C libraries. I was appalled to discover how much variation there is in degrees of conformance. Nearly all match the C89 spec pretty well; only one that I tested (ours) fully supports Amendment 1, much less C99. Others do so to considerably varying degrees. (I know of two other libraries that should test well, but neither is very widely used today.) At least the Standard C language didn’t change until C99 came along.

So it seems that all the hard work done by the C committee throughout the 1990s has done more to proliferate dialects than to reduce them.

The C++ Standard has been the biggest source of C++ dialects over the past 10 years.

Unlike C, C++ doesn’t have an earlier standard to fall back on. Moreover, the C++ committee has been much more aggressive about adding features in the process of standardization. It is thus no surprise that C++ suffers even more from dialect proliferation than does C. Only recently have we been able to make a single Standard C++ library package that is fully functional with multiple compilers on multiple platforms. Even then, we still have to write a lot of #ifdef logic to gloss over differences. (And this for a language whose stated goal is to eliminate the need for a preprocessor.)

Only one fully conforming front end exists for C++, from Edison Design Group; the second one will not likely appear for years to come. Some people consider this a serious enough matter that they have proposed modifying the C++ Standard quickly to make it easier for others to conform. I hope a serious push in this direction doesn’t materialize, since it would only add to the fragmentation of C++ in the short term. We have enough dialects of C++ already.

The proliferation of dialects will continue for the foreseeable future.

You can’t get complacent. Both the C and C++ committees are currently working on Technical Reports that extend the two languages. Neither has the force of an official standard, but both are sure to stimulate still more “enhanced” releases as the new technology percolates out. I won’t even go into the ongoing discussion of C++ revisions that has been fermenting on the newsgroups for the past year or so. The message is simple: don’t look for anything more stable than punctuated equilibrium in the evolution of C/C++, at least not as long as C and C++ remain such important languages.

The Status Quo

C remains the lingua franca of the programming world.

Something important gets easily lost in all this discussion of conflicting standards and proliferating dialects. The fact is that C89 remains the greatest common denominator for all multiple-language development systems in widespread use today. Program generators tend to spit out C. Why? Because every platform in the world has a handy C compiler, even if it doesn’t have C++. Other languages define bindings to C. POSIX is defined primarily by its binding to C. I’ve even read books that give all the code examples in C without bothering to name the language used!

When I was asked to write a Java-to-C translator several years ago, I chose to write it in C. It was amazingly easy to forgo the added power of C++ for the greater assurance that the code would port easily to numerous platforms. EDG’s C++ and Java front ends are both written in C. (For that matter, so is much of Java.)

I emphasize this to offset some of the boosterism for C++ (and Java) that tends to marginalize C today. Don’t get me wrong, I think both C++ and Java are often better choices than C, particularly for writing very large software products. But don’t expect C to go away any time soon.

C remains the workhorse language for embedded programming. (C/C++ is growing.)

I don’t give talks or write magazine articles like I used to, but I still make a point of attending two or three Embedded Systems Conferences every year. It’s a solid and growing market for our wares, and it’s a great place to recalibrate your perspective on the needs of programmers. Over the past decade, I’ve seen a decided shift away from assembly language toward C for writing embedded systems. I’m just beginning to see a similar shift toward C++. (Embedded C++ is still fulfilling its early promise as a useful steppingstone.)

But embedded programmers don’t care much about full standards conformance. You’ll still find a lot of compilers that ship with little or nothing in the way of a Standard C library. Some small systems barely support recursion. So if you want to strike up a Talmudic debate about the subtle differences between const, or void pointers, in C and C++, this is not the place to go to find enough rabbis.

C++ remains the workhorse language for desktop programming. (C/C++ is becoming less important.)

Desktop systems are much further along the complexity curve. People stopped counting bytes many years ago; some have stopped counting megabytes. What matters is quick time to market and easy revision, where even reasonably simple applications are programs numbering in the hundreds of thousands of lines of source code. This is C++ territory.

Standards matter more here, if only to keep down the cost of switching compilers or upgrading to a new release. Compatibility with C matters too. There’s still tons of legacy code written in C that won’t be converted to C++ for many more years, if ever. The legacy need is fading over time, but I believe that C/C++ programming is the wave of the present in this arena.

Compilers for the simpler dialects have way fewer bugs than those for the more elaborate dialects.

I don’t know of a desktop C++ compiler that doesn’t include a C compiler. Some people take this as an argument against preserving C as a distinct language. (It is also used as a reason why Embedded C++ was and is a pernicious distraction, but that’s another campaign.) Why have C compilers, the argument goes, if you can do the job just as well with a C++ compiler? The argument is, simply, wrong.

Most C compilers have had decades to be debugged. The C committee effectively froze the language in April 1985 and didn’t thaw it again until 1999. The additions are largely bolt-ons that have little effect on the existing language. The practical effect for me is that I’ve clocked very few hours over the past 20 years dealing with bugs in (pure) C compilers.

C++ compilers are quite a different kettle of fish. Rarely do I switch to a new release of a compiler without finding a new set of bugs to work around. Implementers are still refining the overload rules for template functions, and the lookup rules for namespaces. Both go to the heart of the language and affect practically all programs. It’s hard to get better conformance without risking greater instability. What’s worse, I often find that new releases of a C++ compiler introduce bugs into C. It’s not a free ride if C has to suffer the growing pains of its much larger, and less mature, companion.

That’s one reason why the embedded C compiler business is still flourishing. When such a compiler “grows up” to C++, it almost always does so by bolting on a separate front end, such as EDG’s excellent product.

Our Hope for the Present

It’s way easier to mix dialects at the functional interface level than within the same translation unit.

There’s a simple recipe for avoiding the subtler differences between C and C++: keep them apart. Quite often, the C component of a C/C++ program is an API implemented as a few type definitions and a bunch of functions. It’s easy enough to write a header to define the former and declare the latter, and it’s not much harder to doctor such a header so it works with either C or C++ programs. That’s a comfortable, arm’s-length relationship that’s easy to maintain over the years.

Of course, it would help the maintainers if the C and C++ coding rules were not surprisingly different. But they need not be brought much closer than they currently are to mitigate most dialect problems. The demand for more just isn’t there.
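To make the "doctored header" idea concrete, here is a minimal sketch of a header that can be included from either C or C++ translation units. The file names, the Counter type, and both functions are hypothetical, invented purely for illustration. The header and its C implementation are shown in one listing for brevity; in practice they would live in separate files.

```c
/* counter.h: hypothetical API shared by C and C++ translation units */
#ifndef COUNTER_H
#define COUNTER_H

#include <stddef.h>

#ifdef __cplusplus
extern "C" {    /* C++ callers see these declarations with C linkage */
#endif

typedef struct Counter {
    size_t hits;
} Counter;

void counter_reset(Counter *c);
size_t counter_bump(Counter *c);

#ifdef __cplusplus
}               /* close the extern "C" block */
#endif

#endif /* COUNTER_H */

/* counter.c: the implementation, compiled as plain C */
#include <assert.h>

void counter_reset(Counter *c)
{
    assert(c != NULL);
    c->hits = 0;
}

size_t counter_bump(Counter *c)
{
    assert(c != NULL);
    return ++c->hits;   /* increments the stored count, returns the new value */
}
```

A C++ caller simply includes counter.h and calls counter_bump as it would any other function. The test on __cplusplus is the only place where the two languages have to acknowledge each other.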

It’s way easier to avoid the gray areas than to clean them up.

There’s another simple recipe for avoiding the subtler differences between C and C++ — don’t go there. Programmers now take for granted that every project needs to follow a coherent set of style rules. These rules often prohibit many perfectly valid constructs. A prudent style is to avoid those areas that are likely to cause confusion when read by people schooled in different dialects of C/C++. We at Dinkumware happen to like our C code to be compilable as C++; but even when we never expect to need that virtue, we still try to make our C read like a subset of C++. It’s just safer that way.

This is simply the Groucho Marx rule rewritten for programmers. (Patient: “Doctor, doctor, it hurts when I do this. What should I do?” Doctor: “Don’t do that.”)
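As one small illustration of C written to read like a subset of C++, consider the sketch below. The function dup_string is a hypothetical helper, not code from any Dinkumware product. It avoids two well-known trouble spots: the implicit conversion from void pointer that C permits and C++ rejects, and identifiers such as new that are keywords in C++.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of src, or NULL
   if the allocation fails. The caller frees the result. */
char *dup_string(const char *src)
{
    size_t len;
    char *copy;     /* named "copy", not "new", which C++ reserves */

    assert(src != NULL);
    len = strlen(src) + 1;

    /* C converts void * to char * implicitly; C++ does not.
       The explicit cast is accepted by both. */
    copy = (char *)malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;
}
```

Whether the cast is good C style is itself a favorite newsgroup argument. The point here is only that the cast costs little and keeps the code compilable under either language.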

You can use #ifdef logic and #define to gloss over dialects within a translation unit.

There’s still another simple recipe for avoiding the subtler differences between C and C++, or any other dialects of C/C++ — write different versions for each, side by side. That’s what conditional directives in the preprocessor are good for. (Write any amount of Java and you soon miss them.) It is true that a forest of #ifdef logic can obscure a program, but a judicious sprinkling, neatly indented, can be quite maintainable. I’ve yet to see practical code so pure that it could avoid such machinery completely.
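Here is a minimal sketch of such a judicious sprinkling, assuming you want a Boolean type that works under C89, C99, and C++. The function is_leap_year is a hypothetical example. The tests rely on the standard predefined macros __cplusplus and __STDC_VERSION__.

```c
#include <assert.h>

#if defined(__cplusplus)
    /* C++: bool, true, and false are built in */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    #include <stdbool.h>    /* C99 and later supply bool here */
#else
    typedef int bool;       /* C89 fallback */
    #define true 1
    #define false 0
#endif

/* Hypothetical predicate: Gregorian leap-year test */
bool is_leap_year(int year)
{
    assert(year > 0);
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}
```

All the dialect knowledge sits in one small block at the top, and the rest of the translation unit is written once.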

Compilers are converging, and libraries are glossing over the existing differences, better than ever.

Having described several practical techniques for dodging dialect clashes, I can also happily report that the problems are getting less severe. We do seem to be heading for a brief patch of equilibrium between jarring bits of punctuation. I’ve mentioned the more uniform state of C++ compilers. Dinkumware has also had good success in marrying C++ and C99 libraries. You don’t have to suffer from two flavors of complex arithmetic and two flavors of Booleans. The standards don’t clash as much as some people fear.

Our Hope for the Future

A few quick parting shots:

Pure standards conformance never has been, and never will be, the dominant factor in portability and reusability.

I’ve yet to see a nontrivial program that conformed perfectly to any language standard. And that’s a condemnation neither of standards nor of the programmers who should follow them. It follows from the narrow scope of each standard. C and C++ chose not to address multithreading, shared libraries, or graphical user interfaces. Standards make a real difference where applicable, but their importance often pales in comparison to other portability considerations.

More attention is now being paid to C/C++ compatibility, and standards conformance, than ever before.

I’m not talking about standards committees here, but about vendors, teachers, and workaday programmers. A single vendor can still dominate a given marketplace, to be sure, but it is now important that even 800-pound gorillas pay some attention to standards conformance.

Resolving the worst incompatibilities is a relatively easy technical matter and a relatively modest political one.

Don’t get me wrong, I strongly support any moves to reconcile differences between different dialects of C/C++. I’ll do my bit wherever I can. None of the issues are technically hard, but some require a bit of political face saving on one or both sides. Even that is doable if the standards committees perceive a demand for improved compatibility between C and C++. My main goal here is to put the effort in perspective.

C/C++ will remain a viable programming environment for years to come no matter what the standards organizations (or Microsoft, or Sun) choose to do.

I end with another questionable bit of solace. No matter how hard we standards drones try, we probably won’t make that big a difference in day-to-day programming. One way or the other. On the other hand, I don’t foresee that any single market force, however large or focused, is likely to make that big a difference either.

C/C++ will remain a useful language despite all attempts to help it, control it, or do it in.

Note

[1] Bjarne Stroustrup. “C and C++: Siblings,” C/C++ Users Journal, July 2002.

P.J. Plauger is president of Dinkumware, Ltd. He is the author of the Standard C++ Library shipped with Microsoft’s Visual C++. For eight years, he served as convener of the ISO C standards committee, WG14. He remains active on both the C committee and the C++ committee, J16. You can reach him at pjp@dinkumware.com.