P.J. Plauger has recently compiled his long-running "Programming on Purpose" column into a series of books of the same title. He can be contacted at pjp@plauger.com.
Speculating about programming languages is a popular indoor sport that inspires powerful passions, both in players and onlookers. Try telling a Boston Celtics fan that the team is ailing and well past its prime. If you have any teeth left after that exercise, try telling a C programmer that C is a primitive and obsolete language.
You have about as much luck convincing a C++ advocate that the language is deficient in any way. The arguments in its defense are legion, and all couched in intellectual terms. But the driving engine behind many of the statements is emotional conviction, pure and simple.
I find nothing wrong with either of these positions. I genuinely believe that C is far from dead as a popular programming language. I also am convinced that C++ has an interesting present and a bright future. And I have no problem with programmers forming emotional attachments to the tools of their trade. (I do prefer, however, that people learn to distinguish emotion from reason. Each works best in its own arena.)
When it comes to programming languages (and text editors, and operating systems), I am essentially Darwinian. It is much more fruitful, to me at least, to simply observe which languages attract users, get chosen for real projects, and survive.
Similarly, you can be the greatest object-oriented purist in the world, and hate C++ for all its ugly pragmatism and impurity of paradigm. (There, we got that word out of the way early.) Still, you really ought to notice that C++ is also well on its way to being a major success by the above metrics. And whatever object-oriented language is second in popularity is a very distant second to C++.
Here's where my Darwinian leanings raise a red flag.
For all its promise, C++ is also, unequivocally, a complex language--and history has not been kind to complex programming languages. Three that spring to mind are PL/I, Algol 68, and Ada. Each is a product of a different decade and a different (potential) user community, but each has followed a similar trajectory.
First there is the perceived need. Existing programming languages simply lack all the features now demanded by more sophisticated programmers, who now work on much larger projects. A small group forms, consisting of designers who inhabit that turbulent ocean midway between sophisticated users and experienced compiler writers. Before you know it, they have taken an existing lily and gilded it beyond recognition. The result shines brighter than what went before, but is also substantially heavier.
Then comes the business of selling. The dividing line blurs between the demand pull of putative customers and the supply push of vendors with a stake in the new technology. It's always the most sophisticated and most outspoken programmers who sign up first. With all those good (stylish) ideas packed in there, the new language has something for just about anyone's taste. Invariably, production programmers are told repeatedly that they are on the verge of unemployment, or even extinction, if they don't switch over to this new technology as soon as possible.
The rising tide follows. Conferences fill up first with tutorials, then with more and more technical sessions on the wonderful new language. Magazines (like this one) run special articles, then special theme issues, then standing columns on the topic. New conferences and magazines emerge just to serve the new constituency. Books appear like mushrooms after a rain. Those who haven't had occasion to try the new language begin to doubt their personal sanity, or the viability of their current employers.
At the crest, programmers for this wonderfully complex new language command premium salaries. Recruiters steal company phone books to make cold calls on the experts. The most popular lunchtime game is the five-line program with the caption, "Betcha can't tell what this does." Maintaining and enhancing code in older languages goes from a second-class activity to third class.
Then the doubts creep in. A competitor beats you to market even though still mired in that old-fashioned programming language of yore. Your enterprise suffers a few minor project disasters from complexity overload, or one really big one. Salary differentials get really out of hand. One by one, new projects get specified in terms of simpler programming languages, over the protests of the elite.
In the twilight days, the number of hotshots has been reduced to an absolute minimum. Now they're stuck on maintenance. Even worse, they have to suffer interviews by analysts charged with retooling the most useful pieces of the old, expensive systems in the current (simpler) language of choice. The hotshots grumble about management stupidity (hardly a new theme among programmers) and about how weak and unsafe programming languages have become. Their only consolation is that no programming language ever dies completely, thanks to past investments in code. They have a sinecure that will last until retirement, if they choose to stop advancing in their careers.
The consolation for the rest of us is that each language shapes the ones to come. Future designers steal the best bits and leave the worst failures to rot on the vine. Sic transit gloria mundi.
What causes some programming languages to follow this orbit? Well, in some ways, all languages do so. The ones I've singled out simply had a rise and fall that was faster and more spectacular than many people expected. In that sense, they were all oversold. And, as I conjectured at the outset, they are all overly complex.
So where does that leave C++? If it's following the inevitable fate of complex languages, then it's arguably still in the "rising tide" stage. We can trust that its popularity and importance are both still growing. But we can also trust that it will not be the programming language of choice in the year 2000.
On the other hand, historical analogies are never exact. C++ is firmly rooted in a highly successful language. It overcomes some of the recognized shortcomings of C at an opportune moment in history. Perhaps demand pull will be strong enough to rescue the language from gravitational collapse.
I can see that happening, however, only if we address the complexity issue wisely and in time. We need to appreciate just what sorts of complexity lead to untimely death, and what sorts of features turn out to be worth rescuing.
I opined at greater length about PL/I, Algol 68, and Ada in a magazine article several years ago. (See "Programming on Purpose: The Central Folly," Computer Language, April 1988. It also appeared in a collection. See Essay 4 in Programming on Purpose III: Essays on Software Technology, Prentice Hall, 1993.) I'll be brief in repeating the relevant parts of that essay.
My observation was that designers of complex languages like guessing games. They seem to feel that programmers want to write the absolute minimum for each piece of code. It is then up to the translator to decipher that terse code, by predictable rules, into an executable program. A good translator is one that can do the job with a minimum of clues from the source code.
APL and C are also terse, but not by the same metric. These two languages use lots of operators that need only a character or two. Thus, they encourage a style that is brief, even cryptic, at least to those who aren't comfortable with mathematical notation. The nearest thing to the kind of guessing games I'm talking about is the mixed-mode arithmetic permitted by C (and its predecessor Fortran). Depending on the types of the two operands, the compiler has to guess whether a + operator, for example, should perform an addition of floating-point, fixed-point, or pointer operands, in any of several possible representations for each. Even so, we're talking about a brief table, or a handful of rules.
PL/I inherited the same attitude from Fortran, but then went wild. First, it proliferated data types. Besides floating-point versus fixed-point (now with a scale factor) arithmetic types, PL/I lets you specify practically any combination of binary or decimal base, real or complex format, and a broad range of precisions. That handful of mixed-mode rules for Fortran explodes to pages of explanation--and any number of questionable guesses. A classic PL/I gotcha is the test IF 1 THEN .... The 1 is treated as a one-digit real fixed decimal constant. To make a Boolean test, the translator converts it to the bit string 0001, then tests the leading bit, which is 0. Thus, both 0 and 1 test false. An unreasonable conclusion from a series of apparently reasonable micro decisions. Yee hah.
C++ extends the guessing game one level farther. You can add to the set of permissible operands for the arithmetic operators. Just write "overloaded" definitions of the built-in operator functions with declarations such as myclass operator+(myclass&, myclass&).
Now the excitement really builds. A C++ translator has to consult lists of permissible operand combinations. Some are built-in, some user defined. Other lists prescribe chains of candidate conversions that might bridge between the actual operands and a permissible combination. Almost invariably, more than one combination of conversion chains and operator overloading fill the bill. That means the language has to decide whether the "best" choice is sufficiently better than the second best to favor it, or whether it is wiser to diagnose an ambiguity.
All of this machinery evolved from the simple desire to not require that a programmer write "obvious" type conversions explicitly.
PL/I also insisted on challenging the parser. The language has no reserved keywords, so you can write horrors like
IF IF = THEN
THEN THEN = ELSE
ELSE ELSE = IF
Each of the four distinct tokens here has two quite distinct meanings.
C++ pushes parsing technology to the extreme. It has more operators than C, and even more with multiple meanings. You can't parse the language in a single pass, or with a finite amount of lookahead. In fact, the committee standardizing C++ still occasionally argues about certain abstruse parses. Most of the issues boil down to having the translator guess whether a sequence of tokens constitutes a type designation or an expression. A small problem in C has mushroomed into a major subtopic in C++. All to avoid having the programmer learn too many operators, or write long-ish keywords.
Which brings us to Algol 68. I could easily take that language to task for parsing problems, because its grammar has an infinite number of productions. But that amounts to shooting at life rafts. Having an endless grammar is only one of the ways in which Algol 68 took guessing games to new heights. Where it really stands out is in the shorthand it permits the programmer.
You know how to call a function in C, by writing something like f(x,y). If the function has no arguments, you write f(). Those empty parentheses are a clear signal that a function is being called. Under similar circumstances, Algol 68 lets you omit the parentheses and just write f to call the function. No big deal. Even Pascal is that tolerant.
But I'm not done yet. In C, if you want the contents of an object designated by a pointer, you write something like *px or **ppx. If you want to talk about the pointer itself, you write px or ppx, or even *ppx. Not so in Algol 68. That language wants you to omit the stars. It then guesses from context how many to put back. Take your favorite C or C++ program, erase all the empty parentheses and indirection operators, and see how readable a result you get.
The Algol 68 manual waxes eloquent for several pages, in fact, about the rules for second-guessing the translator. If they're that hard to make clear, you can guess how hard they are for translator writers to get right. Or for future readers of the code to guess the same way as the original author did. And all to save writing a few parentheses and stars.
C++ does a little of this sort of thing, in places like the bodies of member functions. But it's nowhere near as ambitious as Algol 68 in this particular guessing game. Don't worry, it makes up for it in other arenas.
I mentioned operator overloading earlier, as a special case of function overloading. You can use the same function name with a host of different argument combinations. (They're called "function signatures" in conjunction with the name). The C++ translator then considers not just argument types, but also the number of arguments, and that ineffable figure of merit from above, in determining which of several samely yclept functions to call.
We're still not done. You can specify default argument values, as in double sinq(double angle, int quadrant = 0);. Leave off the trailing argument(s) and the translator supplies the default values for you. The matching rules now have to consider matches with and without various numbers of the default arguments filled in. I won't even mention the complexities involved in matching a function like printf that accepts a varying-length list of arguments.
And we're still not done. The C++ Standard now calls for two kinds of "template" facilities. One kind lets you parameterize class definitions--so you can describe, for example, a stack of arbitrary objects. The other kind lets you parameterize functions, as in template<class T> T *find_in(const T *, T);. Now if you write char *p = find_in("hello", 'l'); the translator has to guess what arguments to the template would result in arguments to the function that match the function call. (As humorist Dave Barry would say, "I am not making this up.") In this case, the answer should be that T stands for type char. Not a hard matter to figure out here. But the standards committee is still trying to figure out how hard a guessing game is arguably too hard to ask of a C++ translator.
And that leaves Ada, as the last of our odious comparisons. Ada, too, features operator overloading and some interesting parsing issues. But I find Ada's particular claim to guessing-game fame lies in its tolerance for name ambiguity.
In Ada, you can nest namespaces in ways both useful and exotic, then write a name with enough qualifiers to trace a clear path from where you write the name to where it is properly declared. But then you can start erasing qualifiers and leave it to the translator to figure out which ones got elided. The basic rule seems to be, "If the translator has any chance at resolving the ambiguity, then it must permit the ambiguity and endeavor to resolve it the way it thinks best."
C++ has adopted much the same attitude. The most interesting examples of the "betcha can't tell" variety involve guessing which of several candidate declarations matches up with a given utterance of a name. What makes life even more interesting is that the matchup can occur with a declaration later in the program. Only rarely do multiple candidates get diagnosed as ambiguous. Usually, the translator is obliged to favor one choice over all others, even if that isn't your favorite choice.
You might also be interested to learn that the C++ standards committee, at its last meeting in Munich, has voted to add namespace structuring to the language. Get ready for a host of new and interesting code snippets.
So what does all this mean for the future of C++? At the very least, it means that we must develop some rigorous coding rules for avoiding the worst surprises. Good starts in this direction are C++ Programming Guidelines by Tom Plum and Dan Saks (Plum Hall Inc., 1991) and Scott Meyers' excellent Effective C++ (Addison-Wesley, 1992); Meyers states, and justifies, 50 rules that get you past some real pitfalls.
Pick a subset of the language that minimizes surprises, learn it well, and don't stray from it.
Subsetting works for users, but not for translator writers. Vendors of C++ compilers must still master all that complexity, if only to pass validation suites and to run those "betcha can't tell" examples properly. It is a telling observation that you can still count the number of distinct C++ implementations on your fingers. You can count the number that purport to match last year's draft C++ standard on your thumbs. By contrast, at the same point in the standardization of C, dozens of implementors attended each committee meeting. And they represented perhaps half of all the implementations of C in the world.
Rex Jaeschke recently speculated in a different direction in these pages. (See "C/C++ Standardization: An Update," DDJ, August 1993.) He discussed the possibility that the more successful parts of C++ might find their way back into C. He went so far as to provide a list of candidate features.
Since that article appeared, Rex and I have attended the latest meeting of the ISO C Standards committee (WG14) in London. That group has decided to start work very soon on the next revision of the C language, even though the process doesn't officially have to start for another two years. There is simply too much pertinent activity in POSIX, internationalization, and C++ to stand idly by while everyone else is innovating.
Please don't take this as an invitation to send in your favorite extension (or fix) to Standard C. It will be months before WG14 even decides how to proceed on reviewing potential revisions to Standard C. But please do take this as an indication that C is a living and flexible language, even if it is also an international standard.
A very real possibility is that C may look like C++ sooner than many people have thought likely. If that thought frightens you, know that many of us who care about C are a bit frightened as well. We're not in a hurry to test the aerodynamic properties of a gold-plated butterfly.
I think by now you know my criterion for retrofitting concepts from C++. I have no problem with classes, overloading, templates, or even nested namespaces. All have their demonstrable uses. Where I personally draw the line is with the guessing games. Practically every issue I've raised with C++ in this article can be avoided. Just add the odd type cast, or template instantiation, or name qualifier to your program, and the need for guessing games ends. C can remain a proper subset of C++ and still capture much of its added power. And it can stay relatively simple to implement, and to understand, by refusing to guess the programmer's intent beyond a reasonable doubt.
Will this scenario actually come to pass? Will C and C++ merge? Or will a new dialect, with a name like C+, emerge from the fusion? I can't say for certain, but it looks like a believable scenario. Your guess is as good as mine.
Copyright © 1993, Dr. Dobb's Journal