P.J. Plauger has recently compiled his long-running "Programming on Purpose" column into a series of books of the same title. He can be contacted at pjp@plauger.com.
Speculating about programming languages is a popular indoor sport that inspires powerful passions, both in players and onlookers. Try telling a Boston Celtics fan that the team is ailing and well past its prime. If you have any teeth left after that exercise, try telling a C programmer that C is a primitive and obsolete language.
You have about as much luck convincing a C++ advocate that the language is deficient in any way. The arguments in its defense are legion, and all couched in intellectual terms. But the driving engine behind many of the statements is emotional conviction, pure and simple.
I find nothing wrong with either of these positions. I genuinely believe that C is far from dead as a popular programming language. I also am convinced that C++ has an interesting present and a bright future. And I have no problem with programmers forming emotional attachments to the tools of their trade. (I do prefer, however, that people learn to distinguish emotion from reason. Each works best in its own arena.)
When it comes to programming languages (and text editors, and operating systems), I am essentially Darwinian. It is much more fruitful, to me at least, to simply observe which languages attract users, get chosen for real projects, and survive.
Similarly, you can be the greatest object-oriented purist in the world, and hate C++ for all its ugly pragmatism and impurity of paradigm. (There, we got that word out of the way early.) Still, you really ought to notice that C++ is also well on its way to being a major success by the above metrics. And whatever object-oriented language is second in popularity is a very distant second to C++.
Here's where my Darwinian leanings raise a red flag.
For all its promise, C++ is also, unequivocally, a complex language--and history has not been kind to complex programming languages. Three that spring to mind are PL/I, Algol 68, and Ada. Each is a product of a different decade and a different (potential) user community, but each has followed a similar trajectory.
First there is the perceived need. Existing programming languages simply lack all the features now demanded by more sophisticated programmers, who now work on much larger projects. A small group forms, consisting of designers who inhabit that turbulent ocean midway between sophisticated users and experienced compiler writers. Before you know it, they have taken an existing lily and gilded it beyond recognition. The result shines brighter than what went before, but is also substantially heavier.
Then comes the business of selling. The dividing line blurs between the demand pull of putative customers and the supply push of vendors with a stake in the new technology. It's always the most sophisticated and most outspoken programmers who sign up first. With all those good (stylish) ideas packed in there, the new language has something for just about anyone's taste. Invariably, production programmers are told repeatedly that they are on the verge of unemployment, or even extinction, if they don't switch over to this new technology as soon as possible.
The rising tide follows. Conferences fill up first with tutorials, then with more and more technical sessions on the wonderful new language. Magazines (like this one) run special articles, then special theme issues, then standing columns on the topic. New conferences and magazines emerge just to serve the new constituency. Books appear like mushrooms after a rain. Those who haven't had occasion to try the new language begin to doubt their personal sanity, or the viability of their current employers.
At the crest, programmers for this wonderfully complex new language command premium salaries. Recruiters steal company phone books to make cold calls on the experts. The most popular lunchtime game is the five-line program with the caption, "Betcha can't tell what this does." Maintaining and enhancing code in older languages goes from a second-class activity to third class.
Then the doubts creep in. A competitor beats you to market even though still mired in that old-fashioned programming language of yore. Your enterprise suffers a few minor project disasters from complexity overload, or one really big one. Salary differentials get really out of hand. One by one, new projects get specified in terms of simpler programming languages, over the protests of the elite.
In the twilight days, the number of hotshots has been reduced to an absolute minimum. Now they're stuck on maintenance. Even worse, they have to suffer interviews by analysts charged with retooling the most useful pieces of the old, expensive systems in the current (simpler) language of choice. The hotshots grumble about management stupidity (hardly a new theme among programmers) and about how weak and unsafe programming languages have become. Their only consolation is that no programming language ever dies completely, thanks to past investments in code. They have a sinecure that will last until retirement, if they choose to stop advancing in their careers.
The consolation for the rest of us is that each language shapes the ones to come. Future designers steal the best bits and leave the worst failures to rot on the vine. Sic transit gloria mundi.
What causes some programming languages to follow this orbit? Well, in some ways, all languages do so. The ones I've singled out simply had a rise and fall that was faster and more spectacular than many people expected. In that sense, they were all oversold. And, as I conjectured at the outset, they are all overly complex.
So where does that leave C++? If it's following the inevitable fate of complex languages, then it's arguably still in the "rising tide" stage. We can trust that its popularity and importance are both still growing. But we can also trust that it will not be the programming language of choice in the year 2000.
On the other hand, historical analogies are never exact. C++ is firmly rooted in a highly successful language. It overcomes some of the recognized shortcomings of C at an opportune moment in history. Perhaps demand pull will be strong enough to rescue the language from gravitational collapse.
I can see that happening, however, only if we address the complexity issue wisely and in time. We need to appreciate just what sorts of complexity lead to untimely death, and what sorts of features turn out to be worth rescuing.
I opined at greater length about PL/I, Algol 68, and Ada in a magazine article several years ago. (See "Programming on Purpose: The Central Folly," Computer Language, April 1988. It also appeared in a collection. See Essay 4 in Programming on Purpose III: Essays on Software Technology, Prentice Hall, 1993.) I'll be brief in repeating the relevant parts of that essay.
My observation was that designers of complex languages like guessing games. They seem to feel that programmers want to write the absolute minimum for each piece of code. It is then up to the translator to decipher that terse code, by predictable rules, into an executable program. A good translator is one that can do the job with a minimum of clues from the source code.
APL and C are also terse, but not by the same metric. These two languages use lots of operators that need only a character or two. Thus, they encourage a style that is brief, even cryptic, at least to those who aren't comfortable with mathematical notation. The nearest thing to the kind of guessing games I'm talking about is the mixed-mode arithmetic permitted by C (and its predecessor Fortran). Depending on the types of the two operands, the compiler has to guess whether a + operator, for example, should perform an addition of floating-point, fixed-point, or pointer operands, in any of several possible representations for each. Even so, we're talking about a brief table, or a handful of rules.
PL/I inherited the same attitude from Fortran, but then went wild. First, it proliferated data types. Besides floating-point versus fixed-point (now with a scale factor) arithmetic types, PL/I lets you specify practically any combination of binary or decimal base, real or complex format, and a broad range of precisions. That handful of mixed-mode rules for Fortran explodes to pages of explanation--and any number of questionable guesses. A classic PL/I gotcha is the test IF 1 THEN .... The 1 is treated as a one-digit real fixed decimal constant. To make a Boolean test, the translator converts it to the bit string 0001, then tests the leading bit, which is 0. Thus, both 0 and 1 test false. An unreasonable conclusion from a series of apparently reasonable micro decisions. Yee hah.
C++ extends the guessing game one level farther. You can add to the set of permissible operands for the arithmetic operators. Just write "overloaded" definitions of the built-in operator functions with declarations such as myclass operator+(myclass&, myclass&).
Now the excitement really builds. A C++ translator has to consult lists of permissible operand combinations. Some are built-in, some user defined. Other lists prescribe chains of candidate conversions that might bridge between the actual operands and a permissible combination. Almost invariably, more than one combination of conversion chains and operator overloading fill the bill. That means the language has to decide whether the "best" choice is sufficiently better than the second best to favor it, or whether it is wiser to diagnose an ambiguity.
All of this machinery evolved from the simple desire to not require that a programmer write "obvious" type conversions explicitly.
PL/I also insisted on challenging the parser. The language has no reserved keywords, so you can write horrors like
IF IF = THEN
THEN THEN = ELSE
ELSE ELSE = IF
Each of the four distinct tokens here has two quite distinct meanings.
C++ pushes parsing technology to the extreme. It has more operators than C, and even more with multiple meanings. You can't parse the language in a single pass, or with a finite amount of lookahead. In fact, the committee standardizing C++ still occasionally argues about certain abstruse parses. Most of the issues boil down to having the translator guess whether a sequence of tokens constitutes a type designation or an expression. A small problem in C has mushroomed into a major subtopic in C++. All to avoid having the programmer learn too many operators, or write long-ish keywords.
Which brings us to Algol 68. I could easily take that language to task for parsing problems, because its grammar has an infinite number of productions. But that amounts to shooting at life rafts. Having an endless grammar is only one of the ways in which Algol 68 took guessing games to new heights. Where it really stands out is in the shorthand it permits the programmer.
You know how to call a function in C, by writing something like f(x,y). If the function has no arguments, you write f(). Those empty parentheses are a clear signal that a function is being called. Under similar circumstances, Algol 68 lets you omit the parentheses and just write f to call the function. No big deal. Even Pascal is that tolerant.
But I'm not done yet. In C, if you want the contents of an object designated by a pointer, you write something like *px or **ppx. If you want to talk about the pointer itself, you write px or ppx, or even *ppx. Not so in Algol 68. That language wants you to omit the stars. It then guesses from context how many to put back. Take your favorite C or C++ program, erase all the empty parentheses and indirection operators, and see how readable a result you get.
The Algol 68 manual waxes eloquent for several pages, in fact, about the rules for second-guessing the translator. If they're that hard to make clear, you can guess how hard they are for translator writers to get right. Or for future readers of the code to guess the same way as the original author did. And all to save writing a few parentheses and stars.
C++ does a little of this sort of thing, in places like the bodies of member functions. But it's nowhere near as ambitious as Algol 68 in this particular guessing game. Don't worry, it makes up for it in other arenas.
I mentioned operator overloading earlier, as a special case of function overloading. You can use the same function name with a host of different argument combinations. (They're called "function signatures" in conjunction with the name). The C++ translator then considers not just argument types, but also the number of arguments, and that ineffable figure of merit from above, in determining which of several samely yclept functions to call.
We're still not done. You can specify default argument values, as in double sinq(double angle, int quadrant = 0);. Leave off the trailing argument(s) and the translator supplies the default values for you. The matching rules now have to consider matches with and without various numbers of the default arguments filled in. I won't even mention the complexities involved in matching a function like printf that accepts a varying-length list of arguments.
And we're still not done. The C++ Standard now calls for two kinds of "template" facilities. One kind lets you parameterize class definitions--so you can describe, for example, a stack of arbitrary objects. The other kind lets you parameterize functions, as in template<class T> T *find_in(const T *, T);. Now if you write char *p = find_in("hello", 'l'); the translator has to guess what arguments to the template would result in arguments to the function that match the function call. (As humorist Dave Barry would say, "I am not making this up.") In this case, the answer should be that T stands for type char. Not a hard matter to figure out here. But the standards committee is still trying to figure out how hard a guessing game is arguably too hard to ask of a C++ translator.
And that leaves Ada, as the last of our odious comparisons. Ada, too, features operator overloading and some interesting parsing issues. But I find Ada's particular claim to guessing-game fame lies in its tolerance for name ambiguity.
In Ada, you can nest namespaces in ways both useful and exotic, then write a name with enough qualifiers to trace a clear path from where you write the name to where it is properly declared. But then you can start erasing qualifiers and leave it to the translator to figure out which ones got elided. The basic rule seems to be, "If the translator has any chance at resolving the ambiguity, then it must permit the ambiguity and endeavor to resolve it the way it thinks best."
C++ has adopted much the same attitude. The most interesting examples of the "betcha can't tell" variety involve guessing which of several candidate declarations matches up with a given utterance of a name. What makes life even more interesting is that the matchup can occur with a declaration later in the program. Only rarely do multiple candidates get diagnosed as ambiguous. Usually, the translator is obliged to favor one choice over all others, even if that isn't your favorite choice.
You might also be interested to learn that the C++ standards committee, at its last meeting in Munich, has voted to add namespace structuring to the language. Get ready for a host of new and interesting code snippets.
So what does all this mean for the future of C++? At the very least, it means that we must develop some rigorous coding rules for avoiding the worst surprises. Good starts in this direction are C++ Programming Guidelines by Tom Plum and Dan Saks (Plum Hall Inc., 1991) and Scott Meyers' excellent Effective C++ (Addison-Wesley, 1992); Meyers states, and justifies, 50 rules that get you past some real pitfalls.
Pick a subset of the language that minimizes surprises, learn it well, and don't stray from it.
Subsetting works for users, but not for translator writers. Vendors of C++ compilers must still master all that complexity, if only to pass validation suites and to run those "betcha can't tell" examples properly. It is a telling observation that you can still count the number of distinct C++ implementations on your fingers. You can count the number that purport to match last year's draft C++ standard on your thumbs. By contrast, at the same point in the standardization of C, dozens of implementors attended each committee meeting. And they represented perhaps half of all the implementations of C in the world.
Rex Jaeschke recently speculated in a different direction in these pages. (See "C/C++ Standardization: An Update," DDJ, August 1993.) He discussed the possibility that the more successful parts of C++ might find their way back into C. He went so far as to provide a list of candidate features.
Since that article appeared, Rex and I have attended the latest meeting of the ISO C Standards committee (WG14) in London. That group has decided to start work very soon on the next revision of the C language, even though the process doesn't officially have to start for another two years. There is simply too much pertinent activity in POSIX, internationalization, and C++ to stand idly by while everyone else is innovating.
Please don't take this as an invitation to send in your favorite extension (or fix) to Standard C. It will be months before WG14 even decides how to proceed on reviewing potential revisions to Standard C. But please do take this as an indication that C is a living and flexible language, even if it is also an international standard.
A very real possibility is that C may look like C++ sooner than many people have thought likely. If that thought frightens you, know that many of us who care about C are a bit frightened as well. We're not in a hurry to test the aerodynamic properties of a gold-plated butterfly.
I think by now you know my criterion for retrofitting concepts from C++. I have no problem with classes, overloading, templates, or even nested namespaces. All have their demonstrable uses. Where I personally draw the line is with the guessing games. Practically every issue I've raised with C++ in this article can be avoided. Just add the odd type cast, or template instantiation, or name qualifier to your program, and the need for guessing games ends. C can remain a proper subset of C++ and still capture much of its added power. And it can stay relatively simple to implement, and to understand, by refusing to guess the programmer's intent beyond a reasonable doubt.
Will this scenario actually come to pass? Will C and C++ merge? Or will a new dialect, with a name like C+, emerge from the fusion? I can't say for certain, but it looks like a believable scenario. Your guess is as good as mine.
Copyright © 1993, Dr. Dobb's Journal