Standard C

Formal Changes to C

P.J. Plauger


P.J. Plauger is senior editor of The C Users Journal. He is convenor of the ISO C standards committee, WG14, and active on the C++ committee, WG21. His latest books are The Standard C Library, published by Prentice-Hall, and ANSI and ISO Standard C (with Jim Brodie), published by Microsoft Press. You can reach him at pjp@plauger.com.

Current Status

The last meeting of the C standards committees, ISO JTC1/SC22/WG14 and ANSI-authorized X3J11, occurred jointly last December near Dulles Airport. I left that meeting with a warm sense of accomplishment and a humongous amount of homework. Both were a direct result of having the meeting go the way I'd hoped, for a change. In soap operas, senators and board chairpeople always finagle their political goals against all odds. In the real world, we lowly Convenors of standards committees mostly go with the flow.

I summarized the administrative highlights in my "Editor's Forum" last issue. (That was in CUJ March 1993, but see also the "Editor's Forum" for January 1993 and for November 1992.) Here again is a brief synopsis of what happened:

For these and other reasons, I have resigned as Secretary and member of X3J11. I get all the glory I need as Convenor of WG14, thank you. And I seem to have more than enough work as well. That homework I mentioned earlier has occupied me for over a month, off and on, since the last meeting. The responsibility lies with me to prepare the normative addendum for SC22 balloting as a Committee Draft (CD). I also inherit several years of X3J11 interpretations as a huge batch of Defect Reports to log and organize.

My aim in writing this report is not to win your sympathy. (But I'll take any I can get, by the way.) Rather, it's to spell out the current formal activities in the C standard arena. Note that this report does not cover the work of X3J11.1 (a.k.a. the Numerical C Extension Group, or NCEG). That's because the charter of that subcommittee is to produce only a Technical Report (TR). In X3 land, a TR does not have the force of a standard. It is simply advisory. You'll hear more about X3J11.1 in future installments of this column.

I need to cover a lot of administrivia first, so please bear with me. I promise to give you a few technical details of what's happening to Standard C before I'm done.

The Normative Addendum

The normative addendum has, until recently, consisted of three contributions, each put forth by a separate member body:

The biggest change we agreed to last December was to delete the UK contribution from the normative addendum. Don't think we considered it unimportant — quite the contrary. Rather, we observed that the new machinery for handling Defect Reports offered more apropos vehicles for publishing the work of the UK delegation. So we threw this piece over the wall, as it were.

The other two pieces got final approval at the meeting. Both, however, suffered from a serious shortcoming. They needed to be translated into better "standardese." The Danish contribution evolved as a one- or two-page statement of intent. The Japanese contribution was remarkably refined, given the difficulty that English presents to the Japanese. But still there were places where the wording was a bit rough, or where more formal jargon was called for.

Lucky for me, Dave Prosser took it upon himself to correct these problems. As the final Redactor (editor) of the C Standard, Dave speaks standardese like an ISO bureaucrat. He also understands C better than practically anybody else I know. By the time he completed a pass over the normative addendum, I had little left to do except carp at details, then make a stack of review copies.

As of this writing, a review committee is checking our work. Once we get everyone's approval, I will submit the document to SC22 for CD balloting. By the time you read this, the balloting should be under way. My goal is to have the balloting period close shortly after the next X3J11 meeting (New York City in May), and before the next WG14 meeting (London in July). That's all part of a little game of brinksmanship that we Convenors play all the time.

Defect Reports

Meanwhile, back at the ranch, I have this great stack of interpretations from X3J11. Four dozen Requests for Interpretation have percolated through ANSI official channels since the C Standard was approved in 1989. Over the years, X3J11 has patiently addressed and debated every one. The result has been two Technical Information Bulletins (TIBs) summarizing the RFIs and committee responses.

I described some of the earliest RFIs and responses in these pages way back when. (See "Standard C: A Matter of Interpretation," CUJ June 1990, and "Standard C: Interpreting the Nasties," CUJ July 1990.) Other people have also discussed some of the interpretations here and in other publications. Sadly, however, the TIBs have yet to be officially published by ANSI.

Now it looks like they never will be. An administrative foulup or two delayed the publication of TIB #1. Then ANSI switched over to the ISO C Standard and the situation changed. No longer was ANSI obliged to interpret the C Standard, since it was now an ISO document. Worse, it wasn't clear whether ANSI was even permitted to issue interpretations, under the agreement with ISO. TIB #2 sailed straight into the same swamp. Now both are mired in bureaucratic uncertainty.

We didn't want to lose all those probing questions to public view. And we certainly didn't want to waste the carefully crafted responses. So I accepted the obligation to treat each of the ANSI RFIs as a separate Defect Report. I've ensured that Defect Reports #001 through #048 correspond to ANSI RFIs #01 through #48. (And I've already been handed Defect Report #049 through a separate channel, even before the dust has settled on the changeover.)

I've built this 100-page (typeset) Defect Report Log. It contains the original ANSI RFIs, each accompanied by a "suggested response" — the response crafted by X3J11 for publication in a TIB. And remember all those examples from the UK contribution of the normative addendum? Well, I dealt them out as appropriate among the RFIs. Each example is labeled as a "suggested correction" to the C Standard.

That's not the end of it, of course. X3J11 developed most of its responses under a severe constraint. We were originally told that we could not change a single word of the C Standard. Even if a slight change of wording, or an added sentence, could clarify our intent without changing the language definition, we couldn't make the change. Thus, we put a lot of energy into rationalizing that you could read the C Standard the way we intended. That's not the best way to respond to a serious complaint from a confused questioner.

Now WG14 has machinery for making such clarifications, as Technical Corrigenda. The sentiment among many members of both WG14 and X3J11 is that we should not waste this opportunity. We could simply publish the two ANSI TIBs as an ISO Record of Response. That would get the interpretations out to the public (at last) fairly quickly. But it would leave us in the position of rationalizing bad standards language instead of fixing it.

So my task instead is to circulate this Defect Report Log among the membership of both committees. I hope that X3J11 can give us prompt guidance about the best way to respond to each Defect Report. Either we accept the explanation from the ANSI TIB, we include the example from the UK contribution, or we develop amended wording to clarify the C Standard. (I like to think that only one of these three options will suffice in each case.)

I hope for prompt guidance because this process has already dragged on for too long. The sooner we can clarify the gray areas of Standard C for the world at large, the happier I'll be.

The Danish Contribution

Now for a few technical details. The Danish contribution requires that all implementations of Standard C add a header called <iso646.h>. (The name honors the ISO standard which corresponds to ASCII, except that it permits certain graphics to be substituted for those we Americans know and love.) Listing 1 shows the contents of this header.

Note that you can use this file as is with any variant of ISO 646. It just prints funny on some national variants of that character set. The idea, in fact, is to confine most of the funny printing to just this header (which you should seldom feel moved to print). You can then write:

   if (x != 0 || x != XMAX)
      .....
as

   if (x ne 0 or x ne XMAX)
      .....
and the code should be readable with any national variant. If that is not important to you, don't include the new header. Then none of the new macros conflict with any names you choose.

Besides this header, all implementations of Standard C must also recognize alternate spellings for six tokens:

   <:   :>   <%   %>   %:   %:%:
   /* are the same as */
   [    ]    {    }    #    ##
   /* respectively */

Because they are just alternate ways to spell the same token, you can balance <: with ], if you want to be perverse. And if you "stringize" one of these alternate forms, you get a different result than when using the older token (or its trigraph form). Thus:

#define STR(X) #X
printf(STR( <: ) STR( { ) STR( ??< ));

prints <:{{.
Before you start writing letters, let me make a few observations:

The point is that this addition is not perfect. I'm pretty convinced after years of trying, though, that perfection is unattainable here. This particular approach is good enough. It can also be argued that the problem this addition solves is small and rapidly getting smaller. Others are pretty convinced, though, that it is still a problem worth solving. I believe the clutter is small enough that the rest of us should be tolerant.

The Japanese Contribution

By far the largest part of the normative addendum is the Japanese component. I count one new macro, three new type definitions, and 60 (!) new functions. The basic idea is to provide a complete set of parallels between functions that act on one-byte characters and functions that act on the newer wide characters. It's too bad we have to invent a whole set of variant names for these new functions. (That's one of the ways that C++ has improved code hygiene over C.) But I believe the time is ripe to introduce better wide-character support.

Windows NT traffics consistently in (16-bit) wide characters. It exemplifies the new trend toward supporting large and varied character sets. Multiple sets of 256 characters just don't cut it for systems and applications with an international market.

I plan to devote next month's column to a detailed look at the Japanese contribution. It's too big to do justice to in the space remaining here. For now, I'll simply summarize what it contains:

Some of this stuff sounds redundant, and it is. Still, there are good reasons for each of the additions. I'll do my best to convince you of that next month.