C/C++ Contributing Editors


Standard C/C++: Why 2K?

P. J. Plauger

Okay, so the millennium is almost upon us. But what does that have to do with the C and C++ standards?


We're coming down to the wire. By the time you read these words, you'll probably already have planned where you're going to spend New Year's Eve. The Big Odometer rarely turns over even two nines at once in a person's lifetime. Watching it turn over three of 'em is a pretty big deal. The purists can argue all they want about when the next century begins. All the rest of us will be partying big time, even if we're theoretically a year early.

As a pundit, I'm fast running out of room for punditry. I may be the only nominal computer expert on the planet who hasn't already milked at least one Y2K contract for a bit of extra spending money this year. As a civilian, I haven't even stored up canned goods and/or gold coins. (Though I am seriously considering getting in an extra gallon of milk on December 30th. Unless my teenage son is out of the house by then, as he earnestly plans.) I keep telling interviewers that I think the Y2K problem is overblown. For some reason, they don't call me back for follow-up interviews.

I do intend to save all my copies of CUJ. If I'm wrong about the impending societal disaster wrought by rogue software, I may well be burning them for fuel soon. Winters can be cold in Concord, Massachusetts. But truth to tell, I always save my copies of CUJ anyway. And PC Magazine offers a lot more BTUs per issue. It is my emergency fuel supply of choice.

So where does that leave me as a pundit? About where most other programmers are — stuck somewhere in the middle. As a vendor, I can't sell anything unless I promise that the products I supply will not add to the disaster. As a customer, I can't buy software unless I promise not to sue if the products I buy go nova in the next year or so. Maybe somebody out there besides me has some residual liability for Y2K software failures, but I suspect he's shoveling chicken manure most of the time and doesn't yet know he needs special insurance.

And that brings me to the serious issue I want to discuss this month. What are the implications of the turn of the millennium for code that endeavors to conform to the C or C++ standards? As a library vendor, I have had to ask that question, in one form or another, several times over the past few months. In some cases, it's lawyerly CYA that stimulates the query. Buyers the world around are setting up to blame their suppliers if anything goes wrong. But in other cases, it's genuine concern on the part of programmers. They want to know whether Standard C or C++ libraries will sabotage their best efforts to make their code Y2K safe. I can only sympathize with that desire.

To some extent, we can argue that this is a non-problem (and we do). The relevant standards dictate how a conforming implementation is supposed to work. In some cases, they dictate that years are represented as two decimal digits. Aha, say the doomsayers. This is just the sort of thing that is going to cause airplanes to fall from the sky. But no, say us standards wonks. The standards dictate the use of two-digit dates in some cases. It is not an accident or shortsightedness. A program that hopes for more than a two-digit date in these cases does not conform to the standard. You can't hold a conforming implementation responsible for problems that arise from its use in nonconforming ways.

Several companies provide validation suites for testing conformance of C and C++ implementations to the ISO standards. (Mine happens to be one of them.) Compiler and library vendors strive continually to get high marks when tested with these suites. We vendors can rightly argue that we have performed due diligence — a nice lawyerly term — in ensuring that our code meets specifications.

Do these suites test specifically for Y2K bugs? In the case of other suite vendors, I don't honestly know. My experience, however, is that they test for far more than problems that might occur in the next couple of months. So I am pretty confident that a validated library is as Y2K compliant as the C or C++ standards require. Looks like due diligence to me.

Still, it doesn't hurt to take a close look at where Y2K problems might arise in the standard libraries. Even if problems don't occur in the libraries proper, those potential trouble spots highlight where a program can depend on dates. And that tells you where to look for potential Y2K problems in the program.

Standard C Library

The people who first invented Unix and C were amateur astronomers. (See the chapter on <time.h> in my book, The Standard C Library, Prentice-Hall, 1992.) Thus, both the OS and the language were precocious in their treatment of times and dates. From the outset, Unix has stored times as a 32-bit signed count of seconds since 1 January 1970. Moreover, all stored times are in terms of Greenwich Mean Time (GMT, which has since become UTC). Dates are explicitly converted, for display, to a specified local time zone.
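As a rough illustration of that design, the sketch below breaks one stored time down two ways. It assumes a Unix-style time_t that counts seconds from 1 January 1970 UTC, which the C Standard itself does not promise. The helper names utc_fields and local_fields are invented for this example.

```cpp
#include <ctime>

// Break a stored time down into UTC calendar fields.
// Assumes a Unix-style time_t (seconds since 1 January 1970 UTC).
inline std::tm utc_fields(std::time_t when)
{
    std::tm* p = std::gmtime(&when);
    return p ? *p : std::tm();   // all-zero fields if conversion fails
}

// Break the same stored time down into local calendar fields.
// The result depends on the time zone configured for the process.
inline std::tm local_fields(std::time_t when)
{
    std::tm* p = std::localtime(&when);
    return p ? *p : std::tm();
}
```

On such a system, utc_fields(0) describes midnight on 1 January 1970, with tm_year equal to 70. The stored value never changes; only the conversion for display knows about time zones.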

By contrast, most other operating systems keep track of local time. Hiccups are inevitable as you move in and out of daylight saving time. Files exchanged across time zones require careful adjustment of last-modified dates, or they raise havoc with programs that reconcile file versions.

The C library started out on Unix, inheriting its attention to such details. As we developed the C Standard, however, we had to dilute the requirements for the time functions more than a little. It is, after all, a standard for C that should apply across all operating systems.

Truth to tell, the C Standard promises very little about the ability to tell time. In principle, a Standard C library could report all times as January 19, 1987 and arguably comply. But in practice, market forces dictate that a decent library do its best to tell time and keep track of a reasonable range of dates with some precision.

All the time-sensitive functions in the Standard C library are declared in the header <time.h>. Listing 1 shows a synopsis of this header. It defines three types that have some effect on representing times: clock_t, time_t, and struct tm. (The type size_t is defined for convenience in several standard headers. It has no direct involvement in representing times.)

If time_t is represented as a signed 32-bit count of seconds, there is no Y2K problem here. But there is certainly a Y2.037K problem brewing. Signed integer overflow occurs after about 68 years' worth of seconds with this representation. Of course, the C Standard does not require that time_t be a 32-bit integer, or that it represent times to one-second resolution. So there is more than one way to head off this problem, and nearly four decades in which to do it. For my part, I hope to be retired before then. I leave the implementation to you.
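The 68-year figure is easy to check. The sketch below assumes an average Gregorian year of 365.2425 days and a count that starts in 1970. The function names are made up for the illustration.

```cpp
#include <cstdint>
#include <limits>

// Seconds in an average Gregorian year (365.2425 days).
inline double seconds_per_year()
{
    return 365.2425 * 24.0 * 60.0 * 60.0;
}

// Years of range that a signed 32-bit count of seconds provides.
inline double years_until_overflow()
{
    return std::numeric_limits<std::int32_t>::max() / seconds_per_year();
}

// Calendar year in which a count that starts in 1970 runs out.
inline int overflow_year()
{
    return 1970 + static_cast<int>(years_until_overflow());
}
```

The quotient comes to a shade over 68 years, so the count runs out early in 2038. That makes 2037 the last full year such a clock can cover.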

struct tm is probably even less of a problem. It counts the years since 1900 as a signed int, in the field tm_year. Even with 16-bit ints, that allows for a broad range of dates. Note that tm_year is not just a two-digit year count. The year 2000, for example, is represented as the value 100, and 2001 is 101. No confusion or ambiguity here.
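The trouble starts only when a program treats tm_year as if it were a two-digit year. The sketch below contrasts a well-known mistake with the intended use. Both function names are hypothetical, chosen just for this example.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Well-known mistake: gluing "19" in front of tm_year.
// This works through 1999 and then yields five characters.
inline std::string buggy_year_text(const std::tm& t)
{
    char buf[16];
    std::snprintf(buf, sizeof buf, "19%d", t.tm_year);
    return buf;
}

// Intended use: tm_year counts years since 1900, so add 1900.
inline std::string correct_year_text(const std::tm& t)
{
    char buf[16];
    std::snprintf(buf, sizeof buf, "%d", t.tm_year + 1900);
    return buf;
}
```

With tm_year set to 100, the first function produces "19100" and the second produces "2000". The library field is fine in both cases. The bug lives entirely in the caller.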

The functions that care about what year it is are:

asctime
ctime
gmtime
localtime
mktime
strftime

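Of these, only the last, strftime, offers much opportunity for a genuine Y2K problem. Two of its format specifiers are:

%y    the year without century, as a decimal number (00 through 99)
%Y    the year with century, as a decimal number (1999, for example)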

If you convert dates with the former, you will indeed create ambiguities between years like 1900 and 2000. But you won't be unearthing Y2K problems in the library. It just does what it's told to do.
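Here is a small sketch of that ambiguity, assuming the two specifiers in question are %y (year without century) and %Y (year with century) and that the default "C" locale is in effect. The wrapper format_year is an invented convenience, not part of any library.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Format the given calendar year with a strftime format string.
// Only the year matters here; the date is fixed at 1 January.
inline std::string format_year(int year, const char* format)
{
    std::tm t = std::tm();       // zero all fields
    t.tm_year = year - 1900;     // years since 1900
    t.tm_mday = 1;               // day of month starts at 1
    char buf[16];
    std::size_t n = std::strftime(buf, sizeof buf, format, &t);
    return std::string(buf, n);
}
```

Called with "%y", both 1900 and 2000 come out as "00". Called with "%Y", they come out as "1900" and "2000". The choice of specifier, not the library, decides whether the century survives.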

Standard C++ Library

The Standard C++ library sits atop the Standard C library, both conceptually and in practice. That means it's likely to be no better — and no worse — in dealing with dates. The time-sensitive functionality is all encapsulated in a set of classes and template classes defined in the header <locale>:

class time_base;
template<class E, class InIt>
    class time_get;
template<class E, class InIt>
    class time_get_byname;
template<class E, class OutIt>
    class time_put;
template<class E, class OutIt>
    class time_put_byname;

Class time_base simply encapsulates an enumeration which is common to the various template classes. Template class time_get_byname lets you construct a time_get object whose behavior is determined by some named locale. And template class time_put_byname performs much the same service for time_put. These three entities otherwise have no direct effect on the representation of times in a C++ program.

Listing 2 shows a synopsis of template class time_get. It is a kind of locale facet. (See Standard C/C++: The facet time_get, CUJ, July 1998.) I won't describe facets in any detail here, having spent about a year discussing this extensive topic. Just know that a facet encapsulates rules for converting between text representation in files or streams and internal encoded forms. In particular, time_get is the agent responsible for parsing text and turning it into an internally encoded date and time — as a struct tm, in fact.
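For readers who have not used the facet directly, here is a minimal sketch of parsing a year through the public member get_year, which forwards to the protected virtual function. It assumes the classic "C" locale and char streams. The wrapper parse_year is hypothetical.

```cpp
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Parse a year from text with the time_get facet of the classic locale.
// On success, stores the years-since-1900 value and returns true.
inline bool parse_year(const std::string& text, int* year_since_1900)
{
    typedef std::istreambuf_iterator<char> Iter;
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    const std::time_get<char, Iter>& facet =
        std::use_facet<std::time_get<char, Iter> >(in.getloc());

    std::tm t = std::tm();
    std::ios_base::iostate err = std::ios_base::goodbit;
    facet.get_year(Iter(in), Iter(), in, err, &t);
    if (err & std::ios_base::failbit)
        return false;            // no usable year in the text
    *year_since_1900 = t.tm_year;
    return true;
}
```

A four-digit year such as "1999" should store 99 on any implementation. What happens with a two-digit year is up to the library, so a Y2K audit should test that case on each target.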

The function at the heart of any Y2K issues is the protected virtual member function do_get_year. It reads text and endeavors to parse a year number, for our old friend tm_year. How does it do it? The C++ Standard is characteristically laconic when it comes to the particulars of this or several other facets. All it says is:

Reads characters starting at s until it has extracted an unambiguous year identifier. It is implementation-defined whether two-digit year numbers are accepted, and (if so) what century they are assumed to lie in. Sets the t->tm_year member accordingly.

We could take this to mean that 100 stands for the year 100 AD (or CE, in politically correct Newspeak). That is arguably unambiguous to a Christian, I suppose. But I'm willing to bet that someone trying to make old code Y2K compliant would prefer a more liberal interpretation. Here's what our documentation says, by way of more detail:

The year input field is a sequence of decimal digits whose corresponding numeric value must be in the range [1900, 2036). The stored value is this value minus 1900. In this implementation, a numeric value in the range [0, 136) is also permissible. Values in the range [0, 69) represent the range of years [2000, 2069). Values in the range [69, 136) represent the range of years [1969, 2036).
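The ranges quoted above can be sketched as a single mapping function. This is only an illustration of the documented rule, not the library's actual source. The name year_to_tm_year and the boolean failure convention are assumptions made for the example.

```cpp
// Map a parsed year number to a tm_year value (years since 1900),
// following the documented ranges. Returns false if out of range.
inline bool year_to_tm_year(int value, int* tm_year)
{
    if (1900 <= value && value < 2036)
        *tm_year = value - 1900;     // full four-digit year
    else if (0 <= value && value < 69)
        *tm_year = value + 100;      // taken as 2000 through 2068
    else if (69 <= value && value < 136)
        *tm_year = value;            // taken as 1969 through 2035
    else
        return false;
    return true;
}
```

Under this rule, both 99 and 1999 store 99, while 0, 100, and 2000 all store 100.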

Note the attempt to ward off problems with the Y2.037K bug by warning of the limited valid range. Note also the meanings given various small numbers. The idea is to give the most likely colloquial meaning to dates within a few decades of the millennium, plus or minus. Two caveats, however:

Finally, Listing 3 shows a synopsis of template class time_put. It is not very revealing, but I provide it for completeness. What you need to know is that the protected virtual member function do_put is the agent responsible for turning a struct tm into text, under control of a format specifier.

The format specifiers happen to be the same as for the Standard C library function strftime. A sensible implementation actually calls this function to do the work (though with a bit of finagling, I hasten to point out). So time_put is not likely to introduce any new opportunities for Y2K problems.
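A minimal sketch of the output side, again assuming the classic "C" locale and char streams, with a hypothetical wrapper named put_year. It calls the public member put, which forwards to the protected virtual function.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Format a calendar year through the time_put facet of the classic
// locale, using a single strftime-style specifier such as 'y' or 'Y'.
inline std::string put_year(int year, char specifier)
{
    typedef std::ostreambuf_iterator<char> Iter;
    std::ostringstream out;
    out.imbue(std::locale::classic());
    const std::time_put<char, Iter>& facet =
        std::use_facet<std::time_put<char, Iter> >(out.getloc());

    std::tm t = std::tm();       // zero all fields
    t.tm_year = year - 1900;     // years since 1900
    t.tm_mday = 1;               // day of month starts at 1
    facet.put(Iter(out), out, out.fill(), &t, specifier);
    return out.str();
}
```

As with strftime, the specifier 'y' drops the century and 'Y' keeps it, so the same care applies when choosing between them.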

I hope this brief litany convinces you that the C and C++ libraries are not likely to be a rich source of Y2K bugs. The harder job, of course, is convincing your boss. And your customers. We'll all know more in a matter of months, I suppose.

P.J. Plauger is Senior Editor of C/C++ Users Journal and President of Dinkumware, Ltd. He is the author of the Standard C++ Library shipped with Microsoft's Visual C++, v5.0. For eight years, he served as convener of the ISO C standards committee, WG14. He remains active on the C++ committee, J16. His latest books are The Draft Standard C++ Library, Programming on Purpose (three volumes), and Standard C (with Jim Brodie), all published by Prentice-Hall. You can reach him at pjp@plauger.com.