Now that we have a standard for C++ libraries, we need to know how to determine which libraries conform to the standard.
Introduction
Last month, I began discussing the business of testing conformance of programming languages to various standards. (See "Standard C/C++: Testing Conformance," CUJ, March 2000.) Specifically, I focused on testing C and C++ implementations against their respective ISO Standards. I also have a commercial interest in testing our Java library for conformance, but the Java "standard" is rather more equivocal. (See "Standard C/C++: Java Standard Time," CUJ, January 2000.) I described, in general terms, several commercial validation suites that provide objective, if anecdotal, measures of conformance.
This time around, I confine my attention to testing conformance of a Standard C++ library to the recently approved ISO standard. It's a topic dear to my heart because, over the past eight years, I've written both the library and a validation suite for same. My company, Dinkumware, Ltd., now licenses assorted libraries and validation suites for C, C++, and Java. Of course, we like our customers to get both (the suites are a great way to check that a library port hasn't broken anything), but we're ecumenical. We license either without the other.
But this is a technical column, not a paid advertisement. My purpose here is to describe in some detail the Standard C++ library and what is involved in testing it. As I mentioned last month, more and more enterprises are beginning to claim conformance to the C++ Standard, which has now been stable for over two years. The Standard C++ library is pretty big and pretty new. It helps to know what "conformance" really means, in this context.
The C++ Standard
The official definition of Standard C++ is the document ISO/IEC 14882:1998. You can download a machine-readable copy of the whole thing from the ANSI web site for a mere $18. (See http://webstore.ansi.org.) The C++ Standard is organized into 27 clauses, plus five annexes. Two of the annexes are normative: they tell you things you have to do in a conforming implementation. The remaining three are purely informative: they do not impose requirements on a conforming implementation.
The first part of the C++ Standard focuses on the language proper. Here are the titles of the clauses:
1 General
2 Lexical conventions
3 Basic concepts
4 Standard conversions
5 Expressions
6 Statements
7 Declarations
8 Declarators
9 Classes
10 Derived classes
11 Member access control
12 Special member functions
13 Overloading
14 Templates
15 Exception handling
16 Preprocessing directives
Also, Annex E defines "Universal character names for identifiers," which is normative.
Some of these clauses are relatively straightforward. Lexical conventions, Standard conversions, Expressions, and Statements (to name just a few topics) derive largely from Standard C. And the C Standard tinkered very little with those concepts since the first draft of Kernighan & Ritchie's classic opus. But other topics, such as Overloading and Templates, involve considerable invention beyond the days of C, and considerable complexity. Such topics require a deal of defining. Luckily, there's no law within ISO that says all clauses in a programming language standard have to be the same length.
The second part of the C++ Standard defines the library. Here are the titles of the clauses:
17 Library introduction
18 Language support library
19 Diagnostics library
20 General utilities library
21 Strings library
22 Localization library
23 Containers library
24 Iterators library
25 Algorithms library
26 Numerics library
27 Input/output library
Also, Annex D defines "Compatibility features," which is normative.
As with the language part, these clauses range from the very simple, such as the Diagnostics library, to the very extensive, such as the Localization library and the Input/Output library.
I intend to review these library clauses in order, from the standpoint of writing a set of tests for conformance to the C++ Standard. Such a set of tests is commonly called a validation suite. As I discussed last month, a validation suite is the most pragmatic tool for determining the degree of conformance of an implementation to a given specification, such as the C++ Standard.
Listing 1 shows one such test, for the requirement in Clause 17 that headers can be included more than once. Of course, the test is hardly exhaustive. All it can do is include a couple of representative headers, and ensure that the header <cassert> gets the special treatment it deserves. The test itself consists of two chunks of code, one at file level and one inside function main.
Each of the chunks is wrapped with the same sort of ifdef logic, which allows a variety of conditional testing options when tests are run in batches. The header "defs.h" defines a number of functions and macros used for logging tests, such as begin_chk and end_chk, and for reporting the results of tests. The function ieq, for example, reports a failure if its two integer arguments are not equal. This particular code is representative of the tests in the Dinkum C++ Proofer, but you will probably find something similar in practically any validation suite.
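For readers without the listing at hand, here is a minimal sketch of what a multiple-inclusion test could look like. It is not Listing 1 and not code from the Dinkum C++ Proofer: the reporter report_ieq is a hypothetical stand-in for a suite function such as ieq, and the ifdef batching logic is omitted.

```cpp
#include <cstddef>
#include <cstddef>   // a second inclusion must be harmless
#include <cstdio>
#include <string>
#include <string>    // likewise for a C++-only header

#ifndef NDEBUG
#define NDEBUG       // ask for assert to be disabled
#endif
#include <cassert>

// With NDEBUG defined, assert must not evaluate its argument.
int count_disabled_evaluations() {
    int hits = 0;
    assert(++hits);
    return hits;
}

#undef NDEBUG
#include <cassert>   // re-inclusion must re-enable assert

// Without NDEBUG, assert evaluates its argument (nonzero here, so no abort).
int count_enabled_evaluations() {
    int hits = 0;
    assert(++hits);
    return hits;
}

// Hypothetical reporter: returns 1 and logs a line if the values differ.
int report_ieq(const char* what, int actual, int expected) {
    if (actual == expected)
        return 0;
    std::printf("FAIL %s: got %d, expected %d\n", what, actual, expected);
    return 1;
}

// Returns the number of failed checks.
int run_inclusion_checks() {
    int failures = 0;
    failures += report_ieq("assert disabled", count_disabled_evaluations(), 0);
    failures += report_ieq("assert enabled", count_enabled_evaluations(), 1);
    return failures;
}
```

Calling run_inclusion_checks() from main and treating a nonzero result as a failure is enough to drive the sketch. The header <cassert> is the interesting case because each inclusion must redefine assert according to the current state of NDEBUG.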
Clause 17 Library Introduction
Clause 17 provides an overview of the library, defines a number of terms, and specifies several library-wide requirements. As such, it provides few objective statements that can be tested with executable code. For example, it makes pronouncements such as, "This library also makes available the facilities of the Standard C library, suitably adjusted to ensure type safety." Perhaps a tester can use this as an excuse to test the presence of Standard C library headers, but that opportunity comes often enough later. There's certainly no way to test whether the Standard C library has been "suitably adjusted," at least not without the details that follow. As standardese goes, Clause 17 contains a lot of unnecessary words.
A small island of testable wording appears about two thirds of the way into Clause 17. Subclause 17.4.2 talks about "Using the library." It lists all the required C++ headers and the ways they can be included in a program. It also discusses handler functions, such as the ones associated with set_new_handler, and replaceable functions, such as the simpler forms of operator new.
Some important tests are:
- whether all the headers are present, particularly those that depend on features added with Amendment 1 to the C Standard (<cwchar> and <cwctype>)
- whether any of the headers contain common reserved names, such as first or open
- whether handler functions can indeed be replaced
- whether replaceable functions are indeed replaceable
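The handler test can be sketched roughly as follows. This is an illustration under stated assumptions, not code from any validation suite: the function names are hypothetical, and the sketch assumes that a request for the largest possible size_t value fails on the implementation under test.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static int handler_calls = 0;

// Replacement new-handler: records the call, then gives up by throwing
// bad_alloc, one of the behaviors a handler is permitted to have.
void counting_handler() {
    ++handler_calls;
    throw std::bad_alloc();
}

// Installing a handler and then restoring the old one should hand back
// the handler that was installed.
bool handler_round_trips() {
    std::new_handler old = std::set_new_handler(&counting_handler);
    std::new_handler mine = std::set_new_handler(old);
    return mine == &counting_handler;
}

// Returns how often the handler ran during one failed allocation,
// or -1 if the allocation did not fail with bad_alloc.
int handler_calls_on_failure() {
    handler_calls = 0;
    std::new_handler old = std::set_new_handler(&counting_handler);
    bool threw = false;
    // volatile keeps the compiler from reasoning about the size
    volatile std::size_t huge = static_cast<std::size_t>(-1);
    try {
        void* p = ::operator new(huge);
        ::operator delete(p);   // not expected to be reached
    } catch (const std::bad_alloc&) {
        threw = true;
    }
    std::set_new_handler(old);
    return threw ? handler_calls : -1;
}
```

A test driver would expect handler_round_trips() to be true and handler_calls_on_failure() to report exactly one call, since the handler here throws on its first invocation.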
Clause 18 Language Support Library
Clause 18 defines the contents of the headers <limits>, <new>, <typeinfo>, and <exception>.
The header <limits> defines a template class useful for summarizing the numeric properties of an arbitrary type. It also defines explicit specializations for all the builtin arithmetic types, including the character types and bool.
The remaining three headers define library entities that are known to the language translator proper. For example:
- A new expression calls various forms of operator new declared or defined in <new>.
- A typeid operator yields an object of type type_info, defined in <typeinfo>.
- An unexpected exception results in a call to unexpected, declared in <exception>.
Clause 18 also imposes additional requirements, beyond those of the C Standard, on the headers <cstdarg>, <csetjmp>, <cstddef>, <cstdlib>, and <csignal>. Mostly, these requirements are additional constraints on types and function arguments, to deal with small differences between the worlds of C and C++.
Finally, Clause 18 contains vacuous remarks about the headers <climits>, <cfloat>, and <ctime>. They add nothing to requirements already spelled out in other parts of the C++ Standard.
Some important tests are:
- whether <limits> accurately reports all the properties of the builtin types
- whether all the definitions of operator new and operator delete are present
- whether the replaceable versions of operator new and operator delete are indeed replaceable
- whether objects yielded by typeid have the expected properties
- whether class exception serves as the base class for exceptions such as bad_alloc, bad_cast, and bad_exception
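A few of these checks can be sketched in a handful of lines. The following is only an illustration with hypothetical function names; a real suite would check every member of numeric_limits for every builtin type.

```cpp
#include <cassert>
#include <climits>
#include <exception>
#include <limits>
#include <new>
#include <typeinfo>

// <limits> should agree with the macros that <climits> has always provided.
bool limits_match_climits() {
    return std::numeric_limits<int>::is_specialized
        && std::numeric_limits<int>::max() == INT_MAX
        && std::numeric_limits<int>::min() == INT_MIN
        && std::numeric_limits<unsigned char>::max() == UCHAR_MAX
        && std::numeric_limits<bool>::is_specialized
        && !std::numeric_limits<unsigned>::is_signed;
}

// True if a default-constructed E can be caught as std::exception.
template <class E>
bool caught_as_exception() {
    bool caught = false;
    try {
        throw E();
    } catch (const std::exception&) {
        caught = true;
    } catch (...) {
    }
    return caught;
}

struct Base { virtual ~Base() {} };
struct Derived : Base {};

// typeid on a polymorphic reference reports the dynamic type,
// and ignores top-level const.
bool typeid_behaves() {
    Derived d;
    Base& b = d;
    return typeid(b) == typeid(Derived)
        && typeid(b) != typeid(Base)
        && typeid(int) == typeid(const int);
}
```

The template caught_as_exception can be instantiated for bad_alloc, bad_cast, bad_typeid, and bad_exception in turn.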
Clause 19 Diagnostics Library
Clause 19 defines the contents of the header <stdexcept>. It defines a hierarchy of classes all based on class exception, defined in <exception>. The Standard C++ library uses a couple of these classes for reporting errors. For example, template class basic_string throws an object of class length_error if you attempt to construct an object too large to represent. The remaining classes are simply provided as a convenience for the programmer.
Clause 19 also contains vacuous remarks about the headers <cassert> and <cerrno>. They add nothing to requirements already spelled out in other parts of the C++ Standard.
Some important tests are:
- whether all the classes are defined, in the proper hierarchy
- whether each class defines the appropriate member functions
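A hierarchy check of this kind might be sketched as below. The helper names are hypothetical, and the length_error probe uses reserve as one convenient way to ask basic_string for more than max_size() elements.

```cpp
#include <cassert>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <string>

// True if Thrown, constructed from a message, can be caught as Base
// and reports that same message through what().
template <class Thrown, class Base>
bool thrown_is_caught_as() {
    bool ok = false;
    try {
        throw Thrown(std::string("probe"));
    } catch (const Base& e) {
        ok = std::strcmp(e.what(), "probe") == 0;
    } catch (...) {
    }
    return ok;
}

// True if asking a string for more than max_size() elements
// is reported with length_error.
bool string_reports_length_error() {
    std::string s;
    bool ok = false;
    try {
        s.reserve(s.max_size() + 1);
    } catch (const std::length_error&) {
        ok = true;
    } catch (...) {
    }
    return ok;
}
```

Note the extra parentheses needed when such a template call appears inside the assert macro, because of the comma in the template argument list.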
Clause 20 General Utilities Library
Clause 20 defines the contents of the headers <utility>, <functional>, and <memory>. These are all headers from the Standard Template Library, developed at Hewlett-Packard Labs and included as a block in 1994. The only wart is template class auto_ptr, which was added to <memory> at a later date and which is wholly incompatible with STL containers. Clause 20 also defines a number of terms, such as less-than comparison and copy construction, that apply across all STL headers.
The header <utility> defines template class pair, which is used throughout STL. It also defines the obvious template version of operator!= in terms of operator==, and template versions of operator>, operator<=, and operator>= in terms of operator<. The latter impose a total ordering on the operand types, which is also a fairly obvious set of definitions. But these four operators are relegated to namespace std::rel_ops, which renders them largely useless.
The header <functional> defines a slew of template classes and template functions that help you concoct function objects for use with various STL algorithms and containers. And the header <memory> defines template class allocator, which is the default allocator for all STL containers, and several other template classes and functions for managing memory.
Clause 20 also contains vacuous remarks about the header <ctime>. The remarks are slightly more extensive than those about <ctime> in Clause 18, but they still add nothing to requirements already spelled out in other parts of the C++ Standard.
Some important tests are:
- whether template class pair supports all the necessary constructors and conversions
- whether the template operators are suitably defined in namespace std::rel_ops
- whether <functional> properly defines all the varied template classes and template functions for concocting function objects
- whether template class allocator meets the basic requirements for an allocator object
- whether template class auto_ptr supports all the necessary (and mildly unusual) constructors and conversions
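The first two tests can be illustrated with a short sketch. The type Version and the function names are hypothetical; the sketch leaves out auto_ptr and most of <functional>, which later revisions of the C++ Standard have changed considerably.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// A type that supplies only == and <; std::rel_ops must synthesize the rest.
struct Version {
    int hi, lo;
};
bool operator==(const Version& a, const Version& b) {
    return a.hi == b.hi && a.lo == b.lo;
}
bool operator<(const Version& a, const Version& b) {
    return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo);
}

// Returns the number of relational operators that gave a wrong answer.
int count_rel_ops_failures() {
    using namespace std::rel_ops;   // makes !=, >, <=, >= visible here
    Version a = {1, 2}, b = {1, 3};
    int failures = 0;
    if (!(a != b)) ++failures;
    if (!(b > a)) ++failures;
    if (!(a <= b)) ++failures;
    if (!(b >= a)) ++failures;
    if (a >= b) ++failures;
    return failures;
}

// pair must convert between compatible element types and compare
// lexicographically.
bool pair_behaves() {
    std::pair<short, float> narrow(3, 1.5f);
    std::pair<int, double> wide(narrow);          // converting constructor
    std::pair<int, double> made = std::make_pair(3, 1.5);
    return wide == made
        && std::make_pair(1, 9) < std::make_pair(2, 0)
        && std::plus<int>()(2, 3) == 5
        && std::less<int>()(wide.first, 4);
}
```

The using directive inside count_rel_ops_failures shows why the operators are awkward in practice: they take effect only where a program explicitly pulls them in.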
Clause 21 Strings Library
Clause 21 defines the contents of the header <string>. This header of course defines template class basic_string, which serves as the basis for the types string (an alias for basic_string<char>) and wstring (an alias for basic_string<wchar_t>). The header also defines template class char_traits and a couple of explicit specializations, for character types char and wchar_t. This template class supplies the default character traits for template class basic_string and for a number of the iostreams template classes defined in Clause 27.
Clause 21 also imposes additional requirements, beyond those of the C Standard, on the headers <cstring> and <cwchar>. Specifically, it replaces several function signatures in each header with pairs of signatures that offer improved type safety.
Finally, Clause 21 contains vacuous remarks about the headers <cctype>, <cwctype>, and <cstdlib>. They add nothing to requirements already spelled out in other parts of the C++ Standard.
Some important tests are:
- whether the explicit specializations char_traits<char> and char_traits<wchar_t> have proper behavior
- whether template class basic_string lets you supply an alternate form of character traits
- whether template class basic_string properly defines all its numerous conversions and searches
- whether the specializations basic_string<char> and basic_string<wchar_t> behave as expected
- whether it is possible to specialize template class basic_string for a character type other than char or wchar_t
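Supplying an alternate form of character traits is easiest to see with the well-known case-insensitive string. The sketch below is an illustration only; the names ci_char_traits and ci_string are hypothetical, and case folding through toupper assumes the "C" locale.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Character traits that compare characters without regard to case.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return 0;   // null pointer: not found
    }
};

typedef std::basic_string<char, ci_char_traits> ci_string;

// Comparison and searching should both go through the supplied traits.
bool ci_string_ignores_case() {
    return ci_string("Hello") == ci_string("HELLO")
        && ci_string("abc") < ci_string("ABD")
        && ci_string("Hello World").find('w') == 6
        && ci_string("abc").size() == 3;
}
```

A library that hard-wires char_traits<char> anywhere inside basic_string will typically fail one of these comparisons.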
Clause 22 Localization Library
Clause 22 defines the contents of the header <locale>. The header defines class locale, which encapsulates all the information traditionally associated with a Standard C locale, in an extensible format. Specifically, the information in each of the locale categories of Standard C is captured in one or more locale facets. A locale object is a non-mutable collection of references to facet objects. The header <locale> defines a number of template classes that specify two dozen or so standard facets. The programmer can derive new facets from specializations of these template classes, or introduce entirely new facets when constructing a locale object.
The template classes in Clause 22 parse numeric input and generate formatted numeric output for the iostreams classes. Other template classes do much the same for times and dates, and for monetary quantities. They also categorize characters, much like the traditional Standard C headers <ctype.h> and <wctype.h>. And they encapsulate rules for converting between byte streams and wide-character encodings, as when performing file input/output. Thus, Clause 22 is very large and touches on many aspects of input/output conversion and categorization.
Clause 22 also contains vacuous remarks about the header <clocale>. They add nothing to requirements already spelled out in other parts of the C++ Standard.
Some important tests are:
- whether multiple locale objects can indeed coexist
- whether all facets are implemented and behave as required
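Both tests can be hinted at with one small facet. In the sketch below, the class comma_punct and the helper functions are hypothetical names; the facet is derived from numpunct<char> and installed in a locale built on top of the classic "C" locale.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// A facet derived from numpunct<char> that uses a comma as decimal point.
struct comma_punct : std::numpunct<char> {
protected:
    virtual char do_decimal_point() const { return ','; }
};

// The locale object takes ownership of the facet allocated here.
std::locale make_comma_locale() {
    return std::locale(std::locale::classic(), new comma_punct);
}

// Formats value through a stream imbued with loc.
std::string format_with(const std::locale& loc, double value) {
    std::ostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}

// Parses a double through a stream imbued with loc.
double parse_with(const std::locale& loc, const std::string& text) {
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}
```

Formatting the same value through the classic locale and through the comma locale, in either order, should give "3.5" and "3,5" respectively, which shows that the two locale objects coexist without disturbing each other.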
Clause 23 Containers Library
Clause 23 defines the contents of the headers <bitset>, <deque>, <list>, <map>, <set>, <vector>, <queue>, and <stack>. All but the first header define containers or container adapters for the Standard Template Library. Clause 23 also presents a number of requirements that are common to various subgroups of the STL containers.
The header <bitset> defines template class bitset, which describes a fixed-length sequence of Boolean elements intended as a replacement for the integer mask variables of traditional C programming.
Some important tests are:
- whether all containers are implemented and behave as required
- whether all containers use allocator objects as required
- whether containers have the required time complexity
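Whether a container really uses its allocator can be probed with an allocator that counts. The sketch below is written in the style of the 1998 allocator requirements; the names counting_allocator, counted_vector, and counted_list are hypothetical, and the global counters make it suitable only for single-threaded tests.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

static long total_allocs = 0;   // calls to allocate
static long live_blocks = 0;    // allocations not yet deallocated

template <class T>
class counting_allocator {
public:
    typedef T value_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;
    template <class U> struct rebind { typedef counting_allocator<U> other; };

    counting_allocator() {}
    template <class U> counting_allocator(const counting_allocator<U>&) {}

    pointer address(reference x) const { return &x; }
    const_pointer address(const_reference x) const { return &x; }
    pointer allocate(size_type n, const void* = 0) {
        ++total_allocs;
        ++live_blocks;
        return static_cast<pointer>(::operator new(n * sizeof(T)));
    }
    void deallocate(pointer p, size_type) {
        --live_blocks;
        ::operator delete(p);
    }
    size_type max_size() const { return static_cast<size_type>(-1) / sizeof(T); }
    void construct(pointer p, const T& value) { new (static_cast<void*>(p)) T(value); }
    void destroy(pointer p) { p->~T(); }
};

// All counting allocators are interchangeable.
template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

typedef std::vector<int, counting_allocator<int> > counted_vector;
typedef std::list<int, counting_allocator<int> > counted_list;

// True if the container obtained storage through the allocator
// and had released all of it once destroyed.
template <class Container>
bool uses_allocator_and_releases() {
    total_allocs = 0;
    live_blocks = 0;
    {
        Container c;
        for (int i = 0; i < 100; ++i) c.push_back(i);
        if (total_allocs == 0 || live_blocks <= 0) return false;
    }
    return live_blocks == 0;
}
```

The list case is the more demanding one, because the container has to rebind the allocator to its internal node type before it can allocate anything.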
Clause 24 Iterators Library
Clause 24 defines the contents of the header <iterator>. It defines a number of template classes and template functions that aid in constructing and categorizing STL iterators. Clause 24 also presents a number of requirements that are common to various categories of the STL iterators.
Some important tests are:
- whether all iterator template classes are implemented and behave as required
- whether iterators can be categorized as required
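Categorization can be tested by overloading on the iterator tag types, as in this small sketch. The functions category_rank and rank_of are hypothetical helpers, and the numeric ranks are arbitrary labels.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <list>
#include <vector>

// One overload per iterator tag; overload resolution picks the exact match.
inline int category_rank(std::output_iterator_tag) { return 0; }
inline int category_rank(std::input_iterator_tag) { return 1; }
inline int category_rank(std::forward_iterator_tag) { return 2; }
inline int category_rank(std::bidirectional_iterator_tag) { return 3; }
inline int category_rank(std::random_access_iterator_tag) { return 4; }

// Reports the category that iterator_traits advertises for Iter.
template <class Iter>
int rank_of() {
    return category_rank(typename std::iterator_traits<Iter>::iterator_category());
}

// Distance between two list positions, computed through the library.
long steps_between_ends() {
    std::list<int> seq(5, 0);
    return static_cast<long>(std::distance(seq.begin(), seq.end()));
}
```

The same overloading trick is what the library itself relies on to choose, say, a constant-time or a linear-time version of an operation such as distance.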
Clause 25 Algorithms Library
Clause 25 defines the contents of the header <algorithm>. The header defines a slew of template functions for performing nearly all of the STL algorithms, which are operations on sequences delimited by iterators. It is thus a very large clause, but one with subclauses that are only loosely related.
Clause 25 also imposes additional requirements, beyond those of the C Standard, on the header <cstdlib>. Specifically, it replaces the lone function signatures for bsearch and qsort with pairs of signatures that offer improved flexibility in defining comparison functions.
Some important tests are:
- whether all algorithms are implemented and behave as required
- whether all algorithms use iterators of the appropriate categories as required
- whether algorithms have the required time complexity
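Complexity requirements are usually stated as a number of comparisons or assignments, so one way to test them is to count. The sketch below does this for lower_bound, which may make at most log2(N) + 1 comparisons, and for max_element, which makes exactly N - 1 for a nonempty sequence. The counting predicate and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A less-than predicate that counts how often it is called.
struct counting_less {
    long* count;
    explicit counting_less(long* c) : count(c) {}
    bool operator()(int a, int b) const {
        ++*count;
        return a < b;
    }
};

// Builds the sorted sequence 0, 1, ..., n-1.
std::vector<int> ascending(std::size_t n) {
    std::vector<int> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i);
    return v;
}

// Comparisons used to find key (assumed present) in n sorted elements,
// or -1 if lower_bound returned the wrong position.
long comparisons_in_lower_bound(std::size_t n, int key) {
    std::vector<int> v = ascending(n);
    long count = 0;
    std::vector<int>::iterator it =
        std::lower_bound(v.begin(), v.end(), key, counting_less(&count));
    return (it != v.end() && *it == key) ? count : -1;
}

// Comparisons used to find the largest of n elements (n > 0),
// or -1 if max_element returned the wrong position.
long comparisons_in_max_element(std::size_t n) {
    std::vector<int> v = ascending(n);
    long count = 0;
    std::vector<int>::iterator it =
        std::max_element(v.begin(), v.end(), counting_less(&count));
    return (it != v.end() && *it == static_cast<int>(n) - 1) ? count : -1;
}
```

For 1,024 elements the bound for lower_bound works out to 11 comparisons, whatever key is sought.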
Clause 26 Numerics Library
Clause 26 defines the contents of the headers <complex>, <valarray> and <numeric>. Clause 26 also presents a number of requirements that are common to the "value" types permissible as template parameters for template classes complex and valarray.
The header <complex> defines template class complex, for performing complex arithmetic, and explicit specializations for the three floating-point types.
The header <valarray> defines template class valarray, for performing numeric array operations that require various "slicing" operations.
The header <numeric> defines a handful of template functions for performing the numeric STL algorithms on sequences delimited by iterators.
Clause 26 also imposes additional requirements, beyond those of the C Standard, on the headers <cmath> and <cstdlib>. Specifically, it overloads a number of functions for arguments of additional basic types. All the functions declared in <math.h> must also be provided for arguments of type float and long double.
Some important tests are:
- whether all three explicit specializations of complex are implemented to the required precisions
- whether all of the indexing (slicing) operations on valarray behave as required
- whether all the numeric algorithms are implemented, use iterators of the appropriate categories, and have the required time complexity
- whether all three sets of <cmath> functions are implemented to the required precisions
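A few of these properties can be checked without any numerical analysis at all. The sketch below, with hypothetical function names, tests that the <cmath> overloads preserve the argument type, that a valarray slice selects the expected elements, and that simple complex and <numeric> operations give exact answers on small integers.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <numeric>
#include <valarray>

// Overloads that reveal which floating-point type an expression has.
inline int fp_kind(float) { return 1; }
inline int fp_kind(double) { return 2; }
inline int fp_kind(long double) { return 3; }

// Each <cmath> function should return the type of its argument.
bool cmath_overloads_preserve_type() {
    return fp_kind(std::sqrt(2.0f)) == 1
        && fp_kind(std::sqrt(2.0)) == 2
        && fp_kind(std::sqrt(2.0L)) == 3
        && fp_kind(std::fabs(-1.0f)) == 1;
}

// slice(start 1, length 3, stride 4) selects elements 1, 5, and 9.
int sum_of_slice() {
    std::valarray<int> v(10);
    for (std::size_t i = 0; i < v.size(); ++i) v[i] = static_cast<int>(i);
    std::valarray<int> picked(v[std::slice(1, 3, 4)]);
    return picked.size() == 3 ? picked.sum() : -1;
}

// i * i must be -1, and conj must negate only the imaginary part.
bool complex_behaves() {
    std::complex<double> i(0.0, 1.0);
    std::complex<double> z(3.0, 4.0);
    return i * i == std::complex<double>(-1.0, 0.0)
        && std::conj(z) == std::complex<double>(3.0, -4.0)
        && z.real() == 3.0 && z.imag() == 4.0;
}

// accumulate and inner_product over a plain array.
bool numeric_algorithms_behave() {
    int a[] = {1, 2, 3, 4};
    return std::accumulate(a, a + 4, 0) == 10
        && std::inner_product(a, a + 4, a, 0) == 30;
}
```

Testing precision proper takes far more effort, since it requires reference values known to be accurate for each of the three floating-point types.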
Clause 27 Input/Output Library
Clause 27 defines the "iostreams" classes based on the traditional C++ library classes for performing input/output. Specifically, it defines the contents of the headers <iosfwd>, <ios>, <istream>, <ostream>, <streambuf>, <iomanip>, <fstream>, <sstream>, and <iostream>. Clause 27 also presents a couple of requirements that are common to several of the iostreams classes.
Needless to say, this is a very large clause. It is half again larger than Clause 22 Locales, which edges out Clause 25 Algorithms as the second largest of the library clauses. And unlike these runners up, Clause 27 defines classes that are highly interrelated, among themselves and the features defined in Clause 22. Isolating features for separate testing is thus all the more difficult.
Clause 27 also contains vacuous remarks about the header <cstdio>. They add nothing to requirements already spelled out in other parts of the C++ Standard. Finally, Clause 27 threatens to say something about the headers <cstdlib> and <cwchar>, but never delivers on the threat.
Some important tests are:
- whether the basic machinery of template classes basic_ios and basic_streambuf behaves as required
- whether the derived classes for input/output to files (<fstream>) and strings (<sstream>) behave as required
- whether classes such as basic_istream<char> and basic_ostream<char> demonstrate the proper dependency on "imbued" locale objects when parsing and generating formatted text
- whether classes such as basic_fstream<wchar_t> demonstrate the proper dependency on imbued locale objects when converting to and from external byte streams
- whether file-positioning operations behave as required for text and binary files
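The first of these tests can be approached by deriving a trivial stream buffer and watching what an ostream does to it. The class collecting_buf below is a hypothetical example: it has no put area, so every character written arrives through overflow.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// An unbuffered stream buffer that appends every character to a string.
class collecting_buf : public std::streambuf {
public:
    collecting_buf() : calls_(0) {}
    const std::string& text() const { return text_; }
    long overflow_calls() const { return calls_; }
protected:
    virtual int_type overflow(int_type c) {
        ++calls_;
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            text_ += traits_type::to_char_type(c);
        return traits_type::not_eof(c);   // anything but eof signals success
    }
private:
    std::string text_;
    long calls_;
};

// Formats a few values through an ostream attached to the custom buffer.
std::string format_through_custom_buffer() {
    collecting_buf buf;
    std::ostream out(&buf);
    out << "x=" << 42 << ' ' << std::hex << 255;
    return out.good() ? buf.text() : std::string("<stream failed>");
}
```

If basic_ios and basic_streambuf cooperate as required, the formatted text lands in the string unchanged and the stream stays in a good state.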
Annex D Compatibility Features
Annex D defines a number of features that are deprecated: still part of the C++ Standard but candidates for future removal. These include a few language features plus several library features:
- the use of the Standard C headers, such as <stdio.h> (You are encouraged to use <cstdio> instead.)
- a handful of traditional type definitions, such as ios::io_state (You are encouraged to use ios_base::iostate instead.)
- the header <strstream> (You are encouraged to use the header <sstream> instead.)
Some important tests are:
- whether all Standard C headers are present
- whether names are defined in the appropriate namespaces among the Standard C and Standard C++ headers
- whether function overloading can distinguish pairs of types such as ios::io_state and ios_base::iostate
- whether the derived classes for input/output to character buffers (<strstream>) behave as required
In Practice
Performing all the tests outlined above requires tens of thousands of lines of code, distributed over a couple of thousand tests. Please note that such conformance tests are still merely "anecdotal": they make no attempt to be exhaustive. Instead, they rely on the testing of representative, or critical, cases to show up likely failures. Lots of other forms of testing can also be appropriate:
- If your goal is, say, to hunt down marginal bugs in output formatting functions, you might want to supplement such tests with a test generator that tries many variations on a theme. (This is a common technique when testing expressions in the language part of a validation suite.)
- If your goal is, say, to unearth performance problems and memory leaks under extreme conditions, you might want to add stress tests. These are particularly useful for exercising the STL algorithms and containers.
- If your goal is to check for strict conformance to the C++ Standard, you might want to add deviance tests: tests that should fail to compile, or that should throw an exception at runtime that is not strictly necessary. A strict implementation can be useful when developing code intended to be highly portable.
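The first of these ideas, a test generator for output formatting, might be sketched as follows. The sketch is an illustration with hypothetical function names: it sweeps a few integer values across widths and flags, and compares what an ostream produces against what sprintf produces for the equivalent conversion specification.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Formats value with an ostream, in the classic locale.
std::string via_iostream(int value, int width, bool left, bool showpos) {
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out.setf(left ? std::ios_base::left : std::ios_base::right,
             std::ios_base::adjustfield);
    if (showpos) out.setf(std::ios_base::showpos);
    out << std::setw(width) << value;
    return out.str();
}

// Formats value with sprintf, using the matching flags.
std::string via_printf(int value, int width, bool left, bool showpos) {
    const char* fmt = left ? (showpos ? "%-+*d" : "%-*d")
                           : (showpos ? "%+*d" : "%*d");
    char buf[64];   // ample for the widths generated below
    std::sprintf(buf, fmt, width, value);
    return buf;
}

// Generates every combination and returns the number of disagreements.
int count_format_mismatches() {
    static const int values[] = {0, 1, -1, 42, -42, 32767, -32767};
    const int n_values = static_cast<int>(sizeof values / sizeof values[0]);
    int mismatches = 0;
    for (int i = 0; i < n_values; ++i)
        for (int width = 0; width <= 8; ++width)
            for (int left = 0; left < 2; ++left)
                for (int showpos = 0; showpos < 2; ++showpos)
                    if (via_iostream(values[i], width, left != 0, showpos != 0)
                        != via_printf(values[i], width, left != 0, showpos != 0))
                        ++mismatches;
    return mismatches;
}
```

The sweep produces 252 cases from a dozen lines of loop, which is the attraction of a generator over hand-written cases. It checks only that the two libraries agree, so it supplements fixed-expectation tests but does not replace them.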
We at Dinkumware are continually on the lookout for such useful tests to supplement what we already use. We write such tests, or commission them, when the need becomes apparent. But mostly, we've had good success with just the couple of thousand conformance tests in the Dinkum C++ Proofer.
I should point out, before I quit, that the Dinkum C++ Library, in conjunction with the Dinkum C Library, is the only library we've found so far that has successfully passed all the tests in the Proofer. For the past two years, the C++ front end from Edison Design Group has been complete enough to compile a fully conforming Standard C++ library. Since the EDG front end is widely used, that means many compilers are capable of full library support today. Of course, several gray areas remain in the interpretation of the C++ Standard, so there's always room to quibble about what full conformance means. But, as a practical matter, any past worries about whether the Standard C++ library is fully implementable have been laid to rest.
As the mathematicians would say, a solution exists.
P.J. Plauger is Senior Editor of C/C++ Users Journal and President of Dinkumware, Ltd. He is the author of the Standard C++ Library shipped with Microsoft's Visual C++, v5.0. For eight years, he served as convener of the ISO C standards committee, WG14. He remains active on the C++ committee, J16. His latest books are The Draft Standard C++ Library, Programming on Purpose (three volumes), and Standard C (with Jim Brodie), all published by Prentice-Hall. You can reach him at pjp@plauger.com.