Helene Ballay is studying computer science and electrical engineering at the Ecole Nationale Superieure des Telecommunications de Bretagne in France and has a programming background in Pascal, Ada, and C. She may be reached at ENST Bretagne ZA, BP832, F-29285 Brest Cedex, e-mail: ballay@enst.enst-bretagne.fr.
Rainer Storn has studied electrical engineering and holds a PhD from the University of Stuttgart, Germany. He is currently leading a group at Siemens AG, Germany which is working on the analysis, design, and implementation of communication systems. His company address is Siemens AG, ZFE ST SN 26, D-81739 Muenchen, Otto-Hahn-Ring 6. He can be reached via rainer.storn@2fe.siemens.de or storn@icsi.berkeley.edu.
Introduction
It is vital for the success of a software project that all programmers adhere to a certain programming discipline to foster efficient communication between people, prevent errors, and hence save valuable project time. As one means of achieving this for a C software project, virtually every company sets up its own coding conventions. These conventions are supposed to somehow tame C's loose typing, its latitude with pointers, and its high degree of orthogonality. The last item means that each language structure can be combined with nearly every other language structure to produce a meaningful and valid statement.
The coding conventions from different companies and book authors are mostly the same. And a lot of those conventions lend themselves to automatic checking, a welcome relief for software reviewers who can henceforth concentrate on other important issues during code review. That is why we decided to write a tool for checking C coding conventions and make it publicly available. [Available on this month's code disk mb]
The C coding conventions that you will find in this paper do not form a complete set, but show only the subset which is easily checkable by a tool. A C convention will be indicated in the text by the notation "C<number>)" to separate it from the explanatory text. For example, C1) will indicate the first C convention, C2) the second one, and so on. For most of the conventions, we add a short explanation of why we think they are reasonable.
Including "stdtyp.h"
Listing 1 shows the file "stdtyp.h". You should include this in every C source file. It provides replacements for a group of forbidden keywords, which the tool reports on. The following conventions are checked:
C1) The logical operators && and || shall be replaced by AND and OR to avoid confusion with the bitwise operators & and |.
C2) The comparison for equality == shall be replaced by EQ to avoid confusion with the assignment operator =; the comparison for inequality != shall be replaced by NEQ for homogeneity.
Consider the following example to understand the reason behind conventions C1) and C2):
    while (c = ' ' || c == '\t' | c == '\n')
    {
       c = getc(f);
    }
Two peculiarities can be found in this example which the programmer probably didn't intend but which the compiler will not complain about because of C's orthogonality: the assignment = stands where the comparison == was meant, and the bitwise | stands where the logical || was meant. These peculiarities wouldn't have shown up if you had stuck to the conventions and written
    while ((c EQ ' ') OR (c EQ '\t') OR (c EQ '\n'))
    {
       c = getc(f);
    }
As an aside, note that parentheses have been used lavishly in this example in order to provide a more readable layout of the code. The parenthesizing, however, will not be enforced by the tool.
C3) The keywords extern and static shall be replaced by IMPORT and LOCAL to obtain homogeneity. There is also a keyword EXPORT (see "stdtyp.h" for usage).
C4) The type names long, short, char, and int shall be replaced by the types defined in "stdtyp.h" (LONG, ULONG, SHORT, USHORT, CHAR, UCHAR, and BIT-FIELD) to facilitate portability between different processors and compilers.
C5) The name l (letter "l") for a variable shall be avoided because it is too easily confused with 1 (number one).
C6) The goto statement shall be replaced by other statements, such as break, return, and continue.
C7) C's shorthand notation (+=, -=, *=, /=) shall be replaced by an explicit notation because it is easier to read.
As a C expert you might not agree that, for example,
    sum = sum + a*b;
is easier to read than
    sum += a*b;
In large projects, however, it is often the case that not all programmers are C experts. Pascal and FORTRAN veterans who have to adopt C generally find the shorthand notation awkward to read, and have the feeling that it hampers their understanding of a program. Hence, for the sake of team-oriented program development, this convention should be acceptable to C experts as well.
C8) The notation =- may be misunderstood by a C compiler's "greedy" parsing algorithm, so use = -(.....) instead.
The classic example by Ward, which shows what a missing space character can do and which justifies the above convention, is the code fragment
    k=-(joe(14)+12*BYTES);
which can be misinterpreted by a compiler's greedy parser, if it maintains compatibility with very old C conventions, as
    k = k - (joe(14)+12*BYTES);
although the programmer intended to write
    k = -(joe(14)+12*BYTES);
C9) Pointer to pointer shall be replaced by the index notation.
This convention is again a concession to non-C experts, who find it much easier to read and write code which says, for example,
    j = twodim[2][7];
instead of
    j = *(*(twodim+2)+7);
Even C experts are starting to adopt the array notation as more compilers treat the array notation as efficiently as the pointer notation.
C10) Names starting with an underscore or names that contain a double underscore are reserved and shouldn't be used.
C11) Tabs shall be replaced by spaces, as different editors often exhibit different tab settings. Using tabs can result in an odd-looking source code layout when switching from one editor to another. The layout of the source code, however, is very important for ease of readability.
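Listing 1 is not reproduced in this text, so the following is only a sketch of the kind of definitions "stdtyp.h" might provide for C1) through C3). The macro bodies are assumptions, and plain int is used because the type names of C4) are not shown here. The helpers is_blank, faulty_condition, and same_element are hypothetical and merely illustrate the examples given for C1), C2), and C9).

```c
/* Assumed definitions; the real "stdtyp.h" (Listing 1) may differ. */
#define AND    &&
#define OR     ||
#define EQ     ==
#define NEQ    !=
#define LOCAL  static
#define IMPORT extern
#define EXPORT               /* assumed to expand to nothing */

/* Hypothetical helper: the condition written according to C1) and C2). */
LOCAL int is_blank(int c)
{
   return ((c EQ ' ') OR (c EQ '\t') OR (c EQ '\n'));
}

/* Hypothetical helper: the faulty condition from the text.             */
/* The = stores the value of the whole right-hand expression in c.      */
LOCAL int faulty_condition(int c)
{
   c = ' ' || c == '\t' | c == '\n';
   return c;
}

/* Hypothetical helper: index and pointer notation name the same element. */
LOCAL int same_element(void)
{
   int twodim[3][8] = {{0}};

   twodim[2][7] = 42;
   return (twodim[2][7] EQ *(*(twodim+2)+7));
}
```

With these definitions, faulty_condition returns 1 for any argument, since ' ' is nonzero and the right-hand side of the assignment is therefore always true, whereas is_blank returns 1 only for a blank, a tab, or a newline.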
Right Places
Another group of coding conventions deals with the proper placement of C statements.
C12) typedef, EXPORT, IMPORT, LOCAL, auto, register, and volatile shall be placed at the beginning of the declaration.
C13) Braces, { and }, shall be placed on a separate line. Exceptions are enum declarations and do-while loops. This convention is meant to foster the Allman style bracing layout or one of its variations. This style seems to offer the clearest visual source code structuring. It also helps during program development when you adopt the habit of first opening and closing a brace before you enter the first statement inside a conditional clause or a loop. It is good style to begin, for example, with
    if (count > 0)
    {
       .....
    }
    else
    {
       .....
    }
before you enter the statements belonging to the if or else branch. This remains true even if there is only a single statement in each branch. Quite often additional statements have to be added later, and also quite often programmers forget to add the necessary braces. Hence it is good practice to type the braces first.
C14) The ; should be at the end of a line, i.e., there shall be no more than one statement on a single line. Of course, for loops constitute an exception to this convention, which is meant to increase the readability of the source code.
Naming Conventions
Naming conventions are very important in a software project. They increase the readability of the source code, and good readability prevents a lot of errors. Naming conventions also facilitate communication among programmers: you understand each other's code more quickly. Unfortunately, most naming conventions are project or application specific. Nevertheless, you might find the conventions C15) and C16) useful.
C15) Names of structures should all begin with S_, names of enumerations with E_, names of unions with U_, names of constants with C_ (except hexadecimal constants, which should start with M_ because those are assumed to serve as bit masks), and names of typedefs with T_.
C16) A pointer name shall have the suffix _ptr. The tool checks pointers to the types defined in "stdtyp.h", pointers to functions, and pointers to struct, typedef, union, and enum (but only if their names follow the C convention C15)). It does not check names of pointers to a type defined in the define section of the C module.
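As an illustration of C15) and C16), the following sketch applies the prefixes and the _ptr suffix to a few made-up declarations. All names are hypothetical, and plain C types and operators are used so that the fragment compiles without "stdtyp.h".

```c
#include <stddef.h>

#define C_MAX_NAME  32                  /* constant: C_ prefix             */
#define M_READY     0x01                /* hexadecimal bit mask: M_ prefix */

typedef unsigned char T_FLAGS;          /* typedef: T_ prefix              */

enum E_COLOR { RED, GREEN, BLUE };      /* enumeration: E_ prefix          */

struct S_ITEM                           /* structure: S_ prefix            */
{
   char         name[C_MAX_NAME];
   enum E_COLOR color;
   T_FLAGS      flags;
};

union U_VALUE                           /* union: U_ prefix                */
{
   long   as_long;
   double as_double;
};

/* Hypothetical helper: the pointer parameter carries the _ptr suffix. */
static int item_is_ready(const struct S_ITEM *item_ptr)
{
   if (item_ptr == NULL)
   {
      return 0;
   }
   return ((item_ptr->flags & M_READY) != 0);
}
```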
Switch selection
C17) There shall be no program code between the switch statement and the first case label.
C18) There shall be a break in every case selection of a switch statement. If it is necessary to deviate from this rule, extensive comments are mandatory (the tool always prints a warning if a break is missing).
C19) Every switch selection should exhibit a default label to encourage the programmer to deal also with the error case.
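A switch statement that follows C17) through C19) might look like the following sketch. The function command_code and its return values are made up for illustration.

```c
/* Hypothetical example: map a command character to a numeric code. */
static int command_code(char cmd)
{
   int code;

   switch (cmd)
   {                      /* C17: no code between switch and first case */
      case 'r':
         code = 1;
         break;           /* C18: every case selection ends with break  */
      case 'w':
         code = 2;
         break;
      default:            /* C19: the error case is handled explicitly  */
         code = -1;
         break;
   }
   return code;
}
```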
Miscellaneous
C20) Implicit tests for zero shall not be used, neither in normal code nor in preprocessor commands. The keywords that the tool is searching for are if, while, and elsif.
The reason behind C20) is that it is more difficult to understand a statement like
    while (func(p,q) && *in_ptr)
    {
       .....
    }
than a statement which looks like this:
    while ((func(p,q) NEQ 0) AND (*in_ptr NEQ 0))
    {
       .....
    }
As a side effect of this convention you will find that the tool sometimes detects typos if you have written an incomplete control flow statement which, by means of the typo, has changed into an implicit zero test statement.
C21) Assignments in control flow statements shall be avoided.
This convention is one of those that many C programmers have objections to because it deprives you of a part of C's power. However, assignments in control flow statements often produce precedence errors and can render the source code awkward to read. Consider, for example, the code fragment which was basically taken from Koenig:
    while (c = getc(in) NEQ EOF) put(c,out);
Can you spot the error? The assignment operator has lower precedence than the comparison operator, so c receives the result of the comparison rather than the character that was read, and this fragment will not do what was intended.
C22) Pointer value zero shall be implemented via the pointer constant NULL.
C23) When using functions returning status information (such as malloc, calloc, realloc, fopen, etc.), the return values shall be checked for validity before the result is used. This test shall be done with an if instruction.
For example:
    fin_ptr = fopen("INPUT.DAT","r");
    if (fin_ptr EQ NULL)
    {
       printf("\nCannot open INPUT.DAT");
       exit(1);
    }
The tool exhibits a weakness here. No warning is issued if the if statement found just after the call of the function does not concern the returned value.
C24) The address of a variable and the variable itself should not be used in the same statement, because the variable could be altered in an unpredictable manner, as there is usually no prescribed order of evaluation.
For example:
    a = b + func(&b);      /* order of evaluating b is not     */
                           /* prescribed, so this statement    */
                           /* might fail to serve its purpose. */
    help = b;              /* That's better.                   */
    a = b + func(&help);
C25) An enumeration type shall not be defined within a structure. It should be defined outside to make the code easier to read.
C26) The ? : operator should be avoided, and constructs like if.....else shall be used instead to enhance readability of the source code.
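To make the precedence problem behind C21) concrete, the sketch below first shows what the faulty condition actually stores in c, and then one possible rewrite that keeps the assignment out of the control flow statement. Both function names are hypothetical, and plain C operators are used so that the fragment compiles without "stdtyp.h".

```c
#include <stdio.h>

/* What the faulty condition stores: the result of the comparison. */
static int faulty_store(int next_char)
{
   int c;

   c = next_char != EOF;      /* parsed as c = (next_char != EOF) */
   return c;
}

/* Following C21): assign first, then compare in a separate step. */
static int count_chars(FILE *in_ptr)
{
   int c;
   int count = 0;

   c = getc(in_ptr);
   while (c != EOF)
   {
      count = count + 1;
      c = getc(in_ptr);
   }
   return count;
}
```

faulty_store yields 1 for every character other than EOF, which is why the loop in the Koenig fragment would write the value 1 instead of the characters read.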
The Tool cck
A tool called cck is provided with this month's code disk, and from the online sources listed on page 5. (It's much too large to reprint here.) cck is not protected by copyright and is in the public domain. The volume includes full source code, called cck.c, which is of course written according to the C conventions above. To use cck you must have stdtyp.h available as an include file. stdtyp.h is also provided along with cck. To use the tool, type:
    cck <input-file> <output-file>
The input file is your C program and the output file is the file where cck will print its results. Do not use the suffix .c for your output file. To avoid overwriting important files, an output file with the suffix .c stops cck. If the output file already exists, cck asks you if you really want to overwrite it.
The first part of cck.c is based on cc.c (CUG 152.13) written by I. Jennings and David N. Smith. It counts opened braces, brackets, comments, and double quotes. It writes each line of the program being checked, preceded by the line number, the number of opened braces, brackets, comments, and double quotes into the output file. This helps locate where punctuation is missing or where extraneous punctuation has been added (see Listing 4). If there are unbalanced braces, brackets, double quotes, or comments, an appropriate message is printed in the output file, the output file is closed, and cck ends.
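The idea of counting opened punctuation can be sketched as follows. This is not code from cck.c or cc.c; it is a simplified, hypothetical helper that tracks braces only and does not skip comments or string literals.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of braces still open at the */
/* end of the text, or -1 as soon as a closing brace appears without   */
/* a matching opening brace.                                           */
static int open_braces(const char *text_ptr)
{
   int    depth = 0;
   size_t i;

   for (i = 0; text_ptr[i] != '\0'; i = i + 1)
   {
      if (text_ptr[i] == '{')
      {
         depth = depth + 1;
      }
      else if (text_ptr[i] == '}')
      {
         depth = depth - 1;
         if (depth < 0)
         {
            return -1;
         }
      }
   }
   return depth;
}
```

A result of 0 means the braces are balanced; any other result points to missing or extraneous punctuation.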
If there are not enough comments in a function, a message will be printed to the output file showing the actual number of comment lines along with the desirable number. The ratio of comment lines to non-comment lines of code should be at least 20%. This ratio is determined between the opening and closing braces of every function in the source code. The value of 20% is rather low because it is assumed that an explanatory function header will be provided between the function name and the opening brace.
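The 20% rule can be expressed with integer arithmetic, as in the sketch below. The function names are hypothetical, and how cck itself counts lines or rounds the desirable number is not described here, so rounding up is an assumption.

```c
/* Hypothetical helper: 1 if the comment lines amount to at least 20% */
/* of the non-comment lines, 0 otherwise.                             */
static int enough_comments(long comment_lines, long code_lines)
{
   /* Integer form of comment_lines / code_lines >= 0.20 */
   return ((comment_lines * 5) >= code_lines);
}

/* Hypothetical helper: smallest number of comment lines that meets */
/* the 20% ratio (rounded up; the rounding is an assumption).       */
static long desirable_comments(long code_lines)
{
   return ((code_lines + 4) / 5);
}
```

For a function body with 10 non-comment lines, for example, 2 comment lines are enough, while 11 non-comment lines call for 3.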
If everything is correct, the output file will be overwritten and no listing will be printed.
The second part of the program does some parsing and prints warnings (see Listing 3). The parsing is based on grep.c (CUG 152.08) written by David N. Smith. cck checks the C coding conventions listed above and writes a warning for each violation into the output file.
Example and Conclusion
An example shows best what cck is doing. Listing 2 contains a small C program which doesn't look overly strange. The checker cck, however, complains about many things, as you can see in Listing 3. Listing 4 finally shows a program version which satisfies cck. Of course cck is no substitute for error-detecting programs like lint, but it isn't meant to be. cck helps to encourage all programmers in a development team to adhere to a given programming discipline and style. If this can be achieved, you have made an important step towards success in your software project.
References
[1] Storn, R., "Coding Conventions for C SW-Projects," Siemens AG, zfe st ack 21, Mar. 1993, company internal paper.
[2] Ward, R., Debugging C, Que Corporation, 1986.
[3] Koenig, A., C Traps and Pitfalls, Addison-Wesley, 1989.
[4] Straker, D., C-Style Standards and Guidelines, Prentice-Hall, 1992.
[5] Plum, T., C Programming Guidelines, 2nd Edition, Plum Hall.
Sidebar: "A Matter of Style"