Columns


Questions & Answers

Readability, Portability, And Coding Style

Ken Pugh


Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C language courses for corporations. He is the author of C Language for Programmers and All On C, and is a member of the ANSI C committee. He also does custom C programming for communications, graphics, and image databases. His address is 4201 University Dr., Suite 102, Durham, NC 27707.

Q

I would appreciate your comments on the following questions and problems:

1. Type char: signed or unsigned?

Most compilers consider chars as signed by default. We, European users, make extensive use of ASCII codes above 127 and the signed chars default does not seem to be the best choice. Which mode, in your opinion, is "better"? Why are constant chars considered as ints? The following:

char c = 'é';
if (c == 'é')
will work only if default char is unsigned. Otherwise, a cast to (char) is necessary to get the program to work, yet the constant é is clearly a char, not an int.

2. Good use or abuse of #defines and typedefs?

What do you think of the current practice of using #define or typedef to rename native C types, such as char to BYTE, unsigned char to BYTE or UBYTE, char * to TEXT, int to COUNT, int to BOOL, etc.?

Is there really a reason for this (except (sometimes!) for portability, of course)? There are no such things (as far as I know) in the standard library header files! Moreover, when strictly prototyped programs are compiled, the result is generally a long list of type mismatch errors (often pointer mismatches between (char *) and (unsigned char *)).

3. New C programming style

What do you think of the 'new' (?) C style programming, à la PASCAL, with (long) identifiers mixing lowercase and uppercase and banishing the underscore?

Thanks for your opinion and sincerely yours,

Hubert Toullec
Angers, France

A

In the ANSI C committee meetings, there was considerable discussion as to whether a particular feature of the language should be made right or whether backward compatibility should be preserved, to avoid "breaking" existing programs that used documented features of the language. If George Burns (in "Oh God") remade the world from scratch, he "would make the avocados with smaller seeds"; judging from the committee's discussion of this topic, remaking C is much more complex.

Several features were left unchanged for the sake of backward compatibility, including the precedence of the operators (even though some of the bitwise operators could be used more comfortably if their precedence were modified).

Similarly, the type of plain chars was specifically left unchanged and thus remains unspecified (i.e. not specifically typed as signed or unsigned). I agree with you that unsigned chars are more useful. I sometimes use the char type to hold small integer values, but they are usually non-negative integers.

Values of type char have been converted to int in expressions since the early days of the language. That eliminates the need for separate rules for character arithmetic. Character constants should be treated the same way (signed or unsigned) as character variables. Note that standard ASCII includes only seven-bit characters, so none of its values has the high-order bit set. The C language does not specify that programs must run if you include non-ASCII characters. (Actually it specifies exactly which source characters are acceptable, but that is basically the ASCII set.) With your example,

char c = 'é';
if (c == 'é')
you have used a character that is not specified as being standard. The compiler is not even obliged to compile the code. If you used an octal or hexadecimal escape sequence to represent the character, the compiler would treat it as a regular character constant. I compiled the program in Listing 1 with Quick-C and ran it, with one unexpected result. The results were:

Unequal -118 138
(char) Equal -118 -118
Hex Equal -118 -118
Hex (char) Equal -118 -118
Notice that the compiler treated both the char variable and the char constant as signed. However, it treated the non-standard character as a regular integer value. Some compilers provide a compile-time switch that controls whether plain char variables are treated as signed or unsigned. You might try using one that has such a switch.

On your next question, I am strongly in favor of using typedefs to define logical data types. Using typedefs is preferable to using #defines for consistency's sake, as there are many types which cannot be described in terms of a #define.

Declaring variables with typedefs captures a significant amount of information for the maintenance programmer. Unfortunately the C standard, in my opinion, does not go far enough in checking the use of typedefs. My favorite illustration is:

typedef double SPEED;
typedef double TIME;
typedef double DISTANCE;
SPEED compute_speed(time, distance)
TIME time;
DISTANCE distance;
     {
     SPEED speed;
     /* speed is distance divided by time; guard against a zero time */
     if (time != 0.0)
         speed = distance / time;
     else
         speed = 0.0;
     return speed;
     }
and in another program:

SPEED car_speed;
TIME car_time;
DISTANCE car_distance;
   car_speed = compute_speed(car_time, car_distance);   /* correct order */
   car_speed = compute_speed(car_distance, car_time);   /* swapped, yet accepted */
Under the ANSI standard, both of these function calls are accepted, because a typedef only names a synonym for an existing type; logically, however, the second is erroneous. Some super lint or the compiler itself may one day use the typedef information for error checking.

I agree that there is a problem with the type checking performed when comparing or assigning unsigned char pointers and regular char pointers. This problem is most irritating when it forces you to replace a declaration such as:

unsigned char *string = "ABC";
with one that includes a cast:

unsigned char *string = (unsigned char *) "ABC";
The ANSI committee debated whether it would be okay to not require such a cast in an initialization statement, but decided that consistency in typing was more important.

Of course, I strongly urge using full names for the type names, e.g. BOOLEAN instead of BOOL, etc.

On your final question, I am in favor of readable and meaningful variable and function names. Some people may have heard of studies that conclude otherwise, but ALongVariableName appears less readable to me than a_long_variable_name. The latter appears closer to what you would expect to read in normal text.

How much you should use abbreviations in naming is an open issue. The more abbreviations you use, the more you will have to remember and the more the maintenance programmer will have to infer and comprehend when reading the program. For example, XMT for transmit and TX for transaction may be common, but does CMP stand for compare or compute?

Q

I am developing a simulation program for the study of our company's manufacturing plant, using a C compiler on an IBM PC/AT.

I would be grateful if you could send information on software tools in C for incorporating graphics in the program.

P.K. Gupta
Gujarat, India

A

The only package with which I personally have extensive experience is Essential Graphics by South Mountain Software, Inc., 76 So. Orange Avenue, South Orange, NJ 07079 (201) 762-6965 ($299 list, $230 street). You can distribute products built with Essential Graphics royalty-free. You can use either direct coordinates (your x,y values specify an exact pixel location) or world coordinates (your x,y values are transformed into a pixel location), the latter at some cost in speed.

The names in this package are somewhat unintelligible, since the developers tried to keep every name within eight characters. For example: grbx draws a box, grwx draws an x at a point, hsrect draws a rectangle with a hatch style and a label. As I mentioned above, I would prefer something like graph_box, graph_write_x, and hatch_rectangle_with_label.

Essential Graphics also supports loading and saving PC Paintbrush .PCX files.

There are several other packages on the market, including Halo Graphics and Advantage Graphics. Perhaps some of our readers may have comments on these or other packages.

Reader Responses:

Commodore 128

In the May 1989 issue of The C Users Journal, I took note of the questions by Mr. David Ockrassa regarding printing special characters such as the braces, vertical bar, and tilde on the Commodore 128. Before I started programming the Amiga in C, I dealt with the same problem.

The problem is two-fold in nature. Because these characters are not in the standard font set of the Commodore 128, the C language packages for that machine generally include an editor that redefines several character bitmaps to supply the missing ones. These are saved with the file as non-ASCII bytes. The problem occurs when the file is printed, because the printer's font set may or may not assign the same characters to those byte values.

The solution is to write a small printer utility in C. The accompanying code (Listing 2) accomplishes this task, and is available on most commercial bulletin boards. I wrote several printer drivers of this type for the Commodore 128, for use with different printers, that have a few more features than the included code, such as pagination and filename/date headers.

John D. Clark
St. Louis, MO

MS Dynamic Data Exchange

This letter is in response to Ken Libert's request for material concerning MS Dynamic Data Exchange.

If you contact Microsoft's product support services and ask for Windows Software Development Kit support, you can request their Application Notes concerning Dynamic Data Exchange. With this publication you get a disk complete with examples and source.

The DDEAPP example allows you to initiate a session with Excel and actually exchange cell data in multiple formats.

Tim Kuntz
University of Pittsburgh