Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C and C++ language courses for corporations. He is the author of All on C, C for COBOL Programmers, and UNIX for MS-DOS Users, and was a member of the ANSI C committee. He also does custom C/C++ programming and provides SystemArchitectonics(SM) services. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 489-5239. Ken also receives email at kpugh@allen.com (Internet) and on Compuserve 70125,1142.
To be or not to be
Q
This may seem like a picky little detail, but I've been having some discussions about your column with my colleagues as to whether a function should be named to return true or false. For example, should you name a function that describes the state of a stack as is_empty or is_not_empty? If the former, you may find yourself always testing it with !is_empty().
Kevin Shearson
A
Isn't it funny how the little details seem to stir up the biggest battles? We programmers have probably spent more time and energy arguing over proper indentation, placement of braces, and capitalization/underscoring than in actual coding. At least we can switch indentation and brace placement around fairly quickly to suit the tastes of particular readers. Naming conventions are not simple to change and are therefore the subject of religious wars.
You can follow two rules in cases like this. The first is to use whichever form seems the most likely to require no additional operators (nominally the !). This is the law of "less typing is better coding." The other rule is to simply be consistent with function naming everywhere. If you follow this rule, a user of your code can always count on calling the "right" function, without having to look it up each time.
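To make the trade-off concrete, here is a minimal sketch of the two spellings side by side. The IntStack class and the two summing functions are hypothetical names invented for this illustration; only is_empty and is_not_empty come from the question.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stack used only to illustrate the naming question.
class IntStack {
public:
    void push(int value) { items_.push_back(value); }

    // Removes and returns the top element.
    // The caller is expected to check that the stack is not empty first.
    int pop() {
        int top = items_.back();
        items_.pop_back();
        return top;
    }

    // Predicate named for the positive state.
    bool is_empty() const { return items_.empty(); }

    // Complement, provided here only so both spellings can be compared.
    bool is_not_empty() const { return !is_empty(); }

private:
    std::vector<int> items_;
};

// Call site written against is_empty(): needs the ! operator.
int sum_with_negation(IntStack stack) {
    int total = 0;
    while (!stack.is_empty()) {
        total += stack.pop();
    }
    return total;
}

// Same loop written against is_not_empty(): no extra operator.
int sum_without_negation(IntStack stack) {
    int total = 0;
    while (stack.is_not_empty()) {
        total += stack.pop();
    }
    return total;
}
```

Both loops do the same work. The first rule favors whichever spelling most of your call sites would use without the !, while the second rule says to pick one spelling and use it for every such predicate.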
I'm not going to tell you which style to follow, but I do have a strong opinion on a related matter: the use of Booleans as parameters. I've come across many cases in which I'm compelled to replace a Boolean parameter with an enumeration or with multiple function calls. For example, consider one vendor's implementation of a member function UpdateData, belonging to a class Dialog. If you create an instance of Dialog, as in
    Dialog my_dialog;
then this implementation lets you store data into my_dialog with
    my_dialog.UpdateData(FALSE);
From inside the dialog class, you can retrieve data with
    UpdateData(TRUE);
Unfortunately, the default for the parameter is TRUE, implying that retrieval occurs more frequently than storage. So this default works well with calls from inside the class (which are predominantly retrievals) and poorly with calls from outside (which are predominantly stores). It so happens that calls from outside are more common. I would rather have two functions such as:
    my_dialog.StoreData();
    RetrieveData();
which would be at least more readable, if not far more understandable.
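Here is a rough sketch of how the two named functions, or an enumeration, might sit alongside the Boolean version. This Dialog class is a hypothetical stand-in, not the vendor's actual class. The Direction enumeration, TransferData, the two string members, and the accessor functions are all assumptions made for the illustration, as is the guess that "store" copies a member value into the displayed field and "retrieve" copies it back.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the vendor's class; not the real library API.
class Dialog {
public:
    // Direction of transfer spelled out instead of a bare TRUE/FALSE.
    enum Direction { Store, Retrieve };

    // Boolean style, modeled on the call described in the text
    // (the default of true means "retrieve").
    void UpdateData(bool retrieve = true) {
        TransferData(retrieve ? Retrieve : Store);
    }

    // Enumeration style: the call site names the direction.
    void TransferData(Direction direction) {
        if (direction == Store) {
            shown_text_ = value_;   // assumed meaning: member -> displayed field
        } else {
            value_ = shown_text_;   // assumed meaning: displayed field -> member
        }
    }

    // Two-function style: no argument to remember at all.
    void StoreData() { TransferData(Store); }
    void RetrieveData() { TransferData(Retrieve); }

    // Accessors that exist only so the sketch can be exercised.
    void set_value(const std::string& value) { value_ = value; }
    const std::string& value() const { return value_; }
    void type_into_field(const std::string& text) { shown_text_ = text; }
    const std::string& shown_text() const { return shown_text_; }

private:
    std::string value_;       // data held by the program
    std::string shown_text_;  // data as displayed in the dialog
};
```

With this in place, my_dialog.StoreData() and my_dialog.TransferData(Dialog::Store) both say what they do at the call site, whereas my_dialog.UpdateData(false) sends the reader to the documentation.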
Old new stuff
Q
In the April issue of CUJ you asked for reader comments on the following C++ statements:
    char *pc5 = new (char *)[n];
    char *pc6 = new (char (*))[n];
Before I present my analysis for these strange things, I should warn you that despite my mail-address below I'm not at all a "C++"-Guru. Well, I started to work with C in 1981 and with C++ around 1989, but as thorough as my understanding of C is today, I'm regularly overwhelmed by the numerous ways in which C++ can be used.
In case it matters: My C++ translation environment is Comeau C++ v3.0 12/01/91, more or less a plain port of AT&T USL C++ Language System v3.0 09/15/91, i.e. Stroustrup's cfront, which compiles C++ into C and uses the local C compiler to proceed.
Doing some experiments with this compiler and analyzing its C output reveals several things, which may give an idea what happens:
1) only the first of the above statements (pc5) compiles correctly, the second (pc6) is a syntax error in my version of cfront.
2) Adding another level of parentheses:
    char *pc5 = (new (char *))[n];
does not change the generated C Source.
3) If the address returned from new is subsequently printed, the following two statements yield the same value:
    char **ppc5 = & new (char *)[n];
    char **ppc5 = new (char *) + n;
(Of course, the above two statements must be compiled into separate programs that differ only in the above statements.)
I think the major clue is the additional parentheses I added in step 2, and judging from your expertise in explaining C and pointers, I think I don't have to add any more words about what the above means and why pc5 will contain garbage with a probability of about 100%. Step 3 was only meant to verify my analysis from step 2.
Finally, I tried to parse the original statements of your April column by hand using the grammar from the Annotated C++ Reference Manual, but without success. Though I did not spend enough hours to be sure that my hand-parsing was absolutely correct, I think both statements are syntax errors and I strongly suppose a C++ compiler had to reject both.
Martin Weitzel
A
As readers may have noted in last month's column, the mystery behind the two statements was solved. As the judge might say, "case is closed." However, your letter reinforces a few points that deserve repetition. C++ is a much more complex language than C. The ++ symbol represents more than just an increment of +1. I rather think it implies a factor of 10 to the first power. In fact, it is so complex that even with the base definition in the ARM, compiler writers have difficulty producing compilers that handle code consistently, as this problem has shown.
Another point is to avoid complicated declarations. The C committee took a long time to ensure that the use of typedefs produced no unexpected results. If you use typedefs in C or classes in C++, you can avoid many of these problems. For example, if you use:
    typedef char * CharPointer;
the need for parentheses disappears entirely. You can then allocate with:
    CharPointer *pc = new CharPointer[n];
The KISS (Keep It Simple (really) Simple) principle makes for more understandable code.
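As a small illustration of the typedef approach, the sketch below wraps the allocation in a pair of helper functions. The names make_pointer_array and free_pointer_array are hypothetical, introduced only for this example; the typedef and the new expression are the ones shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef char *CharPointer;

// Allocates an array of n CharPointer slots, each set to a null pointer.
// The result has type CharPointer * (that is, char **), with no
// parentheses needed in the new expression.
CharPointer *make_pointer_array(std::size_t n) {
    CharPointer *slots = new CharPointer[n];
    for (std::size_t i = 0; i < n; ++i) {
        slots[i] = 0;
    }
    return slots;
}

// Releases the array itself; the strings the slots point to are not owned
// by the array in this sketch and are left alone.
void free_pointer_array(CharPointer *slots) {
    delete[] slots;
}
```

Because the array form of new was used, the matching release is delete[] rather than plain delete.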