May 1998/We Have Mail

Departments

We Have Mail

Letters to the editor may be sent via email to cujed@mfi.com, or via the postal service to Letters to the Editor, C/C++ Users Journal, 1601 W. 23rd St., Ste 200, Lawrence, KS 66046-2700.

Dear Mr. Plauger,
I am writing concerning article source code in the January 1998 issue of the C/C++ User's Journal ("Creating Active Data Types via Multithreading," by Patrick Tennberg).
The specific problem is the repeated use of C++ code of the form:
EnterCriticalSection
Do some thread-safe processing
LeaveCriticalSection
This code structure should be avoided whenever the called functions in the "do some processing" section can raise exceptions. Such exceptions will cause the critical section to remain forever locked, blocking all other threads, and locking up the application (if you're lucky) or causing any of a long list of hard-to-debug multithreading problems. This is true for critical sections, mutexes, and most other locking mechanisms.
I think a better protection scheme is to encapsulate the lock/unlock mechanism into the constructor and destructor of a separate C++ class and to create a local variable of that class type in the functions that need to be thread-safe. The presence of the unlocking call in the class's destructor insures that resources are freed during stack unwinding if an exception occurs. The Microsoft MFC class CSingleLock is a good example of such a dedicated data protection class.
I don't know anything about STL, but I can guess that the data structure manipulations in the article's code can (at a minimum) raise memory allocation exceptions. I think C++ programmers must move away from the Lock/Process/Unlock sequence for thread-safe functions if they are going to work in an exception-enabled environment.
I usually would not write in about something like this, but I have been burnt by similar problems, and I think that it is important to pay attention to the details when source code is being published and its conventions may prove influential on the programming community.
I always seem to find something valuable in your magazine which makes it worth the investment — keep up the good work.

Allen Broadman
allenbro@ix.netcom.com

Dear PJP:
As a long time reader of CUJ, I am able to keep informed on the progress of the C++ standard, and am excited about power offered by the STL. Unfortunately, until IBM releases the new version of Visual Age C++, I can't turn excited into code. I am in the early stages of development of a system (dare I say framework) that I expect to be building on for the next several years. I would like to base this early design work on the C++ Standard. Since I hear that IBM plans to use your Standard library implementation, who better to ask for a suggestion? Could you recommend a public domain (or cheap) source for STL that I can use until I can get yours from IBM?

Eric Raaen Hughes STX
eric.raaen@gsfc.nasa.gov

There are lots of free or cheap STL versions out there — a web search will get you to several — but few come close to matching the C++ Standard. Microsoft Visual C++ is your best bet until IBM ships, if you can afford to invest in yet another compiler. — pjp
ObjectSpace provides a free version of STL in its Standards<Toolkit> (also free). Visit http://www.objectspace.com/devtech. — mb

Editor's note: The following letter from Danny Kalev is in response to Jerry Liebelson's letter in the April 1998 issue of CUJ. In that letter, Liebelson was commenting on Kalev's suggested technique for simulating C++ enums in Java (see "Porting a C++ Application to Java," CUJ, February 1998). We regret that we didn't receive Kalev's letter in time to include it in the April issue of CUJ.

Dear Jerry,

I would like to thank you for your suggestions regarding Java Equivalents to C++ enums, which indeed solve some of the drawbacks of a class-based solution as described in my article, such as the enormous number of tiny source as well as .class files required, and type safety. I find your suggestions quite impressive and innovative. I'm afraid, however, that even these variations would have been impractical in our project for the following reasons:
C++ enum types are both very high level in terms of software engineering and very low level in terms of efficiency (i.e., they are very efficient):

the programmer does not have to care about the actual numeric value of an enum nor does he need to specify it on definition; in Java the latter does not hold.

In C++, enums are strongly typed and hence safer to use than mere const ints. In Java only a class-based solution can ensure safety.

C++ enums are easy to maintain: their list of values can be extended without having to manually iterate through each and every value and recalculate it; the compiler takes care of that. In Java, the programmer must specify the enum values manually and recalculate them when extending the list of values.

In terms of performance, C++ enums are very efficient since they are automatically converted by the compiler to plain ints. Furthermore: an enum type in C++ needn't be identical in size to sizeof(int) (as in C). As a result, the compiler may optimize memory usage by squeezing the enum value to units smaller than int, such as short or char. There is also the possibility of storing the enum's value on a machine register, which may considerably increase performance even more. Java, it seems, cannot offer these advantages all at once. The programmer/designer is faced with the choice of compromising either efficiency or security.

Your suggestion reduces significantly the number of classes needed to be defined, but still, an interface + a full-blown class are required for each enum type (or at least one shareable interface + many classes — a class for each enum type). In large software projects, it may sum up to dozens and even hundreds of classes/interfaces, whereas in C++ all you need is a single header file! Moreover, a Java novice experienced with C++ may assume that a tiny class in Java is almost for free, which it definitely isn't: at least two long integers are required even for a tiny class: one serving as a class handle and the other as the class's pointer to the virtual table. (You're probably aware of the fact that Java methods are all virtual by default and so is equals(); Java does not support operator overloading so the programmer cannot overload == instead.)
The sad conclusion is that the preferred solution in most cases still remains a manually enumerated set of constants grouped in an interface, with all of the shortcommings involved. Moreover, I'm not very convinced that an average Java programmer would be glad to define a separate class + interface pair for each enum type (I wouldn't) instead of storing them in a single interface. I can only hope that future extensions of Java yet to come will also include enum types.
And once again, thanks for your interesting email.
Yours sincerely,

Danny Kalev

Leibelson replies:
Hi Danny,
Thanks for responding to my comments. I too regret the lack of enums in Java (and the lack of destructors, default arguments, ...), and yes, from the perspective of efficiency and convenience, there is no way in Java to match the capabilities of C++ enums. Thus the strategies we've been discussing are really analogs and not true equivalents.
I suppose if I were involved in a project such as yours, particularly if there were severe time constraints, I might very well opt for the solution your team chose.
I do have further thoughts, however, in response to some of your comments, which hopefully you won't feel is beating the subject to death:
"In Java, the programmer must specify the enum values manually and recalculate them when extending the list of values."
No question that enums are more convenient in this regard, but I don't follow what you mean by "recalculate." You do have to explicitly assign a new numerical value for each new enumeration but the previous declarations don't have to be touched.
"... still, an interface + a full-blown class are required for each enum type (or at least one shareable interface + many classes — a class for each enum type). In large software projects, it may sum up to dozens and even hundreds of classes/interfaces, whereas in C++ all you need is a single header file!"
Though my example showed an interface for each enum class, one can always just use a single interface containing the enumerated constants for all the enum classes in the constants package. Alternatively, one could group constants for related enum types together and arrive at smaller set of interfaces. One drawback to a single huge interface is that it often requires additional prefixes/suffixes to avoid naming conflicts among the constants.
"Moreover, a Java novice experienced with C++ may assume that a tiny class in Java is almost for free, which it definitely isn't: at least two long integers are required even for a tiny class: one serving as a class handle and the other as the class's pointer to the virtual table ..."
Two thoughts about this and efficiency in general:

1) I don't know for sure, but it's possible that Java compilers might optimize this situation somewhat if the class were declared as final. (In fact I think an enum class should always be declared final anyway.)

2) Though it breaks the encapsulation and opens the potential for abuse, one could add getValue() and equals(int) methods to allow for use in switch statements:
public int getValue()
{ return this.value}
public boolean equals(int other)
{ return this.value == other;}
I think this is tolerable as long as public and protected methods which use the enums still define their formal arguments with the enum class type. Ah, but now there are two more methods to write for each enum class. (See my next comment.)
"Moreover, I'm not very convinced that an average Java programmer would be glad to define a separate class + interface pair for each enum type (I wouldn't) instead of storing them in a single interface."
If it's a just a matter of the tedious effort involved, one could employ a Perl script to generate the class and interface given the class and interface names, an initial value and the set of enumeration names:
makeEnum -class TransactionType
  -interface TransactionTypeValues
-start 1 SELECT INSERT UPDATE ....
Now this is now starting to beat the subject to death! :-)
"I can only hope that future extensions of Java yet to come will also include enum types."
No argument here on that point.-jl

Dear CUJ,
In the [February] 1998 issue of C/C++ Users Journal you published an article called "Decision-Making with Production Systems" by Dwayne Phillips. I downloaded the source code and had been experimenting with the program. I added a function to print the list of hit rules for each step of the program. This is what I got for the first two steps:
Context :
  friday dt6 shift3 wedds1d1-busy
IF dt6 THEN shift-rollover
IF friday AND_NOT weekend THEN
  weekday
IF wed AND shift3 AND_NOT shift1
  THEN weds3
     
Context :
  shift-rollover friday dt6 shift3
  wedds1d1-busy
IF dt6 THEN shift-rollover
IF shift3 AND shift-rollover AND_NOT
  shift2 THEN day-rollover
IF friday AND_NOT weekend THEN
  weekday
IF wed AND shift3 AND_NOT shift1
  THEN weds3
Take a look at the last rule on the list. If you look at the context there is no wed present; it occurs only as part of wedds1d1-busy. The reason this rule is activated is because the strstr() function (which was used in is_a_hit() of the original source) doesn't distinguish between a whole word and a part of a word.
One way to fix this is simply to change anything that has wed in it to something else. Another way would be to look for a space or '\0' following the word and if there is none, not mark it as a hit.
Sincerely yours,

Yakov Shafranovich
yakov@compuserve.com

Thanks for the tip. Knowing Dwayne, I'm sure he'll be glad to hear you've been experimenting with his code, even though you found a bug. — mb

Dear P.J.,
Coincidentally, the week that I made my first real foray into using the STL, I read the letter by Chris Ahlstrom in the February 1998 issue complaining about the lack of readability and comments in the STL code shipping with Microsoft VC++ 5.0. I'd like to second his comments but add one of my own.
I always compile with the warning level on the compiler set to 4 (Microsoft's highest) to give the compiler the best chance of indicating problems with my code. I always fix the warnings so that when warnings appear in the compiler output I know it's associated with the latest changes I've made. I was somewhat dismayed when I added an STL header to clean compiling code and the compiler produced a huge stream of warnings. It turns out that fixing the warnings in the STL headers I was using was simple but I feel I shouldn't have to do this. If component software is to be successful, libraries must engender users with confidence in their quality and robustness. Headers that generate huge flows of warnings don't do it for me.
Thank you,

Tim Craig

Nor for me. On the other hand, I also can't abide compilers that issue warnings for usage that I consider perfectly acceptable. The problem is made worse by the ever wider use of templates, which often generate code for special cases that would be nonsense if hand coded.
I write code that is highly portable. When I feed it to a new compiler, I read all the warning messages. In response to any sensible warnings, I change the code so it gets even more portable and more robust — and avoids the warnings, of course. But every compiler still ends up issuing a slew of silly messages that I have to turn off by hand or just live with. The VC++ headers turn off all the warnings they can, unless you override the pragmas that do so. I don't like it either. — pjp

Dear [Mark Nadelson]
I have read through your article "Real-time Error Processing on a UNIX Network" (CUJ, March 1998) and have to admit that I have missed the point. Aside from being an interesting exercise in socket programming, what is the practical value of this approach?
Most of the BSD4.2+ and SVR4 based systems provide a syslogd facility which does what you are trying to implement and a little bit more (for instance, handle error messages depending on the severity, e.g. file away informational stuff, mail alert messages to the given user(s), send critical errors to the given host for the further processing, or run paging program on the critical errors). Please note that all these message redirection features are independent from your program (as they should be in a production environment). Basically, what you need is to have support folks agree on severity levels of your error messages and the facility you are going to be using to report them, and code away.
Good description of syslogd is given by W. Richard Stevens in Advanced Programming in the Unix Environment [Addison-Wesley, ISBN: 0201563177], but man syslogd should give you enough information to think about. For those of us who are stuck with multithreaded programming, xxx_r versions of the calls are available.
In the case I have missed the significant advantage of your homegrown wheel over syslogd facility, please have my sincere apology.

Alexandre Kovalenko
ADP Inc.

P.S. Opinions are my own and are not necessary shared by my employer.

Mark Nadelson replies:
Dear Mr. Kovalenko,
Thank you very much for reading my article. You are correct in coming to the conclusion that my article does pose an exercise in sockets programming. Part of the purpose of my article was to show how to create a C++ class which takes advantage of UDP sockets. I try to illustrate the drawbacks to UDP sockets, and how to overcome them (via threads/signal processing/and an acknowlegement back for the server).
You are also correct in the fact that you can use syslogd as a means of real-time error handling. There are some drawbacks to syslogd that I believe my approach overcomes. The first is that in /etc/services the port for syslogd is specified as UDP. This is a good approach because the application reporting errors can broadcast the message without syslogd being up and running. The unfortunate side-effect is that if syslogd is not running (for whatever reason) error reporting will be missed. My approach attempts to overcome this by queing and resending error messages unacknowleged by the server.
Another drawback is that you are limited to where the messages can go and what can be encoded within an error message structure. Using syslogd you can only have a single line of error message along with a severity code and you can only send the messages to certain mediums, such as disk, email, etc. Using my approach you can create a error message structure which includes severity, message text (maybe many lines of message text), the process id and host name of process which created the error message, and any other significant information (such as parameter list) which helps get to the bottom of a critical situation. With my approach you can also have the server easily display the information in a GUI, or store the messages directly into a relational database.
The final drawback to syslogd is that syslogd is limited to a UNIX server. Using my approach you can have processes running under Windows NT, as well as processes on Unix servers, reporting and or recieving error messages using the same class. (The code is slightly different on Windows NT in socket initialization and threading.) Due to the fact that more and more networks are becoming a mixture of the two, this is a significant difference.
I hope this clarifies the points in my article. If you have any additional questions please let me know. Thank you again for you interest
Sincerely,
Mark Nadelson
nadelsm@gcm.com

Dear Mr. Plauger:
Since you wrote much of the code for the Standard C++ libraries used in VC++ 5.0, I thought I'd run this one by you for your opinion. Listing 1 shows a code sample that will not compile unless the foo stream insertion operator is written to take a const foo& instead of the usual foo& parameter. This appears to be because of the call to copy which is insisting, somewhere within the templates, on an insertion operator with that signature.
Is this a bug in STL, a bug in Microsoft's implementation thereof, or unavoidable standard operating procedure? No reference I have seen mentions the need for an insertion operator requiring const parameters, although Nelson's book C++ Programmer's Guide to the Standard Template Library mentions an opposite problem of violations of const within STL. It doesn't pose any real problem to do it that way since an ostream insertion would seem usually to be logically const on the object being inserted anyway. But I want to hear it from the horse's mouth.
Thank you.
Sincerely,

Steve Cohen

Template set stores only const elements, so you can't destroy the order of the set by altering a stored value in place. I believe that is the source of the constness that you're running afoul of. On the other hand, it's generally good style and good hygiene to make a reference parameter const if the called function does not alter the stored value, so what the heck. — pjp