August 1995/We Have Mail

Departments

We Have Mail

Mr. Plauger,
I have greatly appreciated your articles in CUJ and the information they offer. I've been a professional programmer for almost a year now after graduating from UCLA and am always looking for more information about ANSI C. My employer prefers good ole K&R because C++ isn't standardized yet. What I have in my library right now is the book which you co-authored with Jim Brodie: Programmers Quick Reference Series: Standard C.
What I was wondering is if there are any other books you have written which go in depth into explaining the functions and libraries of Standard C? I mean, using <signal.h> and <errno.h> would never have occurred to me had it not been for the recent series of articles by Chuck Allison.
I guess what I am saying, Mr. Plauger, is that if I knew the Standard C functions like the back of my hand I would be a much more productive asset to my company. Any help from an experienced person like yourself would be tremendously appreciated.
Thanks,
Ken Onweller
kender @netcom.com
See my book, The Standard C Library, Prentice-Hall 1992. — pjp
Greetings Mr. Plauger,
In the March 1995 issue of CUJ, Edward B. Bell writes:
"I have an application for which I would like to generate random English pronounceable proper names. I have looked in the common sources (Knuth, Sedgewick, Wirth, etc.) for a suitable algorithm, but haven't had much luck. Have you ever come across such an algorithm or one that could be adopted?"
Basically, the idea is that from a given database of English proper names, strong statistical regularities can be derived from the distribution of names and symbols making up those names. Correlations between successive symbols in the database will become evident in the frequencies of symbol sequences.
Choose a substring length n and calculate the frequency distribution of each substring (n-gram) within the given database. The frequencies of the n-grams can be used to construct order n-1 models of the database, where the first n-1 symbols of an n-gram are used to predict the nth character. From this model, random text could be generated which reflects the entropy of the model, while maintaining the intra-word dependencies of the various substrings used to generate your original frequency distribution.
Some examples, taken from chapter 4 of Bell, Cleary, and Witten's Text Compression (Prentice-Hall 1990), which use text from a collection of American English text known as the Brown Corpus:

order-0: fsn'iaad ir lntns hynci,.aais oayimh t n , at oeotc fheotyi t afrtgt oidts0, wrr thraeoe order-1: ne h. Evedicusemes Joul itho antes aceravadimpacalagimoffie ff tineng arls, order-2: he ind wory. Latin, und pow". I hinced Newhe nit hiske by re atious opeculbouily order-5: number diness, and it also light of still try and among Presidential discussion is order-11: papal pronouncements to the appeal, said that he'd left the lighter fluid, ha, ha"?
For a more thorough discussion of the above examples, please refer to the above text. It is quite extraordinary.
James Scott
Micom Communications
jscott@micom.com
It certainly looks extraordinary, especially that amusing bit about the Pope and the lighter fluid. Thanks. — pjp
Dear Editor,
I read Brian Tagg's letter (April 1995) about the numeric conversion constants. In my company, we had the same problem, i.e. the multiplication of conversion constants:

#define METERS_TO_FEET 16.404 #define METERS_TO_YARDS 1.094 #define FEET_TO_METERS 0.305
Only to convert inches, feet, yards, miles, meters, centimeters and kilometers units between them, we need 42 conversion constants. Instead, I developed a powerful 50-line package, called SI_pack ("SI" stands for "Systeme International," the metric system). One of my preoccupations was the easiness of use. For example, to convert five meters to yards, the syntax must be as simple as:

nb_yards = convert(5, METERS to YARDS)
Implementation: We need to declare only three basic operators, plus a certain number of physical constants:

/* Basic operators */ #define convert(value,unit) value*((unit)) #define to )/( #define square(unit) (unit*unit) /* distance constants */ #define METERS (1.0) #define INCHES (0.0254) #define FEET (0.3048) #define YARDS (0.9144) #define MILES (1609.34) /* SI (Systeme international) constants */ #define GIGA (1.0e+09) #define MEGA (1.0e+06) #define KILO (1.0e+03) #define HECTO (100.0) #define DECA (10.0) #define DECI (0.1) #define CENTI (0.01) #define MILLI (1.0e-03) #define MICRO (1.0e-06) #define NANO (1.0e-09)
Syntax:

(float) convert( <value_in_unit_1>, <unit1> to <unit2> )
where <value_in_unit_1> can be either a constant or a variable. <unit1> and <unit2> are physical constants previously defined. The result is always a floating-point type (float or double, according to the type of <value_in_unit_1>). METERS is 1.0 instead of 1 to avoid having convert return an integer. Centimeters is written as CENTI*METERS, Kilometers as KILO*METERS, etc.
Readability: The best way to understand how it works is by examples. See for yourself how readable the code can be using SI_pack:

/* Defining sport constants (in feet) */ Baseball_distance_between_bases = 90.0; American_football_field_length = convert(100,YARDS to FEET); Canadian_football_field_length = convert(110,YARDS to FEET); Competitive_swimming_course_len = convert(50,METERS to FEET); /* Statistics about American football */ len = American_football_field_length; printf("%f meters\n", convert(len,FEET to METERS)); printf("%f centimeters\n", convert(len,FEET to CENTI*METERS)); printf("%f kilometers\n", convert(len,FEET to KILO*METERS));
Improvements:

/* Add superficies constants */ #define ACRES (square(MILES)/640.0) /* Add time constants */ #define SECONDS (1.0) #define MINUTES (60.0) #define HOURS (3600.0) /* Add weight constants */ #define GRAMS (1.0) #define OUNCES (0.035) #define POUNDS (453.592) /* Add liquid measure constants */ #define LITERS (1.0) #define PINTS (0.474) #define GALLONS (3.788) /* Other physical constants */ #define NEWTONS (1.0) #define HORSEPOWERS (550*FEET*POUNDS/SECONDS) #define JOULES (1.0) #define CALORIES (4.185)
Power of Use: We can combine physical units to get other units. For example, we can combine distance and time units to get speed units:

/* speeds are in miles/hour */ speed_limit_in_NewYork = 55; speed_limit_in_Vermont = 65; speed_limit_in_Quebec = convert(100, KILO*METERS/HOURS to MILES/HOURS); horse_speed = convert(20, METERS/SECONDS to MILES/HOURS); /* Compare speeds in the same unit */ if (horse_speed > speed_limit_in_Quebec) horse_get_a_fine(); /* get kinetic energy of a 10-horsepower engine moving at 55 mph */ rate_of_work = convert(10, HORSEPOWERS to NEWTONS/SECONDS) energy_in_joules = rate_of_work * convert(55, MILES/HOURS to METERS/SECONDS); /* get surface in square yards */ nb_square_yards = convert(nb_acres, ACRES to square(YARDS));
Efficiency: In the last example, the line is expanded in:

nb_square_yards = convert(nb_acres, ACRES to square(YARDS)); nb_square_yards = nb_acres * ((ACRES to square(YARDS))); nb_square_yards = nb_acres * ((ACRES) / (square(YARDS))); nb_square_yards = nb_acres * (((MILES*MILES)/640.0) / (YARDS*YARDS)); nb_square_yards = nb_acres * ((4046.87) / (0.836127)); nb_square_yards = nb_acres * (4839.96);
The parentheses ensure constant reduction by the compiler. The whole work in fact is done by the preprocessor and the compiler. At run time even a complex expression is equivalent to a multiplication by a constant. This SI_pack.h is not only easy to read, it is also extremely efficient.
Marco Savard
Computer Scientist
(savard@cae.ca)
Cute. — pjp
Dear Editor,
I would like to develop one of the points raised by Brian Tagg (Letters CUJ, April 1995). Tagg mentions loops of the following kind:

fgets(string,MAXLEN,stdin); while ( string[0] == '\n') { puts("Entering a blank line won't cut it.\n"); fgets(string,MAXLEN,stdin); }
As he correctly observes, the problem (if you believe there is one) lies in the repetition of the fgets call. A number of possible solutions were mentioned in the magazine. All of them have a serious deficiency. They damage the underlying structure. The clearest evidence of this damage is in Plauger's comment, "I always check to see if it returns a null pointer and I worry about whether it truncates a long line." If we add a place holder for that extra code then the original example becomes:

fgets(string,MAXLEN,stdin); /* Checking code */ while ( string[0] == '\n') { puts("Entering a blank line won't cut it.\n"); fgets(string,MAXLEN,stdin); /* Checking code */ }
The problem is now shown in its true form as the repetition of both the fgets call and the error checking code. None of the solutions proposed in the magazine is up to handling this.
For those sufficiently determined to remove this redundancy there is a solution. This kind of situation is described by Jackson (Michael Jackson, Principles of Program Design, ISBN 0123790506, Chapter 12, page 262) as the Common Action Tail. His solution, which is completely general, is:

cat_01: fgets(string,MAXLEN,stdin); /* Checking code. */ while ( string[0] == '\n') { puts("Entering a blank line won't cut it.\n"); goto cat_01 ; }
which executes in precisely the same way as the original. This suggestion will raise screams of protest because it contains the dreaded goto statement. Many will suggest that this robs it of that mystical property "structure." This is not true. The structure lies in the original text. The modification is an optimization to that structure intended to achieve four objectives. I have ordered them in my own priority sequence:

Implement the same function.

Be obvious.

Eliminate a possible future maintenance problem.

Reduce the size of the program.
It is not normal practice to use the goto statement for standard flow control purposes. Therefore it seems sensible to make use of this programming resource to scream out at the reader. The message is, "Something mildly unusual is happening here but do not worry, it is all under control." When I use the construct, I also choose to use the "cat" prefix to the label as a hint that the label and the goto implement a common action tail and nothing else. The same implementation may be used in any circumstance where, to quote Jackson, "We may be able to identify points in the text for which subsequent processing is identical although previous processing may have differed." The simplest illustration is in the switch statement:

switch(a) { case 1: /* Reads one line. */ x++ ; fgets(string,MAXLEN,stdin); /* Checking code. */ break ; case 2: /* Reads two lines. */ y ++ ; fgets(string,MAXLEN,stdin); /* Checking code. */ fgets(string,MAXLEN,stdin); /* Checking code. */ break ; case 3: /* Reads three lines. */ z ++ ; fgets(string,MAXLEN,stdin); /* Checking code. */ fgets(string,MAXLEN,stdin); /* Checking code. */ fgets(string,MAXLEN,stdin); /* Checking code. */ break ; }
where we may eliminate some repetition of the fgets calls, and the checking code, by transforming the code to:

ch(a) { case 1: /* Reads one line. */ x++ ; goto cat_01 ; case 2: /* Reads two lines. */ y ++ ; goto cat_02 ; case 3: /* Reads three lines. */
z ++ ; fgets(string,MAXLEN,stdin); /* Checking code. */ cat_02 : fgets(string,MAXLEN,stdin); /* Checking code. */ cat_01 : fgets(string,MAXLEN,stdin); /* Checking code. */ break ; }
I look forward to the views of other programmers.
Yours faithfully,
Richard Howells
I don't have a problem with the structure you end up with, just the way it's expressed. Having an unconditional jump out of a loop makes a lie out of the loop, violating your second and third principles. So I keep rewriting code, often half a dozen times or more, until it is unequivocably clear and correct. In this particular case, that would probably lead me to introduce a function that combines the reading and checking. I would then find that conventional C control flow neatly expresses the original problem. — pjp
Dear Editor,
This is a response to Juan Carlos Flores's letter asking for a way to create a modal dialog where the size, position, and controls of the dialog are read from a file and dynamically created (instead of being created from a template).
I don't think you can create a modal dialog without a resource template in Visual C++. But the dynamic modal dialog creation as explained above can be acheived by creating a modal dialog with an empty resource template (a resource template without any controls) and manipulating it in the OnInitDialog function which is called by Windows before the dialog is shown on the screen. In the OnInitDialog function the values of the position and size of the dialog can be read from the text file and set using the function SetWindowPos. The field definitions can be read from the text file and the controls can be constructed dynamically using operator new and created using the Create member function. You can obtain a pointer to the dynamically created controls by using the GetDlgItem function and providing the ID (used while creating the control) as argument.
Shyam Cherlopalle
pp002312@interramp.com
Thanks. You remind me yet again why I don't want to become a Windows expert. — pjp
Dear Mr. Plauger:
I work for a small engineering firm. Part of my job is writing software. In the past this software was used in-house only, but now we're starting to sell systems where the software is an important part. Our customers by a large majority are running IBM-PC compatibles and want Windows software. So what's the best path for learning Windows programming?
My background in programming includes Basic, Fortran, C, Pascal, and assembly. I've dabbled with Turbo C/C++ for Windows, Microsoft Visual C/C++, and Visual Basic, but haven't spent enough time with either one to become fluent. Other members of my firm are studying Delphi, VB, Borland C++, etc. but there is no clear corporate direction.
Do you or any of your readers have suggestions? Ease of learning is important, but power is also important — Visual Basic is rather crippled from our point of view.
Thanks,
Robert L. Smith
I'm hardly the person to answer your queries, but I fully expect many of our readers can. — pjp
Mr Plauger,
I picked up The C/C++ Users Journal, December 1994 issue, for the first time recently and read with interest Gregory Colvin's article on "Emulating C++ Exception Handling." I have used Borland C++ v3.1 (which does not have exception handling) for a couple of years now, and have begun a project using MS Visual C++ version 1.0 (which does). It would seem that C++ exception handling offers many advantages over other methods of dealing with low-level errors.
However, Mr. Colvin's method of emulating exception handling does not seem to follow the same model as that used by Visual C++. (I am not familiar with any standards that may exist in this area.) Mr. Colvin's code automatically deletes objects created with the current exception handler.
In Visual C++, however, no automatic object deletion is provided — this is left to the exception-catching code. A process that throws an exception is expected to clean up its own data before throwing. A process that calls a function that might throw an exception must clean up its own variables in its catch clause, and can then optionally throw the exception on, or handle it directly. This method seems more flexible while being simpler to implement.
I am hoping to emulate exception handling in Borland C++ v3.1, and would appreciate any comment on this. Is my understanding of the subject flawed?
Thank you for an excellent magazine; you have gained a regular reader.
Igor Siemienowicz
igor@novell.adacel.com.au
There are no standards for emulating exceptions, sadly. You can do it too many different ways, none of which is completely satisfactory. That, of course, is one of the reasons for putting such machinery directly into the language. — pjp