Dear Bill:
In your July 1994 issue of CUJ, Sean Corfield indicates that we derived our PC-lint for C++ from our very successful PC-lint for C. That's true enough, and it's not a secret, but why does this make our product unable to "do justice to the problematic edifice that is C++"? Surely it should not be the path taken but the end result that matters. And if the path taken does matter, why shouldn't the better path be through code reusability? While some of us practice code reuse, evidently others denounce it.
By deriving from the C version of PC-lint, we are able to inherit its extensive checks for C-type coding blemishes, such as unused or uninitialized variables. Certainly most of the warnings our C++ users get are C-type warnings rather than warnings that pertain to C++ alone. Many of our users have mixed suites of C and C++ code which need to be processed together so that any good lint facility must be truly polymorphic with respect to the two languages.
Since Sean Corfield represents a competitor's product (QA C++) perhaps this is a good opportunity to suggest a face-off comparison between the two products (by the proverbial independent laboratory) to see how the two approaches compare.
Sincerely,
James F. Gimpel
President
Gimpel Software
3207 Hogarth Lane
Collegeville, PA 19426

P.J. Plauger:
I was wondering about the info that was in a recent issue of The C Users Journal. In the back of the magazine is an article ["Code Capsules: Variable-Length Argument Lists," by Chuck Allison, CUJ, February 1994, p. 103] about adding commas into a numeric string to "format" the output really nicely. Well, I dunno why the author went about it the way he did, but it's kinda long. While my version (see Listing 1) might not use numerics (you pass a string containing the number and it returns a formatted string with commas), it's a lot shorter. It's from a program I released a little while ago on Compuserve and internet sites called FREE 3.
This code will work for any number from 1 to 100,000,000,000 or even higher depending on the size of the string Work[nn].
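Since Listing 1 does not appear alongside the letter, here is a minimal sketch of the string-based approach it describes; the function name commafy is invented, and the letter's Work[nn] buffer is rendered here as Work[64]:

```c
#include <string.h>

/* Sketch of the technique described: take a string of digits and
   return a copy with commas inserted every three digits from the
   right. Not Mr. Merideth's actual listing. */
char *commafy(const char *num)
{
    static char Work[64];            /* the Work[nn] buffer of the letter */
    size_t len = strlen(num);
    size_t out = 0;
    for (size_t i = 0; i < len && out + 1 < sizeof Work; ++i) {
        Work[out++] = num[i];
        size_t rem = len - i - 1;    /* digits still to copy */
        if (rem > 0 && rem % 3 == 0)
            Work[out++] = ',';       /* comma before each group of three */
    }
    Work[out] = '\0';
    return Work;
}
```

As the letter notes, the upper limit on the number is set only by the size of Work: commafy("100000000000") yields "100,000,000,000".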
Keep up the good C.
Greg L. Merideth
V.P. MeriTech N.J. Inc.
Montclair N.J.
meritech@delphi.com
gxm1949@sol.essex.edu
Compuserve: 72164.1534

Dear P.J.,
I have a complaint about the scanf, fscanf, and sscanf functions. There seems to be no way to partially scan a field, skip any remaining characters, and then move on to the next field. I can do the following:
char array[2][3];
char record[] = "Hello world\n"; /* one of many possible two-field
                                    records with variable-length fields */
sscanf(record, "%s %s", array[0], array[1]);
/* result: array[0] and array[1] overflow */
sscanf(record, "%2s %2s", array[0], array[1]);
/* result: array[0] [He\0] and array[1] [ll\0] */

Both of the above scans fail to meet my objectives: to capture only the first two characters of each field in record, and to prevent array overflow. I would like a new format convention to handle the above situation:
sscanf(record, "%2*s %2*s", array[0], array[1]);
/* result: array[0] [He\0] and array[1] [wo\0] */

This seems like a logical extension of the current format conventions. After all, %*s skips an entire field, so %[width]*s should assign [width] characters and then skip (not assign) any remaining characters. Likewise, it would be handy to have %*[width]s skip all but [width] characters at the end of a field.

Of course, now that I have your book The Standard C Library, I can build a custom scanf function, but that misses the point. The standard scanf, fscanf, and sscanf functions should have a practical partial field scanning method!
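In the meantime, the proposed %[width]*s behavior can be approximated with a small helper; this is only a sketch, and scan_partial is an invented name, not a standard function:

```c
#include <ctype.h>
#include <stddef.h>

/* Sketch of the proposed %[width]*s behavior: copy at most max-1
   characters of the next whitespace-delimited field into dst, consume
   the rest of the field, and return a pointer just past it. */
const char *scan_partial(const char *src, char *dst, size_t max)
{
    size_t n = 0;
    while (isspace((unsigned char)*src))   /* skip leading whitespace */
        ++src;
    while (*src != '\0' && !isspace((unsigned char)*src)) {
        if (n + 1 < max)
            dst[n++] = *src;               /* keep only the prefix */
        ++src;                             /* but consume the whole field */
    }
    dst[n] = '\0';
    return src;
}
```

Applied twice to "Hello world\n" with three-character destination arrays, it yields [He\0] and [wo\0], the result the letter asks for, with no overflow.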
Perhaps the X3J11, WG14 or whatever C Standards committee could take a look at this.
Thanks for a great magazine, and keep up the good work.
K.J. Halliwell
222 S. Virginia Lee Rd.
Columbus, OH 43209

There's not much love lost within the C Standards committees for the scanf family of functions. I don't expect there'll be much support for making them even more complex, given their limited utility in robust programs. You may be better off writing your custom version that does what you want. pjp
Dr. Plauger:
This rambling letter contains some ideas that you or your readers might find interesting.
1. I encountered a problem when I installed your ldiv function from The Standard C Library into our embedded application. My troubles stemmed from the desire to call ldiv from a background process. Experience has taught me to be careful where reentrancy is concerned, so I looked closely at the assembler code generated by our compiler (see Listing 2). The left column is the literal source code; the right column is the equivalent of what the compiler generated. Your compiler may vary.
The silent static allocation prevents ldiv, and any other functions that return structures, from being reentrant. This is one gotcha that I'm glad I found before run-time testing started! A superficial review of my reference books failed to turn up any warnings about reentrancy problems; in fact, I didn't find the word "reentrancy" at all. Does the implementation described above conform to the ANSI C standard? Does ANSI C make any guarantees about reentrancy? Perhaps the descriptions of the standard library functions could be amended to indicate which functions are required to be reentrant.
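One common fix for this class of problem, sketched here under an invented name (ldiv_r is not a standard function), is to have the caller supply the result structure so no hidden static is needed:

```c
#include <stdlib.h>

/* Reentrant variant of ldiv: the caller provides the result structure,
   so there is no silently allocated static to share between a
   foreground and a background context. */
void ldiv_r(long numer, long denom, ldiv_t *result)
{
    result->quot = numer / denom;
    result->rem  = numer % denom;
}
```

Because every invocation writes only through the caller's pointer, the function is safe to call from an interrupt or background process, at the cost of a slightly clumsier calling convention.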
2. We have elected to embed an SCCS ID string in each source module such that it will also be present in the executable image. We originally did this by initializing a static volatile const char array to the ID text string. This variable was declared at the top of the module, outside of any functions. However, the compiler noticed that it wasn't used anywhere and optimized it out of existence. I thought the volatile type qualifier was supposed to prevent this sort of thing. We eventually found that a static function that returned a const char pointer to the ID text string was permitted to persist even though it was never called.
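The two forms described might look something like this; the ID text and the function name are invented for illustration:

```c
/* The array form, which the optimizer discarded despite the
   qualifiers, since it is never referenced: */
static volatile const char sccs_id[] = "@(#)module.c 1.4";

/* The form that survived: a static function returning a pointer to
   the ID string persisted in the image even though it was never
   called. */
static const char *sccs_id_fn(void)
{
    static const char id[] = "@(#)module.c 1.4";
    return id;
}
```

Whether either form is preserved is a quality-of-implementation matter; nothing in the Standard obliges a compiler to keep an object whose value is never used by the program.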
3. I learned a principle of medical testing and diagnosis from my wife that I believe is germane to software engineering. Every test can be characterized by its sensitivity and specificity. To define these words in our terms: high sensitivity indicates that most bugs are caught by the test; high specificity indicates that most non-bugs are passed by the test. These are typically competing interests. A test that always fails has 100% sensitivity and 0% specificity; a test that always passes has 100% specificity and 0% sensitivity. The trick is to achieve a useful balance. These are simple concepts, but I find that giving them names has made me more aware of them.
4. You're probably getting tired of hearing everybody's pet ideas on how to improve C; but here's mine anyway. Many of the embedded programs I've worked on would have benefited from a new type qualifier to enforce write-only access to variables (sort of an anti-const). This is mainly because of peripheral devices that overlay read-only and write-only registers. My (nonportable) technique is to overlay the peripheral device with a union of two structures named read and write. The read structure is declared as a const struct; but there is no similar way to enforce proper write-only access to the write structure. You should be permitted to combine const and anti-const in a single declaration to define an inaccessible variable. This would be useful for padding structures.
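The overlay technique described in item 4 might be sketched as follows; the register layout, names, and address are invented, and note that only the read side gets const protection, which is exactly the complaint:

```c
/* Overlay a peripheral with a union of read-only and write-only
   views. Layout and address are invented for illustration. */
struct dev_read {                  /* read-only registers */
    unsigned char status;
    unsigned char rx_data;
};
struct dev_write {                 /* write-only registers: C offers no
                                      "anti-const" to protect these */
    unsigned char control;
    unsigned char tx_data;
};
union dev {
    const struct dev_read read;    /* const enforces read-only access */
    struct dev_write      write;   /* nothing enforces write-only access */
};

#define DEV (*(volatile union dev *)0x40001000)  /* hypothetical address */

/* DEV.read.status;        OK: reading a read-only register        */
/* DEV.read.status = 1;    error: const catches the mistake        */
/* DEV.write.control = 1;  OK: writing a write-only register       */
/* DEV.write.control;      compiles, though it arguably should not */
```

A write-only qualifier would close the remaining hole; combining it with const, as the letter suggests, would also give a clean way to declare inaccessible padding fields.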
I look forward to your comments,
Don Bockenfeld
Flight Visions, Inc.
43W752 Route 30
Sugar Grove, IL 60554

Items 1 and 2 are arguably compiler bugs. Item 3 is useful nomenclature and item 4 is a useful convention, whether or not it's worth making part of the Standard C language. Thanks. pjp
Dear editor,
While reading Chuck Allison's "Code Capsules" column in the December 1993 C Users Journal, one feature of the interface to the bitstring class, namely the find member function, struck me as perhaps being a little "goto-ish" in what I would have thought was essentially a collection class. The commonest use of this function, I would imagine, is to iterate through the various bit positions of a bitstring and apply some operation. A trivial example here would be extracting all the prime bit positions that are set in one bitstring and inserting those into a new bitstring.
const int MAXSETLENGTH = ...;
bitstring set(0L, MAXSETLENGTH);
...
bitstring primes(0L, set.length());
...
const int ONE = 1;
for (size_t bitnr = 0; bitnr != NPOS;
     bitnr = set.find(ONE, bitnr)) {
    if (isprime(bitnr)) {
        primes.set(bitnr, ONE);
    }
}

It is fairly obvious that this is a findfirst, findnext style of iterator, which I would suggest is particularly vulnerable in a multithreaded environment. (For example, what happens if the bitstring is modified by another thread between one invocation of find and the next?) Of course each client of such a class can explicitly use mutual exclusion facilities (semaphores, monitors) to prevent these problems, but would it not be preferable to at least allow the class to perform such mutual exclusion transparently as part of its implementation?

My suggestion, which allows an implementation this freedom, is to make the iterators essentially atomic, i.e., a single indivisible operation. One possible interface, with inspiration from Smalltalk and Actor, is the following:
typedef void (*VAPPLYFN)(size_t bitnr, void* args);

class bitstring {
public:
    ...
    // For each bit in bitstring that is set(1) invoke
    // applyMe() with that bit number and pass through
    // a pointer to an arbitrary argument block.
    void doForSetBits(VAPPLYFN applyMe, void* args);
    ...
};

which can be used to recode the previous example:
const int MAXSETLENGTH = ...;
const int ONE = 1;
bitstring set(0L, MAXSETLENGTH);
...
bitstring primes(0L, set.length());
...
static void addToPrimesSet(size_t bitnr, void* args)
{
    bitstring* primesp = (bitstring*) args;
    if (isprime(bitnr)) {
        (*primesp).set(bitnr, ONE);
    }
}
...
set.doForSetBits(addToPrimesSet, &primes);

One imagines that the first thing doForSetBits might do is ensure mutual exclusion, which incidentally would save the acute embarrassment of the case where the two bitstrings above were actually identical. (This could be detected by recording which thread had the object locked, and either throwing an exception if that thread tried locking the object again or implementing a copy-on-write scheme where a read lock was promoted to a write lock.)

A full implementation along these lines might include member functions something like:
// For each bit in bitstring that is either set(1) or
// clear(0) invoke applyMe() with that bit number and
// pass through a pointer to an arbitrary argument block.
void doForSetBits(VAPPLYFN applyMe, void* args);
void doForClearBits(VAPPLYFN applyMe, void* args);

// For each bit in bitstring that is either set(1) or
// clear(0) invoke applyMe() until applyMe() returns a
// non-zero result or the bitstring is exhausted.
// Return zero(0) if the iteration terminated by
// exhaustion; non-zero otherwise.
typedef bool (*APPLYFN)(size_t bitnr, void* args);
bool doForSetBitsUntilTrue(APPLYFN applyMe, void* args);
bool doForClearBitsUntilTrue(APPLYFN applyMe, void* args);

Clearly, if the bitstring class were to benefit from this approach, any (more) general collection class would also benefit. A possible interface might be along these lines:
template<class Type>
class GenericCollection {
public:
    ...
    void doIt(void (*f)(Type& item, void* args), void* args);
    int untilTrue(int (*f)(Type& item, void* args), void* args);
    ...
};

In any case, I think C has demonstrated that the ultimate success of a programming language is due in no small part to the utility of its (standard) library, and that those responsible for designing the standard class libraries for C++ may well determine the success or otherwise of the language to a far greater extent than the committee which designs the actual language.

Yours sincerely,
J Sainsbury
3/42 Ryan St
Hill End Q4101
Australia
061 7 844 0285

Your last observation may very well be right. Certainly, quite a few members of the committees standardizing C++ are worrying about just the issues you raise in your examples. pjp
Dear Mr. Editor,
As a new subscriber to the CUJ, I recently received my first copy of your magazine. As a relative newcomer to the C programming language I was a little disappointed to find the contents focused almost entirely on C++.
I suppose the answer is that all the interesting stuff that can be done in C has already been covered, and is to be found in your back issues. As Bruce Dickey in your letters section says, I know I will have to learn C++ sooner or later (not meant to be an accurate quote), but for the time being, I want it to be later. I mean, I want to thoroughly get the hang of C ordinaire first, and I was kind of hoping your magazine would help me do this.
Yours faithfully
B.R.Oldham
13 Windermere Road
Hucknall, Nottingham
England NG15 6NF

Yeah. We keep striving for an equitable balance between C and C++, but we don't always succeed, certainly not to everyone's taste. pjp
Dear Editor,
I have enjoyed reading the informative articles in your publication. As you are aware, object-oriented programming has become all the rage in the last several years. I have found the articles by writer Dan Saks helpful in learning about the details of polymorphism, classes, objects, inheritance, operator overloading, constructors, and other features of OOP. However, the big picture with regard to OOP is still unclear in my mind. I ask, "why bother?" What does the computer programmer gain by subjecting himself to the considerable additional mental gymnastics required by OOP? Can you or one of your readers suggest either a text or an article in a periodical that directly addresses this issue? I am an engineer who occasionally writes single-module file format conversion programs.
Many thanks for your assistance.
Yours truly,
Lee Shackelford
At the level of complexity you are currently programming, OOP is of marginal utility. The payoff comes in managing larger programs, and in writing code that has a greater chance of being reused unchanged. Just keep reading and programming. pjp
Dear Sir:
John W. Ross's article in the April '94 CUJ struck a chord. I wrote a program for compressing and decompressing batches of files, based on routines by Al Stevens in the February '91 and October '92 issues of Dr. Dobb's Journal. The later versions of his routines improved performance and storage requirements by saving only node pointers above the leaves of the Huffman tree in the output file. I found some other ways to improve performance that may be of interest to other readers.
First, I do away with recursion in encode. Second, I use an array of structures, indexed by ASCII codes, that combines a counter and a bit accumulator. Each counter stores the number of nodes between leaf (character) and root; each accumulator encodes the path through the nodes. encode traverses the Huffman tree at most 256 times (many fewer for text files) to fill the array, rather than once for every character in the input file. It then reads characters from the input file and sends accumulator bits to emit under control of the counter. Performance improved a good bit with that approach.
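The table-driven approach described above might be sketched like this; the names and tree representation are invented for illustration, and Listing 3 contains the author's actual code:

```c
#include <stddef.h>

#define NSYMBOLS 256

/* Invented tree layout: nodes 0..NSYMBOLS-1 are the leaves, one per
   ASCII code; parent is -1 at the root (or for an unused leaf). */
struct huffnode {
    int parent;   /* index of the parent node */
    int bit;      /* 0 or 1: which child of the parent this node is */
};

/* One entry per ASCII code: a bit accumulator plus a counter, as the
   letter describes. */
struct code {
    unsigned long bits;  /* path bits; bit 0 is nearest the leaf */
    int           len;   /* number of nodes between leaf and root */
};

/* Walk from each leaf to the root once, recording the path, so the
   tree is traversed at most NSYMBOLS times instead of once per input
   character. To emit a symbol, send bits len-1 down to 0. */
void build_code_table(const struct huffnode *tree,
                      struct code table[NSYMBOLS])
{
    for (int sym = 0; sym < NSYMBOLS; ++sym) {
        unsigned long bits = 0;
        int len = 0;
        for (int n = sym; tree[n].parent != -1; n = tree[n].parent) {
            bits |= (unsigned long)tree[n].bit << len;
            ++len;
        }
        table[sym].bits = bits;
        table[sym].len  = len;
    }
}
```

With the table filled in, the encoder simply reads each input character and feeds table[c].len bits from table[c].bits to the output routine, which is the performance win the letter reports.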
The enclosed listing (Listing 3) is my version of the two routines in question, with their names changed to match those of Ross's routines. (DWORD means unsigned long.) The most important change, though, was allocating a 32Kb output buffer. The Turbo C fwrite routine failed miserably when used with a buffer; it didn't report an error when writing to a full disk, instead routing output straight to the dumper. I don't know whether that is normal for fwrite or a fault in the implementation.
To obtain timely notification of all errors, an earlier version of the program used no output buffering at all, writing code bytes directly to the compressed file. You won't be surprised to hear that it limped along until I shot it. Maintaining my own buffer also lets me use _write, which devolves into little more than a DOS call. Tracking the buffer status and the amount of free space remaining on the disk is more work, but is amply repaid in improved performance.
Though Huffman compression is not at its best when applied to EXE files, I've gotten good results with other binary types, such as database and spreadsheet files. Those who use Ross's programs to compress records that may contain ASCII nulls should be aware of a subtle bug in the listings: encode uses a child node of zero as a terminal condition, but that node should represent ASCII null bytes in the input file. Furthermore, bldtree skips the ASCII null leaf altogether. The fix is to change bldtree to:
initialize entire htree to -1
set h1 and h2 to -1 at the top of the while(1) loop
use "if (h2 == -1)" to determine the root node

and to change encode() to:

test for "ht[h].parent != -1"
eliminate "if (child)"

Finally, I don't trust Mr. Ross's method, because it requires that auxiliary files holding the Huffman tree and the record indexes be created for each compressed file. Those files should be included when calculating storage requirements for the compressed file; worse, decompression becomes impossible if either of them gets damaged or erased. I wonder whether dynamically compressing and decompressing individual records is fast enough to recommend it over working with an uncompressed file, compressing it only for archiving. Compressing an entire file makes it possible to store the Huffman tree in the compressed file and makes an index file unnecessary; delays imposed by associated file operations occur only at the beginning and end of the work session.

Sincerely,
Richard Zigler
PO Box 152
McBain, MI 49657-0152

Dear Sir:
The technique Matt Weisfeld describes in the April '94 issue, vectoring Windows messages through pointers-to-function returned from a search routine as an alternative to large switch statements, is very useful. Variations on this theme have been independently rediscovered by many developers, and it or something like it is now the preferred technique for message handling in Windows programming. Only the inertia of existing code and Petzold's Windows "bible" keeps the super-switch statement in business.
When compiling a switch statement, the Microsoft compilers have always generated an efficient indirect branch through a jump table, but only when the case values are contiguous (that is, an unbroken range of integers). If so much as one value leaves a gap in the range, the compiler reverts to a lookup table. Last time I looked, this table was generated in the order in which the case values appear in the source code, and is scanned sequentially. So the effective code is much the same as Mr. Weisfeld's, though I agree with him that separate functions are more easily read, understood and maintained by humans.
Another advantage that function vectoring has over a switch is that the vector table can be modified at run time: you can add or remove vectors as the program's state requires. You can keep two or more vector tables and activate one or another at different places in the program; in effect, you can choose among multiple "window procedures" at different times.
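The vectoring technique might be sketched like this; the handler names are invented, and the table stands in for the searched structures Mr. Weisfeld describes:

```c
#include <stddef.h>

/* Map message IDs to handler functions through a small table searched
   at dispatch time. Message numbers are the usual Windows values;
   handlers and return codes are invented for illustration. */
typedef long (*HANDLER)(unsigned msg, unsigned wparam, long lparam);

struct vector { unsigned msg; HANDLER fn; };

static long on_command(unsigned m, unsigned w, long l) { return 1; }
static long on_paint  (unsigned m, unsigned w, long l) { return 2; }
static long on_default(unsigned m, unsigned w, long l) { return 0; }

/* Most frequent messages first, per the linear-search advice below. */
static struct vector table[] = {
    { 0x0111 /* WM_COMMAND */, on_command },
    { 0x000F /* WM_PAINT   */, on_paint   },
};

long dispatch(unsigned msg, unsigned wparam, long lparam)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
        if (table[i].msg == msg)
            return table[i].fn(msg, wparam, lparam);
    return on_default(msg, wparam, lparam); /* stand-in for DefWindowProc */
}
```

Because table is ordinary data, entries can be added, removed, or re-sorted at run time, which is the flexibility a compiled switch statement cannot offer.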
Two comments on his sidebar, "Choosing a Table Search Algorithm." First, binary search is not always the optimum algorithm for repeated lookup, if you can predict the content of the message stream. As Weisfeld points out, a binary search takes nearly log2(n) steps to find the "average" message. In Windows work, WM_COMMAND messages will come along much more often than, say, WM_COMPACTING. If WM_COMMAND is placed at the head of a linear search table, the search algorithm will hit it immediately, where a binary search would have to do log2(n) steps.
Arranging the vector table for best hit rate would typically be done in source code, but you could also put code in the program to monitor the number of hits on each entry and occasionally re-sort the table so that the most commonly-used messages bubble up to the head of the list. (You might want to do this while developing the program, and then put a static table in the production release sorted from the early experience.)
Second, if a message is not in your table (and should be handled by DefWindowProc), both linear and binary searches must go through their entire algorithm (n and log2(n) steps, respectively) before they can announce failure and return the DefWindowProc vector. The trick is to find a cheap, fast test that quickly eliminates (most) messages not in your table, and only bothers with the table search if there's a good chance the message is there. Finding that cheap, fast test for an arbitrary set of integer values is a knotty problem I haven't solved yet. If anyone knows how, I'd give a pretty penny to hear about it!
Sincerely,
Davidson Corry
1404 SW 126th St.
Seattle, WA 98146
Dear Dr. Plauger,

I was puzzled by Matt Weisfeld's article, "An Alternative to Large Switch Statements" (April '94 CUJ). Matt's argument for lookup tables over the switch statement is based on his statements that his approach is "more manageable" and "provides a code savings." I don't find either statement to be true.
His actual example uses many more lines of code for function prototypes, the table itself, and the lookup code, than the switch statement in Listing 3. Further and more important, it is harder for someone reading his code to find and understand what response is given in a given case.
If we use his function prototypes and split the behavior in the various cases into functions, the resulting switch statement becomes both manageable and shorter than his table mechanism:
case WM_COMMAND: {
    switch (wParam) {
    case WM_FILE_NEW:     p_file_new();     break;
    case WM_FILE_OPEN:    p_file_open();    break;
    case WM_FILE_SAVE:    p_file_save();    break;
    case WM_FILE_SAVE_AS: p_file_save_as(); break;
    . . . etc.

Now the switch statement is the same size as his table, and there is no overhead of search code, structure definition, etc. It is obvious to someone reading the code what is taking place. Certainly (in most realistic cases) creating separate functions for each case, rather than having them embedded in the switch statement, does add readability; but it is this, rather than the table, which provides the benefit.

The statement that "You can also speed up table searches by using smarter algorithms" is also irrelevant to most Windows programs. Windows switch statements rarely have more than 30 or 40 cases, and the performance improvement gained by using a binary search rather than a linear one would be inconsequential at best.
In sum, I feel that recasting the switch statement above into a table is a net loss on all fronts, not a gain.
Sincerely,
Wahhab Baldwin
15127 NE 24th St. #129
Redmond, WA 98052

Obviously, opinions vary. pjp
Dear Mr. Plauger,
Though I am a new reader of C Users Journal, I am highly impressed by the quality and effort you put into it to serve C/C++ users. My friend (who actually told me about C Users Journal) was telling me yesterday, after looking at the code for "Variable-Length Argument Lists," that he spent a full night writing such code for his project; his comment about Chuck Allison's code in the February 1994 issue is that it is far more efficient and smaller than what he wrote.
Anyway, now I am a permanent subscriber to the magazine and will be benefiting a lot from it.
As I am new to the magazine, I don't know what was already printed recently about C/C++ in C Users Journal. I did a check on Compuserve but, since there was no topic index on a monthly basis, I could not download anything. Actually I am interested in any code/article about hybrid syntax-directed (screen-oriented) editors for editing C/C++ programs.
I will appreciate if you can help me in that regard and respond to me via email.
Thanks a lot.
Gulzar Mohammad
71042.2751 @ compuserve.com
mohammad%eee1.dnet.@ed8200.ped.pto.ford.com

[A machine-readable index is available from R&D Publications. Call (913) 841-1631, or write to R&D Publications, 1601 W. 23rd St., Lawrence, KS 66046. FAX: (913) 841-2624. e-mail: cujsub@rdpub.com mb]
Mr. Plauger:

I wanted to comment about Bruce Dickey's letter. I agree about the quality of the compilers that have been available. A recent program I wrote over the last two weeks was about 9,000 lines (including comments), and the final tally was four specification errors, one programming error on my part, and seven compiler errors which required work-arounds. During a much larger previous project our record was 18 compiler errors in a single week (but we had some almost every week); I averaged about one bug per month (gamma-level code, in both cases).
My experience has been that, except for the JPI compiler, all of the DOS and Windows compilers by virtually all of the vendors (we may have missed testing one or two, but we tested most on the market recently) are bug-ridden and simply fail to work one way or another. Unfortunately, the JPI compiler may be history since Clarion took it over. Nothing new has emerged in quite some time. Most of these DOS/Windows compilers will not run correctly under OS/2, a major problem for many programming shops who use OS/2 as their basic development platform.
All of the compilers except for the Microsoft version 8.00 compiler failed to handle the huge model correctly, and most would not allow us to select the model from the command line. Our organization has never written a program using any other model; they are simply too restricting. But when we moved to the 32-bit OS/2 compilers (and hopefully NT compilers, but we don't support NT so I can't speak for those), the quality improved drastically. It appears that the effort to support those different segmentation models has repercussions throughout the entire compiler, quite possibly taking effort away from ensuring correctness while quintupling the size of the code generator to handle all of the different memory models.

However, as to Bruce's statement that "C and C++ violate the accepted guidelines of language design," I must take strong exception. I totally disagree that there should only be one way to accomplish something in a programming language. Wirth is responsible for this absurdity (and for several others). I completely refuse to use any language designed by Wirth; he just doesn't know what he is doing. Let me give you a concrete example. I was surprised a couple of years ago to find that a "top-notch" consultant was unable to handle the equivalence
not (A and B) == (not A) or (not B)

I have since made that a part of my interview questions. In all of the interviews I have conducted since then (I don't see people until after obvious bozos have been eliminated), less than 10% have been able to handle negation. And most of those were mathematicians by training; only one or two individuals with a computer science degree could handle the problem (which I had previously thought was as trivial as breathing and never gave it a second thought in programs). And only one individual in my experience had sufficient training in abstract thinking (such as predicate logic, abstract algebra, or pure set theory) to even be allowed anywhere near class design for O-O applications (and he was one of our Smalltalk contractors, so none of the interview candidates have had the training needed).

I have designed a file called standard.h[pp] over the last decade or so (and placed it in the public domain, but it is not widely distributed at this time). It is dedicated to portability and to "fixing" the C/C++ language as far as possible. It almost always takes the opposite approach to that taken by Wirth, but it has been very successful. Two sets of definitions contained therein are:
#define When(x)   if(x)
#define Unless(x) if(!(x))

#define While(x)  while(x)
#define Until(x)  while(!(x))

The fundamental assumption here (for the first set) is that these "control structures" are not used whenever an else clause is required. But the control structures are duals, and they "hide" the negations involved in the semantics of the keywords. Even untrained people have no difficulty in understanding what these do, but if presented in the "basic" form, even many experienced programmers stumble. Thus increasing semantic richness leads to more understandable programs and fewer bugs (in my experience).

The essential point is that semantic richness and variation is much more important than minimalism when it comes to designing control structures. While Dijkstra has shown that only looping and branching are needed to implement any program, and Smalltalk has shown that only polymorphic message passing is required (by implementing the former using the latter), others have shown that some programs will grow exponentially in size for any fixed set of control structures.

My study of control structures has led me to identify nearly 50 control structures used in programming (admittedly some are rare, and some I have only seen in assembly language). Most of these are not directly supported by any existing language (yet), but it is also true that it is possible to eliminate some of these control structures by expanding code and using other control structures. But, in programming, as in mathematics, notation is everything. A minimal notation is not adequate for good programming. A concise notation is as much an aid to understanding as simplicity, and oftentimes more so. Complexity (many control structures or many operators) does not automatically make programs easier to read and write. However, properly applied, increased semantic richness with a high level of internal consistency can make programs easier to read and write compared to a "minimal" language.
What's even worse is that if all of these variants of control structures are not taught to computer science students, then their thinking and programming is restricted by their mental language map. I have seen programmers drop into assembly language to implement a finite-state machine, not because they couldn't do it efficiently in Pascal (or C) but because they didn't even know what one was. Another time I was hired by one company where two programmers had been working on an "insoluble" problem for three months. I completed the program in a couple of weeks; they simply didn't know what a stack was! Semantic richness is a much better design principle for languages than minimalism, but generally it can't be done by committee, or you simply wind up with an unbearably complex language that offers no benefit (like Ada).
Michael Lee Finney
71573.1075@compuserve.com

An inevitable consequence of rapid growth in a field is turmoil. I am not the least surprised that compilers, and programmers, vary widely in quality these days. The interesting thing to observe is that even buggy compilers and ill-trained programmers can earn good money, in a market with a broad enough mix of needs. pjp
P.J. Plauger:
I have been reading C Users Journal for some time. It's a great place to get to the meat of C and C++ programming.
I have finally bitten the bullet and am taking formal classes in C programming since I didn't have the discipline (or cooperation of family) to do it on my own. I now have needs in C tools I overlooked when getting my lowly PowerC compiler by Mix software. I find I need a fair-to-good public-domain cross-reference utility and, if possible, a public-domain Lint-like utility. Why public domain? Because until I learn C well enough to subcontract work, I have to buy all my tools and compilers myself and I'm a little short on cash right now.
Could your readers help me out on where to find good stuff which is less expensive than my wife's last overcoat?
Phil Burke
TDE&C/CG&L
401 Church St., 8th Fl.
Nashville, Tennessee 37243-1533

I trust you know about the treasure trove of software to be found in the C Users Group shareware collection. Look for ads near the center of this magazine. pjp
[A free catalog is available from R&D Publications mb]
Dear Mr. Plauger:
I am trying to write some utilities which display several statistics about the platform they are running under. I recently tried unsuccessfully to get information about the Personal Computer CPU. I am trying to find out if the computer is an 80286, 80386, 80486... and am using the following code fragment. This code is telling me that all platforms I run it under are model 0xFC, submodel 0x1, which is a PC/AT, PC/XT-286, or a PS/2 Model 50 or 60.
#if (defined (__MSDOS__))
//
// Model
// F8H  PS/2 Models 70 and 80
// F9H  PC Convertible
// FAH  PS/2 Model 30
// FBH  PC/XT (later models)
// FCH  PC/AT, PC/XT-286, PS/2 Models 50&60
// FDH  PCjr
// FEH  PC/XT (early models)
// FFH  PC "Classic"
union REGS regs;
struct SREGS segregs;
char SysModel;
char SubModel;
char BIOSrev;
char ConfigFlags;

regs.h.ah = 0xc0;
int86x(0x15, &regs, &regs, &segregs);
if (peekb(segregs.es, regs.x.bx) < 6)
    sprintf(Machine, "Unknown");
else {
    SysModel    = peekb(segregs.es, regs.x.bx + 2);
    SubModel    = peekb(segregs.es, regs.x.bx + 3);
    BIOSrev     = peekb(segregs.es, regs.x.bx + 4);
    ConfigFlags = peekb(segregs.es, regs.x.bx + 5);
    printf("SysModel=%x SubModel=%x BIOSrev=%x ConfigFlags=%x\n",
           SysModel, SubModel, BIOSrev, ConfigFlags);
    SysModel = peekb(0xf000, 0xfffe);
    printf("f000:fffe emits model %x\n", SysModel);
}
#else
.....
Mark Pumphrey
GTE Federal Systems
5000 Conference Center Drive
Chantilly, VA 22021-3808
pumphrey@europa.eng.gtefsd.com
This is certainly not my forte. Anybody? pjp
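For what it's worth, the BIOS model byte identifies the machine family rather than the CPU proper, which may be why every machine reports the same value. Once the byte is in hand, though, decoding it is just a table lookup. Here is a minimal sketch in portable C, using only the values listed in the letter's own comments (the function name is invented here):

```c
/* Map the BIOS model byte (from F000:FFFE, or via INT 15h AH=C0h)
   to a machine name. The table is taken from the letter above. */
const char *machine_name(unsigned char model)
{
    switch (model) {
    case 0xF8: return "PS/2 Models 70 and 80";
    case 0xF9: return "PC Convertible";
    case 0xFA: return "PS/2 Model 30";
    case 0xFB: return "PC/XT (later models)";
    case 0xFC: return "PC/AT, PC/XT-286, PS/2 Models 50&60";
    case 0xFD: return "PCjr";
    case 0xFE: return "PC/XT (early models)";
    case 0xFF: return "PC \"Classic\"";
    default:   return "Unknown";
    }
}
```

Identifying the CPU itself (80286 vs. 80386 vs. 80486) requires a different approach, typically probing the behavior of the flags register, since the BIOS model byte predates most of those processors.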
Mr Plauger:
I wholeheartedly agree with your editorial (CUJ, Feb. '94) regarding the need to revise the C Standard as soon as possible. I have used C++ for a few years, and I hate its complexity. I keep returning to Turbo Pascal for little programs. Mastering C++ is almost impossible, unless one gets paid to do so.
I believe the C++ standardization committee made many mistakes by including everything for everyone in the language. I would have preferred a stronger OOP C to the mess that templates, exceptions, and multiple inheritance gave us. Maybe a good language architect will take C++, get rid of all the fat, and give us C with OOP.
I think the new C should incorporate a few OOP constructs. Turbo Pascal can be used as a reference on what to leave in: inheritance, virtual functions, constructors/destructors. C++ code could be examined to see which "features" really get used. Will we have a real string type? Maybe somebody will come up with the right way to add it to C.
C with OOP. What a strong contender for C++. I would switch immediately. No question about that.
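The core of what the writer asks for, inheritance and virtual functions without the rest, can in fact be approximated in plain C with structs and function pointers. A minimal sketch (all names invented here, not from any proposed language):

```c
/* A base "class" carrying one virtual function. */
struct Shape {
    double (*area)(const struct Shape *self);
};

/* Inheritance by embedding: the base struct comes first,
   so a Square pointer can be viewed as a Shape pointer. */
struct Square {
    struct Shape base;
    double side;
};

static double square_area(const struct Shape *self)
{
    const struct Square *sq = (const struct Square *)self;
    return sq->side * sq->side;
}

/* A hand-written "constructor". */
struct Square make_square(double side)
{
    struct Square sq = { { square_area }, side };
    return sq;
}

/* Dynamic dispatch through the base pointer. */
double area_of(const struct Shape *s)
{
    return s->area(s);
}
```

This is essentially what early C++ translators generated behind the scenes; a "C with OOP" would mostly be syntax and type checking wrapped around this pattern.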
P.S. God forbid the garbage collector!
Adolfo DiMare
adimare@ucrvm2.ucr.ac.cr
Opinions, and feelings, are pretty strong about what should be in C and C++. On the basis of many years' experience, I can say with certainty that each committee does its level best to satisfy a host of conflicting requirements the best way it knows how at the time. I can say with equal certainty that revised Standard C will aim for a different balance than Standard C++. People will argue for years afterward about which answer is more nearly correct. pjp
Mr. Plauger,
Do you have any suggestions on resources (such as concepts, techniques, articles, source code, or other people I might contact) that could help me learn more about EXE encryption/protection techniques? This might also include techniques on defeating reverse engineering, using timer interrupts, etc. I've tried every source on Compuserve or Internet that I could find, to no avail. I've also searched CDROMs and library articles but have found no information.
I'm aware of the products such as PKLITE or PROTECT which use EXE compression/encryption techniques, but I want to learn more about how it is actually done and how I might implement something similar.
Thanks...
Skip Moon
Moon Microsystems
1920 Gunbarrel Rd Ste 1014
Chattanooga TN 37421-3169
Compuserve: 73377,710
Anybody? pjp
Dear Mr. Plauger,
I read C. Justin Seiferth's letter (CUJ, Feb. '94) concerning errors and/or misprints in source code, lack of installation instructions, and no indication of what environment is needed to compile the code (make files, directory configuration, etc.).
I've also had the same problems with software packages and books. I won't mention names either, but a package I recently received and tried to compile couldn't find several include files until I copied them to the base directory for my compiler. During the link, some libraries (COS.LIB, EMU.LIB, MATHS.LIB) could not be found until moved.
Apparently, some developers use their favorite editor and an environment setup for "making" their applications, while others use the integrated development environment (IDE) that comes with their compiler. When they send or sell their software, they assume everyone else is using the same setup they use.
I've also purchased several books on developing applications in C, and when I try to enter and compile the examples, errors of various kinds occur due to errors and/or omissions in the published code. Is there some way to have the publishers/authors supply an addendum with the corrected listings, without having to buy the source code on disk provided by the publishers/authors?
Thank you,
Ray Hansen
Compuserve: 72261,2104
Brian Kernighan created his famous "Hello, world" program to deal with just these issues. It makes you focus on all the details of getting a compile and link in a new environment, without having to worry about complexities in the code proper until later. Even when you try hard to make a portable package, it's easy to overlook many presumptions. And most people have never been motivated to try hard. There are certainly few, if any, standards for a program development environment. Until such time as they come along, you'll have to deal with each publisher, author, or vendor on a case-by-case basis. pjp