May 1995/We Have Mail

Departments

We Have Mail

Editor,
In the "We Have Mail" section of the November 1994 issue of CUJ Jim Pazarena asks for an article on command-line parsing. Like Jim I get the impression that most magazines, hardly able to digest and report on all the hype and goodies OOP, C++, standardizing committees, and other developing areas have to offer, forget about the ever-existing bunch of newcomers to the subject of their original outset, in this case the good old plain C language. Let me try to fill in the gap with a description of the command-line processing module we use in our software-development department. As in most cases it was written out of a need to standardize the way programmers each on their own "invented" and parsed command-line arguments.
First we had to agree to a standard "look and feel" for our command-line arguments. Nothing new was invented here. We only had to define the allowable subset of all possible styles. Command-line arguments should be position independent; that is, a program shouldn't force the order in which arguments are given on the command line. So all arguments have to be identified by a name. As we develop our programs for both VMS and UNIX platforms, the allowable argument prefixes were a slash (/) and a dash (-). Though most UNIX command-line arguments are made up of a single character, we chose the more descriptive style of VMS. (Also on UNIX you'll see various commands that need more descriptive arguments, like find . -name args.c -print).
To prevent the user from having to type superfluous characters, the programmer may help the user by allowing shortcuts. If a viable argument would be /verbose, the programmer may indicate that his program understands /ver as well. You do so by defining the argument as /ver*bose, which means that the characters after the asterisk (*) are optional.
Arguments that need an additional variable, e.g. filenames and dates, are separated from the argument by the equals (=) character, as in /input=text.lis.
The UNIX convention of concatenating multiple one-character switches, as in tar -xvf filename, wasn't adopted, mainly because of the possible conflicts with other argument names. On most serious BBSs various flavors of getopt.c can be found which handle this type of command-line argument. Lastly, the argument parsing should be case-insensitive.
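The conventions described so far (a / or - prefix, an asterisk marking the optional tail of a name, an = before the value, and case-insensitive matching) can be illustrated with a minimal sketch. This is not the code from the monthly code disk; the function name arg_match and its interface are assumptions made purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the letter's actual code): match one
 * command-line word such as "/VERB" or "-input=text.lis" against a
 * pattern such as "ver*bose", where characters after '*' are optional.
 * Returns NULL if the word does not match, otherwise a pointer to the
 * value after '=' (an empty string if no value was given).
 */
const char *arg_match(const char *pattern, const char *word)
{
    int optional = 0;   /* becomes 1 once the '*' has been passed */

    assert(pattern != NULL && word != NULL);

    /* every argument must start with one of the two allowed prefixes */
    if (*word != '/' && *word != '-')
        return NULL;
    word++;

    for (;;) {
        if (*pattern == '*') {          /* the rest of the name is optional */
            optional = 1;
            pattern++;
            continue;
        }
        if (*word == '\0' || *word == '=')
            break;                      /* end of the name the user typed */
        if (*pattern == '\0')
            return NULL;                /* user typed more than the name */
        if (tolower((unsigned char)*pattern) != tolower((unsigned char)*word))
            return NULL;                /* case-insensitive mismatch */
        pattern++;
        word++;
    }

    /* stopping early is only allowed inside the optional part */
    if (*pattern != '\0' && !optional)
        return NULL;

    return (*word == '=') ? word + 1 : word;
}
```

With the pattern "ver*bose", the words /ver, /verb, and -VERBOSE all match, while /ve and /verbx do not. A real module would also have to decide what to do when two patterns accept the same abbreviation; this sketch leaves that out.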
Along with the code for parsing arguments [available on the monthly code disk — pjp], I supply a typical program that expects two mandatory arguments, /r*eal and /t*ext, and three additional arguments, /i*nteger, /date, and /ver*bose. The program starts with a call to either ArgInit or ArgInitPrgm. The arguments to main, argc and argv, are passed, as well as the number of mandatory arguments. ArgInitPrgm also expects a pointer to a FILE_t structure in which the various parts of argv[0], the fully qualified program name, are stored for any further processing (like determining from which directory the program originates). The ArgInit functions do some preliminary checking and set up some internal variables and structures.
The command-line arguments are then handled by calls to ArgGet, passing the name of the argument, the type of the argument's variable, and an optional buffer in which the value of the argument's variable is to be stored. If the argument is of type ARG_DATE, the third parameter should be a format string specifying the expected date format of the argument's variable. These formats and the accompanying date functions are part of another library which might be the subject of an upcoming article in which all discussions about dates, day-of-week algorithms, week-number algorithms (sorry PJP, yours is terribly wrong; try 01-01-1993, which should yield 53!), Zeller's congruence, and the like can be wrapped up.
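The week-number remark can be checked with a short sketch. It assumes the ISO 8601 convention (weeks start on Monday, and week 1 is the week containing the first Thursday of the year), which is consistent with the value 53 cited above; the letter itself does not name a convention. The function names are illustrative and are not taken from the date library mentioned in the letter.

```c
#include <assert.h>

/* Day of week by Zeller's congruence (Gregorian calendar).
 * Returns the ISO weekday: 1 = Monday ... 7 = Sunday. */
int iso_weekday(int year, int month, int day)
{
    int k, j, h;

    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    if (month < 3) {        /* Jan and Feb count as months 13 and 14 */
        month += 12;        /* of the previous year */
        year -= 1;
    }
    k = year % 100;         /* year within the century */
    j = year / 100;         /* zero-based century */
    h = (day + 13 * (month + 1) / 5 + k + k / 4 + j / 4 + 5 * j) % 7;
    return (h + 5) % 7 + 1; /* Zeller's h: 0 = Saturday, 1 = Sunday, ... */
}

static int is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Ordinal day within the year, 1..366. */
int day_of_year(int year, int month, int day)
{
    static const int before[12] =
        { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
    int n = before[month - 1] + day;

    if (month > 2 && is_leap(year))
        n++;
    return n;
}

/* An ISO year has 53 weeks if January 1 is a Thursday, or if it is a
 * leap year and January 1 is a Wednesday; otherwise it has 52. */
static int weeks_in_year(int year)
{
    int jan1 = iso_weekday(year, 1, 1);

    return (jan1 == 4 || (jan1 == 3 && is_leap(year))) ? 53 : 52;
}

/* ISO 8601 week number, 1..53 (the sketch's assumed convention). */
int iso_week(int year, int month, int day)
{
    int week = (day_of_year(year, month, day)
                - iso_weekday(year, month, day) + 10) / 7;

    if (week < 1)                       /* belongs to last week of prior year */
        return weeks_in_year(year - 1);
    if (week == 53 && weeks_in_year(year) == 52)
        return 1;                       /* belongs to week 1 of next year */
    return week;
}
```

Under these assumptions, 1 January 1993 falls on a Friday and belongs to week 53 of 1992, so iso_week(1993, 1, 1) returns 53.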
If anything goes wrong, the ArgInit and ArgGet functions set an internal error variable to the value indicating the type of error and return this value as the result of the call. This internal variable giArgerror can be used by ArgErr which either returns the string value of the error or the command line argument in which the error occurred, whichever is appropriate.
Finally, ArgCheck may be called to check whether the user has specified misspelled optional or otherwise unknown arguments.
Argument types and errors, the FILE_t structure, and the function prototypes are specified in args.h, which should be included in all programs that make use of this command-line argument processing.
The definition and layout of command-line arguments remains a matter of personal taste, and therefore the subject of many endless and zealous discussions (by the way, which editor is really the best?). The set of functions described here at least enabled us to standardize both the style to which command-line arguments of our programs should conform, which makes for a much appreciated common user interface, and the way they are handled or parsed in our programs. Please don't hesitate to adapt them to your personal preferences.
Ton J. Teuns
Project Manager, Software Development
AKZO Nobel Information Service
The Netherlands
I agree that uniformity of style is more important than the specific decisions made, particularly for a group of cooperating programmers. — pjp
Dear Mr. Plauger:
I just read Rex Jaeschke's article (CUJ, November 1994) on the C9X Charter, and it hasn't been long since Bob Jervis' article (CUJ, October 1994) proposing C with Classes. Naturally, I have an opinion.
First, on adding classes to C, I don't care for the idea. To me, the language described looks more like C++ with limitations. The removal of constructors and overloading eliminates a lot of functionality. How does one implement a complex or matrix class without overloading? If there is no constructor, then initialization has to be separated from instantiation, making declaration of constants and preinitialization of globals virtually impossible. Also, the addition of the inherit keyword is a capricious and counter-productive change. I believe that it is very important to "minimize incompatibilities between C and C++" (see Additional Principle #9 in Jaeschke's article).
Second, on updating the C Standard, I am glad that we don't consider C a dead language. I am also glad that we like what we have, and don't wish to overhaul it. There are some simple things available in C++ that I regularly use when writing C programs. Naturally I have to use a C++ compiler, but I typically work with modest sized, non-object-oriented programs. I would really like to see these added to the C standard, and I expect many others would also appreciate them:
1. // comments. I have never forgotten to close them, and I don't have to worry about aligning the right side of a series of end-of-line comments. It also saves three characters " */" at the end of such lines.
2. The const modifier. In a prototype, this adds information and protection. Also, local consts are more controllable than #defines.
3. Inline functions. What is the greatest number of parentheses ever used in a #define macro?
Brian Tagg
You are not alone in your conservative approach to modifying Standard C. By the way, the const type qualifier has been in Standard C for many years now. — pjp
Editor,
It is with a great deal of regret that I inform you that I am dropping my subscription to CUJ. At one time it took me several days to read and digest the contents, pore over the ads for new products, and find nuggets to add to my arsenal of tricks. Now I can do this in one sitting, mostly looking at the ads. I do not consider C++ to be a step up, but rather a step away from C, which is what I do most of.
I have studied C++ and feel it's a cumbersome, RAM-heavy, resources-reliant method of doing structures and pointers. C++ is not my wave of the future. I can count the articles that have been useful to me over the past year using a 3-bit integer. I cannot justify the resources to continue a subscription that does not return value for investment. I will continue to "graze" the local Barnes & Noble and other stores for the monthly issues. If I see something of use, I'll get it then. I am shopping for a monthly/bi-monthly C/ASM journal. Know of any good ones?
Charles W. Reynolds
Senior Technical Developer/PF1
Professional Services
Tampa, FL.
charlesr@cftnet.com
You're asking the wrong guy, since I (clearly) still like CUJ. Sorry it doesn't meet your needs as well these days. — pjp
Dr. Plauger:
I have been a reader of CUJ for about a year and a half or so, ever since I learned I did know enough C to make the Journal interesting and understandable (sort of). I have recently come across a most confusing thing which stumps my more expert friends. Suppose one has a small project comprising two files (using either Borland C v4 or Turbo C v3). A global array is declared in both files:

File 1: char some_array[some_size];
File 2: extern char *some_array;
I always thought that the two declarations ought to be equivalent; that some_array in File 2 should point to the first byte of the array some_array in File 1. After all, some_array in File 1 is just a pointer. However, such is not so. As written, the pointer some_array in File 2 is an uninitialized pointer with no apparent connection to the memory allocated in File 1.
On the other hand, when the declarations are written like this, everything works fine:

File 1: char some_array[some_size];
File 2: extern char some_array[some_size];
Alternatively, this form also works:

File 1: char *some_array;
        .....
        some_array = (char *)malloc(some_size);
File 2: extern char *some_array;
Why must one maintain consistent declaration styles for arrays across modules? (Not that this is a bad idea, but I thought I knew what was what.)
Thanks for a publication that is useful and challenging to the non-professional programmer who nonetheless needs to know how to write code in a work-related capacity.
Sincerely,
Jim Willemin
Geology Department
St. Lawrence University
Canton, NY
Your confusion is endemic among C programmers. Pointers and arrays really are different creatures. It just so happens that C doesn't let you manipulate arrays in expressions with the same facility it grants to the scalar types (arithmetic types and pointers). As a convenience, practically every occurrence of an array in an expression gets quietly changed to a pointer to the first element of the array. C programmers become so accustomed to using pointers and arrays interchangeably in expressions that they soon forget the important distinction between them as objects. When matching up declarations, as in your examples, no such implicit conversion occurs from arrays to pointers. Hence, the declarations must match. — pjp
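The distinction can be seen within a single file. In the sketch below, SOME_SIZE, some_ptr, and the two helper functions are names invented for the illustration. The array is an object of SOME_SIZE bytes, while the pointer is a separate object that merely holds an address. Had File 2 declared extern char *some_array, the compiler would have treated the first few bytes of the array's contents as if they were a stored address.

```c
#include <stddef.h>

#define SOME_SIZE 16    /* illustrative size, not from the letter */

/* Definition: an array object occupying SOME_SIZE bytes. */
char some_array[SOME_SIZE];

/* A matching declaration, as another file would write it. The size
 * may be omitted, but the type must still be "array of char". */
extern char some_array[];

/* extern char *some_array;    would NOT match: it describes a pointer
 *                             object, and no such object exists.    */

/* In an expression the array name is converted to a pointer to its
 * first element, so this initializer is valid. The result is stored
 * in a separate pointer object. */
char *some_ptr = some_array;

size_t array_bytes(void)   { return sizeof some_array; } /* SOME_SIZE */
size_t pointer_bytes(void) { return sizeof some_ptr; }   /* sizeof(char *) */
```

The operand of sizeof is one of the few places where an array is not converted to a pointer, which is why the two helper functions return different values on most systems.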
Mr. Plauger,
I read your January Editorial with great interest. I live in Bolton, MA and recently went through the same experience. I have also recently started using the WWW. It's exciting how much information can be accessed. I was wondering if the Journal was going to get on the net and put up a home page. I would really be interested in indexes of back issues. I have a collection going back to 1989. It would be nice to download code also.
Keep up the great work!
Larry Prucha
Last time I visited R&D Intergalactic Headquarters in Lawrence, KS, there was considerable discussion about how to get a presence on the Web. There is also considerable confusion in the field. I'm sure it's a matter of time before a connection happens, but I can't tell you how much time right now. — pjp
Editor:
I was pleased to see an article on Linux in your fine magazine, but I was rather disappointed at the quality of it. For instance, comp.os.linux is gone, having become a whole hierarchy. The X Window System isn't called "X-Window," and it is certainly capable of much higher resolution than a measly 1024x768x256. Creating a Linux partition will certainly not "destroy all data on your hard disk!"
I'm not sure how useful an article like this is: under-edited, somewhat misleading and dated. And if the reader had access to FTP, surely he would be better off reading the excellent HOWTOs and FAQs.
Regards,
Mark Hahn
hahn@neurocog.lrdc.pitt.edu
Dear Rick Roberts,
With great interest I read your article about Linux (as far as I know the first such article in CUJ). But I have the following comments:
You talk about a complicated method of shutting the system down (multiple syncing, typing halt, etc.). Most modern distributions automatically install a program (named ctrl+alt+del with execute permission for all users at the console) that shuts the system down (better to say: reboots it) when the corresponding key combination is pressed. As soon as you see "done," you can switch off your computer. In addition, by default there is normally /etc/update running, which performs a sync every 30 seconds automatically. So I guess there is no need for such a complicated procedure.
The problem with your memory is a little mysterious to me. I installed Linux on many different computers with 4, 8, or 16 MByte RAM. The only time I encountered a problem was with a notebook with 4 MByte (and even this problem could be solved). Some distributions (e.g. Slackware) require the setup of a swap partition prior to installation if memory is tight (4 MByte).
You need not completely delete your hard disk for a Linux installation. The first possibility is of course to use a free partition (if present), the second to buy a second hard disk (booting Linux from hard drive #2 through LILO is no problem!), and lastly you can install Linux on a DOS partition without deleting any DOS files!
Sincerely,
Klaus-Peter Nischke
klaus@nischke.do.eunet.de
It's really hard for a magazine to report on "net happenings" such as the Linux phenomenon. Status can change hourly, in some cases, and we have a pipeline measured in months. Certainly you should scan FAQs to get the latest dope on any topic. But please don't judge us, or our reviewers, too harshly for a lack of depth or timeliness. (I am a nominal expert of long standing in the computer field, and my ignorance of many aspects of it is bottomless.) We're just trying to make useful resources known to a wider audience. — pjp
Dear Dr. Plauger,
Your answer to Tom Leith's question in the December issue recalled my struggles with the same problem. I too abandoned type checking here with void *, but your answer got me thinking again, and I've found a simple, fully typed solution. Your suggestion of pointers to structures was the key, but rather than returning a pointer to a structure containing the pointer to the function, I pass the pointer to the structure as an additional argument, and fill in the new state function when I return. Also, whenever I have a family of functions called indirectly, I declare the functions and pointers with a macro, so their interfaces are consistent and maintainable.
Listing 1 shows an example state machine that reads integers.
J. Greg Davidson
Institute for Software Research and Development
San Diego, CA
vis!greg@ucsd.edu
An interesting style for writing event-driven code. — pjp
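Listing 1 is not reproduced here. As a rough illustration of the technique the letter describes, the following minimal sketch passes a pointer to the machine's structure to every state function, which fills in the next state before returning, and declares all state functions through one macro so their interfaces stay consistent. It is not Mr. Davidson's code; every name in it is hypothetical, and it handles unsigned decimal integers only.

```c
#include <ctype.h>
#include <stddef.h>

struct machine;

/* One macro declares every state function, so the whole family keeps
 * the same fully typed interface. */
#define STATE_FN(name) void name(struct machine *m, int ch)

typedef STATE_FN(state_fn);     /* the function type shared by all states */

struct machine {
    state_fn *state;    /* function that handles the next character */
    long value;         /* integer currently being accumulated */
    long last;          /* most recently completed integer */
    int count;          /* number of integers completed so far */
};

static STATE_FN(between);
static STATE_FN(in_number);

/* Between numbers: wait for the first digit. */
static STATE_FN(between)
{
    if (isdigit(ch)) {
        m->value = ch - '0';
        m->state = in_number;   /* fill in the new state before returning */
    }
}

/* Inside a number: accumulate digits until a non-digit ends it. */
static STATE_FN(in_number)
{
    if (isdigit(ch)) {
        m->value = m->value * 10 + (ch - '0');
    } else {
        m->last = m->value;
        m->count++;
        m->state = between;
    }
}

/* Run a string through the machine. Returns how many integers were
 * read and stores the last one (0 if none) through *last. */
int count_integers(const char *s, long *last)
{
    struct machine m = { between, 0, 0, 0 };

    do {    /* the terminating '\0' is fed in too, to end a final number */
        m.state(&m, (unsigned char)*s);
    } while (*s++ != '\0');

    if (last != NULL)
        *last = m.last;
    return m.count;
}
```

No void * and no casts are needed: the structure holds a pointer of the exact function type, and each state function receives the structure it is allowed to update.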
Dear Mr. Plauger,
I enjoyed Bob Stout's article in the March 1995 issue. What he is describing is a technique called medial averaging (averaging about the median). I have used this technique for a number of years in a time-series forecasting program. His implementation using ring buffers for filtering is very well done. In Bob's application, if there is sufficient oversampling, he may want to drop the next highest and next lowest observations as well. In my application, I run into a problem of not knowing how many observations I have to process (the user can change the number of data points used on the fly), so I use a brute force approach and keep track of array indices to delete. Thanks for a useful article!
Dave Wadsworth
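The idea of dropping the extreme observations before averaging can be sketched as follows. The sketch assumes that "medial averaging" here means discarding a fixed number of the highest and lowest samples and taking the mean of what remains. It sorts a plain array rather than using the ring buffers of Bob Stout's article, and the function name and interface are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison function for qsort: ascending order of doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical helper: average the n samples after discarding the
 * `drop` lowest and the `drop` highest values.
 * Note: the samples array is sorted in place.
 */
double medial_average(double *samples, size_t n, size_t drop)
{
    double sum = 0.0;
    size_t i;

    assert(samples != NULL && n > 2 * drop);  /* something must remain */

    qsort(samples, n, sizeof samples[0], cmp_double);
    for (i = drop; i < n - drop; i++)
        sum += samples[i];
    return sum / (double)(n - 2 * drop);
}
```

With drop set to 1 this removes only the single highest and lowest observations; setting it to 2 also removes the next highest and next lowest, as suggested above for heavily oversampled data.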