August 1993/We Have Mail

Departments

We Have Mail

Dear Dr. Plauger,
The article on date conversions by David Burki (February CUJ) has saved me considerable head-scratching in searching for efficient algorithms to use in a financial futures analysis program.
The functions ToJul and FromJul provided by Burki are considerably more compact than the functions Julday and Caldat given in Numerical Recipes [1], which perform similar date conversions. The Numerical Recipes functions use floating-point arithmetic, which makes them much slower on PC-compatibles without a math coprocessor (see Table 1 and Table 2). However, when a math coprocessor is present, the Numerical Recipes functions are faster than Burki's.
Burki's ToJul function is a direct C implementation of the original FORTRAN algorithm referenced in his article. The function's speed can be increased by more than a factor of five by replacing the two divide-by-4 operations with bit-level right shifts, and by replacing long int calculations with int ones where possible. Further speed improvement is gained by calculating the term (lmonth-14L)/12L once only, and using the result in three places in the main expression. The resulting function, ToJul2 (Listing 1), is more than 2.5 times faster than Numerical Recipes' Julday, even when a math coprocessor is present (Table 1).
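As a rough illustration of the common-subexpression idea, the sketch below evaluates the (month - 14)/12 term once and reuses it. It is not Listing 1: the name to_jul_sketch is hypothetical, everything stays in long arithmetic, and ordinary division is used instead of shifts. It also assumes that integer division truncates toward zero for negative operands, which older C compilers were not required to do.

```c
/* Sketch only: Gregorian date -> Julian day number, all in long arithmetic.
 * The function name is hypothetical; this is not the ToJul2 of Listing 1. */
long to_jul_sketch(int year, int month, int day)
{
    /* -1 for January and February, 0 otherwise (assumes truncating division) */
    long adj = (month - 14L) / 12L;
    long y = (long)year + adj;      /* year shifted so it starts in March */

    return (1461L * (y + 4800L)) / 4L
         + (367L * (month - 2L - 12L * adj)) / 12L
         - (3L * ((y + 4900L) / 100L)) / 4L
         + day - 32075L;
}
```

Calling to_jul_sketch(1, 1, 1) yields 1721426, the Julian day quoted in the letter for 1 January 1 A.D.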
The use of int expressions in ToJul2 means that the latest Gregorian date which can be converted is 29 February 27868, on machines where an int is 16 bits. This limit will be adequate for most purposes! The maximum date for Burki's ToJul is 31 December 32767. Both routines are valid only for dates which give positive Julian day numbers. Note that the day before Gregorian date 1 January 1 A.D. (Julian day 1721426) is Gregorian date 12/31/0 (Julian day 1721425), which is in year 1 B.C.
Burki's FromJul function can be modified in a similar manner to use bit shifts in place of one multiplication and two divisions. These changes, together with only using long int arithmetic where necessary, result in the new function FromJul2. This function is nearly 80% faster than FromJul. FromJul2 is still slower than the Numerical Recipes function Caldat when a math coprocessor is present, but is considerably faster when the coprocessor is absent (Table 2).
These new functions have been checked against the routines given by Burki and against the Numerical Recipes routines for all dates from 15 October 1582 to 31 December 5000. For dates before the adoption of the Gregorian calendar (15 October 1582), the Numerical Recipes routines assume a Julian calendar (although this is not explicitly stated in their text), whereas the functions given here and in Burki's article assume a proleptic (projected back) Gregorian calendar.
The day of the week can be calculated from the Julian day number using the expression (jul_day+1)%7. A zero result corresponds to Sunday, 1 to Monday, and so on, up to 6 for Saturday.
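The weekday expression can be wrapped in a small helper. This is a minimal sketch; the function name is hypothetical, and it assumes a non-negative Julian day number so that the % operator gives a non-negative result.

```c
/* Sketch only: 0 = Sunday, 1 = Monday, ... 6 = Saturday.
 * Assumes jul_day is non-negative. The name is hypothetical. */
int weekday_from_jul(long jul_day)
{
    return (int)((jul_day + 1L) % 7L);
}
```

For Julian day 2451545 (1 January 2000) the helper returns 6, a Saturday.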
Reference:
[1] W.H. Press, B.P. Flannery, S.A. Teukolsky and W.T. Vetterling. Numerical Recipes in C. Cambridge University Press, pages 10-13.
Table 1. Normalized execution times, relative to ToJul2, for functions which calculate the Julian day number from a Gregorian date. The speeds of ToJul and ToJul2 are unaffected by the presence of a math coprocessor. (* means with math coprocessor present).
Function  Execution Time

Julday    46.74
Julday*   2.74
ToJul     5.38
ToJul2    1.00
Table 2. Normalized execution times, relative to FromJul2, for functions which calculate the Gregorian date from the Julian day number. The speeds of FromJul and FromJul2 are unaffected by the presence of a math coprocessor. (* means with math coprocessor present.)
Function  Execution Time

Caldat    8.83
Caldat*   0.49
FromJul   1.75
FromJul2  1.00
Yours sincerely,
Adrian J. Hill
P.O. Box 11583
Berkeley, CA 94701
Tel/Fax: 510 559-8835
Thanks for the refinements. The code you sent in is available on the monthly code disk for this issue. — pjp
Dear Dr. Plauger,
Kudos to David Burki for "Date Conversions" in the February 1993 issue. It's the first time I've seen integer-based calendar routines published in C.
A lot of us adapted the code in Jean Meeus' 1985 book Astronomical Formulae for Calculators. (December 1990 Computer Language has a C implementation.) Meeus incorrectly assumes that every date since 15 Oct 1582 must be in the Gregorian Calendar. And he uses floating-point math — no penalty on his HP-67 target, if somewhat costly on a PC. His routines do work, however, unlike some earlier ones.
I'd used them until a couple years ago when, in a program that otherwise used no FP math, my crappy former C package pulled in most of a 16K emulation library just for a little date arithmetic. I wrote new integer routines. Those who don't have such time to invest will appreciate the code Mr. Burki dug up. Plus it's four times faster than Meeus'.
But (and you probably guessed there was one coming), that code, oo-la-la! I'm all for code reuse, but perhaps Messrs. Fliegel and Van Flandern should have been left for the archaeologists. Calendar formulae are not very complicated, and a straightforward C routine is smaller and five times faster than their FORTRAN one.
I wouldn't complain about slowness in code intended to demonstrate an algorithm. But that surely wasn't the goal of the Messrs. Did they yield to that academic temptation to be fashionably opaque? Or am I blaming them unfairly for FORTRAN's limitations?
Slow code should be avoided in the user interface. An important use for calendar routines that Mr. Burki neglects to mention is validating date input. Send a calendar date on a round trip through the two JD functions. If you get back the same month, day, and year, then such a date exists.
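The round-trip check might look something like the sketch below. It uses the integer formulas attributed to Fliegel and Van Flandern for both directions, not the reader's own routines, and all three function names are hypothetical. It assumes a proleptic Gregorian calendar, a long of at least 32 bits, and division that truncates toward zero.

```c
/* Sketch only: validate a Gregorian date by a round trip through the
 * Julian day number. All names here are hypothetical. */

static long date_to_jd(long y, long m, long d)
{
    long adj = (m - 14L) / 12L;     /* -1 for Jan/Feb, else 0 */

    return (1461L * (y + 4800L + adj)) / 4L
         + (367L * (m - 2L - 12L * adj)) / 12L
         - (3L * ((y + 4900L + adj) / 100L)) / 4L
         + d - 32075L;
}

static void jd_to_date(long jd, long *y, long *m, long *d)
{
    long l = jd + 68569L;
    long n = (4L * l) / 146097L;    /* count of 400-year cycles */
    long i, j;

    l -= (146097L * n + 3L) / 4L;
    i = (4000L * (l + 1L)) / 1461001L;
    l = l - (1461L * i) / 4L + 31L;
    j = (80L * l) / 2447L;
    *d = l - (2447L * j) / 80L;
    l = j / 11L;
    *m = j + 2L - 12L * l;
    *y = 100L * (n - 49L) + i + l;
}

/* Returns 1 if year/month/day survives the round trip unchanged. */
int date_is_valid(long year, long month, long day)
{
    long y, m, d;

    if (month < 1L || month > 12L || day < 1L)
        return 0;                   /* keep the formula's inputs sane */
    jd_to_date(date_to_jd(year, month, day), &y, &m, &d);
    return y == year && m == month && d == day;
}
```

A date such as 29 February 1993 converts to the same day number as 1 March 1993, so it comes back as a different month and day and is rejected.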
I imagine you're awash in home-grown calendar routines about now. Let me throw in my two bytes' worth too. Feel free to use these old routines for publication, or CUG library inclusion. Or landfill enhancement.
New topic. You wrote last year that readers should consider code in CUJ as proprietary unless the author says otherwise. I've said otherwise in my calendar module, but it raises an important question.
Am I alone in having always assumed the opposite — proprietary if and only if explicitly declared so? How about a standard intro in all your printed source? The author could spell out how she or he wants the code used: Read-only, noncommercial-use, any-use, or don't-know (when an author adapts someone else's published routine).
The Lawyers have learned that we programmers are out there. If we're all reeeal careful, we might keep them at bay for a couple more years.
With compliments for the best programming magazine,
Bob Twilling
P.O. Box 6129
Bozeman, Montana 59771
Well, there you have it. People want code that is a) tighter and faster, as well as b) looser and more readable. So do I.
As for code ownership, you should bear in mind two fundamental principles. First is that copyright ownership vests with the original author automatically. Your company probably claims all your stuff under your (possibly implicit) employer/employee contract. CUJ purchases limited rights to the stuff we publish. Beyond that, it's up to the original author to declare how freely the code can be used. I agree that it's useful for authors to make explicit statements on the topic. Many, in fact, do.
The second driving principle is, the more money's at stake, the hungrier the lawyers get. Use a few lines of my code in a personal project and I'll probably never know. If I find out, I may well be advised to care only if I found out you profited significantly from my work. (But then again, I might get testy anyway — you never know.) The typical software ripoff that I've seen more likely involves hundreds or thousands of lines of code, essentially unmodified. The case is usually so open and shut that a competent lawyer can force a settlement well short of going to court.
Contrast that with the Hollywood scene. Producers are regularly sued over the ownership of concepts. These aren't even the detailed expressions that copyright was meant to protect. Yet serious money regularly changes hands over such squabbles. Why? Because even more serious money is at stake with practically every movie made these days.
What makes software interesting is that it more and more often also involves serious money. On top of that, it's full of concepts that are hard for a judge to understand, let alone a jury. Moreover, it can be copied without loss of quality, unlike older protected modes of expression. The moral: err on the side of safety when you use other people's software, to whatever degree.
Yes, the old days had a delightful innocence and simplicity. On the other hand, in the old days I couldn't have earned enough as a programmer to pay cash for my Saab. — pjp
Dear Mr. Plauger:
I have tried to teach myself the C language. With the help of some great books by SAMS and the Waite Group*, I have had some success. Now that I can put together code that does pretty much what I wish, I frequently seek more in-depth answers from these introductory texts. Unfortunately, both texts were written prior to adoption of the C Standard. Thus, the answers/confusion that I derive from these texts may have already been clarified. I have two queries.
The first pertains to arrays and embraces the distinction between initialization and assignment. External and static arrays can be initialized upon declaration. Automatic and register arrays cannot. However, the latter two classes of arrays can be assigned values. I have used assignments during declaration of the automatic array type without obvious problems. Isn't this the same thing as initialization?
Perhaps initialization only pertains to loading some arbitrary value into each array element, as opposed to the elements containing random memory garbage, when the array is declared, if an immediate assignment is not made. This entire question may have been rendered moot by the standard. Please define "assignment" and "initialization," as they apply to these array scopes.
The second problem involves using void pointers. I would like to write generic functions that accept pointers of any type. For example, a single "swap" function might be used to exchange the values of two pointers, of any type. Void pointers would seem a reasonable way to do this, but this just doesn't work:

swap(void *x, void *y)
{
    void *t;
    *t=*x;    OR    t=x;
    *x=*y;          x=y;
    *y=*t;          y=t;
}
It won't accept any pointer type except void. I can get the same effect by declaring/defining swap as having two arguments, each of which is an unsigned int pointer. In this latter instance the compiler whines about a "suspicious pointer conversion," but the program happily swaps chars, floats, ints, arrays of various types, etc. when the function swap is given arguments that are addresses of pointers.
I'm confused, a not unusual condition in my case. My experience may well be colored by my choice of compilers. I use Borland's Turbo C, version 2.0. All of these questions have probably been answered several times in various august publications, but I have missed them, somehow. Sorry for the dopey questions, but I really have no other source for answers. There aren't a lot of folks, in my immediate vicinity, who program, much less in C.
Yours very truly,
R. F. Anthracite, M.D.
192 Grand View Ave.
Valparaiso, Florida 32580-1541
* Waite M, Prata S, Martin D. C Primer Plus. ISBN# 0-672-22090-3.
Prata S. Advanced C Primer++. ISBN# 0-672-22486-0.
You can stop apologizing. You've just tripped across two of the thornier aspects of C, standard or otherwise. The philosophical aspects go on forever, but I'll try to be brief and to the point.
As for arrays, you cannot assign them. Never could, still can't. They're intentionally second-class objects in C, for a variety of reasons. What's confusing is that the notation for writing an initializer in a declaration uses the equal sign =, just like the assignment operator.
You could always initialize external and static arrays. What the C Standard added was the ability to initialize automatic arrays as well. (The very existence of register arrays is problematical.) The initializer for an automatic array can still contain only constant expressions — the kind that the compiler can evaluate prior to program startup. But here is the one place in the C language where you can cause an array to be copied at runtime (other than as part of a structure or union).
You touch on a particularly profound issue here. C++ found it necessary to distinguish clearly between initialization and assignment. C leaves the distinction murky. What the C Standard added for array initialization makes the issue even murkier.
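A short sketch may make the distinction concrete. The function and type names are hypothetical. The first function initializes automatic arrays and then copies one with memcpy, since a plain assignment between arrays is not accepted. The second shows the structure exception mentioned above.

```c
#include <string.h>

struct wrapped { int a[4]; };       /* hypothetical wrapper type */

/* Sketch only: initialize automatic arrays, then copy one into the other. */
int array_init_demo(void)
{
    int src[4] = { 1, 2, 3, 4 };    /* an initializer, not an assignment */
    int dst[4] = { 0 };             /* unlisted elements become zero */
    int i, sum = 0;

    /* dst = src;   would be rejected: arrays cannot be assigned */
    memcpy(dst, src, sizeof dst);   /* copy the bytes explicitly instead */

    for (i = 0; i < 4; ++i)
        sum += dst[i];
    return sum;                     /* 1 + 2 + 3 + 4 */
}

/* Sketch only: an array inside a structure is copied by structure assignment. */
int wrapped_copy_demo(void)
{
    struct wrapped w1 = { { 5, 6, 7, 8 } };
    struct wrapped w2;

    w2 = w1;                        /* copies all four elements */
    return w2.a[3];
}
```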
Now let's look at your attempt at a generic swap function. Yes, you can convert any object pointer to a void pointer — that's why we added them to Standard C. But a swap function has to know, by some separate means, what object type you intend to manipulate. You then convert the void pointers back to the appropriate type of pointers to reach the objects. Or you use a function such as memmove to copy the appropriate number of bytes, if you know that number.
The various versions you describe for your swap function fail in various ways. Trying to dereference a void pointer causes a compiler diagnostic, as you observed. Just swapping the pointers themselves is acceptable, but accomplishes nothing useful. Declaring the arguments as pointers to some fixed type makes the function safe only for swapping objects of that type.
The cleanest C version needs a third argument, as I hinted above. Provide the size of the objects to be swapped and void pointers to the objects. Then swap them a byte at a time. You can write a macro version that expands to inline code. It can know the object type well enough, but has trouble declaring the temporary you need in a convenient place.
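One way the three-argument version might be written is sketched here. The name swap_bytes is hypothetical, and the sketch assumes the two objects do not overlap and are both exactly size bytes long.

```c
#include <stddef.h>

/* Sketch only: exchange two non-overlapping objects of `size` bytes each,
 * one byte at a time. The function name is hypothetical. */
void swap_bytes(void *x, void *y, size_t size)
{
    unsigned char *p = (unsigned char *)x;
    unsigned char *q = (unsigned char *)y;

    while (size-- > 0) {
        unsigned char t = *p;       /* one-byte temporary */
        *p++ = *q;
        *q++ = t;
    }
}
```

A caller passes the addresses of the objects and their size, for example swap_bytes(&a, &b, sizeof a) for two doubles a and b. The compiler cannot check that the size matches the objects, so that burden falls on the caller.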
Once again, C++ has a more graceful answer. You can define a swap template that expands to just the code you need. It can even expand to inline code and still declare the temporary with suitable locality. — pjp
Dear Dr. Plauger:
For the last year I have been trying to teach myself C using Quick C. Even though I am somewhat of a novice programmer with limited experience in FORTRAN and SAS, I have learned a great deal.
My goal is to learn to write custom database applications. However, I don't want to reinvent the wheel. I would like to take advantage of products for the C program developer, such as a library source of functions and/or a code generator for designing user interfaces, including menu and window systems, data entry and validation, file management utilities, and graphics libraries.
I believe that such libraries could be a good learning tool. I have been working with Pinson's Designing Screen Interfaces in C, which I received through the C Users Group, and it has been very good. But I need more than the topics he covers.
There seem to be numerous products available. I have requested a few demo disks from providers, but what I really need is some sound advice from professional developers. I would also like to get my hands on some comprehensive review articles on the above topics.
I thank you in advance for any of your recommendations.
Sincerely,
John Wood
ENT Services, Inc.
5621 29th St. Cir. E.
Bradenton, FL 34203
Your instincts are good. Keep reading, both articles and ads. Keep looking at demo disks. CUJ doesn't do product reviews, but we do publish user reports on a regular basis. And maybe some of your fellow readers will pass on their personal experiences to you. — pjp
Sirs,
Please find enclosed a copy of my reaction to Mr. S. Sanders on his letter published in C User's Journal, March 1993, pages 124-126.
Yours sincerely,
HAHN INFORMATICA
Ir. H. Hahn
Dear Mr. Sanders,
This is a reaction to your above mentioned letter in C User's Journal.
Some time ago, I ran into a similar problem, though in a quite different context. I do have a theory of how the things you describe happen, but as I have no EGA or VGA available, I am not in a position to check my theory.
The video interrupt (INT 10H) handler contains several subfunctions. Some of these handle characters while certain others handle strings (such as function 13H). On my Hercules card (a clone, actually), I found that function 13H scans the string and then generates interrupts 10H for each character scanned. In case an LF (0AH) is scanned (and the cursor is at the bottom line), this will be INT 10H function 06H ("scroll up").
It seems that on (some or all?) EGAs and/or VGAs INT 10H function 06H does not scroll directly but instead calls a subroutine which does the actual scrolling. Function 13H (the string scanner) may call this subroutine immediately when it scans an LF, instead of first generating an INT 10H function 06H. This may explain why your INT 10H function 06H is not called.
The only reason I can imagine for them to do it this way is a matter of speed. Interrupt handlers usually push all registers, and execution of a software interrupt involves a pushf (push flags) followed by a far call (equivalent to two pushes). This would be repeated for every character in the string. However, the subroutine needs only push/pop those registers it actually modifies, and it can be called with a near call. This may save many stack operations, thus speeding up string function 13H.
I would suggest that you also intercept this string display function. Scan the string until (but not including) the next LF and chain to the old handler for that part of the string. Then handle the LF your own way and repeat this process until everything is done. (Note: When intercepting INT 10H function 13H, be sure to handle the 'write mode' flag (register AL) and the attribute correctly, or strange things may happen.)
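The scanning loop suggested here could be sketched in portable C as follows. This is only an outline of the splitting logic, not an interrupt handler: the callback types and all function names are hypothetical stand-ins for the old function 13H handler and for the caller's own LF handling, and the sketch ignores the write mode and attribute details mentioned in the note.

```c
#include <stddef.h>

/* Hypothetical callbacks: one stands in for chaining to the old string
 * handler, the other for the caller's own line-feed handling. */
typedef void (*chunk_fn)(const char *text, size_t len, void *ctx);
typedef void (*lf_fn)(void *ctx);

/* Length of the run of characters before the next LF (0x0A), or len. */
size_t run_before_lf(const char *s, size_t len)
{
    size_t n = 0;

    while (n < len && s[n] != 0x0A)
        ++n;
    return n;
}

/* Sketch only: pass each LF-free run to write_chunk and each LF to
 * handle_lf, in the order they appear in the string. */
void write_split_at_lf(const char *s, size_t len,
                       chunk_fn write_chunk, lf_fn handle_lf, void *ctx)
{
    while (len > 0) {
        size_t n = run_before_lf(s, len);

        if (n > 0)
            write_chunk(s, n, ctx); /* chain to the old handler */
        s += n;
        len -= n;
        if (len > 0) {              /* stopped on an LF, not at the end */
            handle_lf(ctx);
            ++s;
            --len;
        }
    }
}

/* Demo callbacks: ctx points at int[2] = { characters seen, LFs seen }. */
void count_chunk(const char *text, size_t len, void *ctx)
{
    (void)text;
    ((int *)ctx)[0] += (int)len;
}

void count_lf(void *ctx)
{
    ((int *)ctx)[1] += 1;
}
```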
As I said, I am not in a position to check this method. But I would very much appreciate hearing from you whether or not this really solves your problem.
Yours sincerely,
HAHN INFORMATICA
Ir. H. Hahn
And another reader replies to S. Sanders:
Dear Mr. Sanders:
After reading your letter in The C Users Journal, March 1993, I took to researching your problem in PC Interrupts, by Ralf Brown & Jim Kyle (Addison-Wesley Publishing Company, 1991).
According to Brown and Kyle, INT 42h is the replacement interrupt for INT 10h for default video services when INT 10h is being used by the video card's ROM, which sometimes occurs on EGA or VGA systems (5-72).
This probably would be the reason for your resulting errors on your AST Premium with VGA and your Packard Bell with EGA.
The only conflict I could see with using INT 42h was on a Heath-Zenith Z100, where INT 42h is the hardware interrupt for the timer (2-9). Otherwise you should be able to use INT 42h the same way you are using INT 10h.
The aforementioned book can be helpful in other interrupt problems you may run into, including vendor-specific hardware and software interrupts. It also has references for DOS 5.0; the cost is only $33.
I hope this information will help you.
Sincerely,
Steve Ferrell
2324 West Tenth Street
Duluth, MN 55806-1237
Thanks to both of you. — pjp
Dear Editor,
As an ex-systems programmer I keep a keen interest in the developments in the programming world. This has to be kept at a hobby level since my occupation nowadays is in the banking world. Sometimes, however, I get an urge to indulge (well, it feels that way to me) in low-level programming on the bare bones of my PC. My language of choice nowadays is C++ with a bit of assembly added where necessary.
I have included an implementation of the GetPrivateProfileString and WritePrivateProfileString from Windows 3.x [available on the monthly code disk — pjp]. Use it, abuse it or disregard it. My only requirement is that I do not take any responsibility whatsoever for the use of the program, nor for the actual usability of it, nor for any compliance to any standard whatsoever. The program is hereby donated to the public domain.
Best regards
Gunnar Hellquist
Kammakargatan 52
S-111 60 Stockholm
Sweden
Thanks to you, too. — pjp
Dear Mr. Plauger,
I am a deaf person, working as a system programmer in C++. I was programming in C for more than three years. I thoroughly enjoyed programming in this language, which is being improved to perform any kind of task used in computers today. But unfortunately the editing capabilities to input data from the keyboard are still poor, using the scanf and gets functions. Is there any enhanced function that allows me to validate input immediately and alert the operator on mistakes during data entry? That is, to be able to use the arrow keys, to delete a character, and to insert a character, etc.
I tried to solve this problem by writing a separate function to distinguish between the extended code keys and text keys using the scanf function through a long loop, but it seems inflexible and long. Can you kindly provide me with any other means of solving this problem? I am sure most readers and programmers would be interested to see a solution for this problem in The C Users Journal. I hope soon. Thank you.
Yours truly,
Saleh M. Thenyan
P.O. Box 61241
Riyadh, 11565
Saudi Arabia
You describe a common problem, for which there seems to be no universally good common solution. CUJ has, from time to time, published articles describing various input editors. I'm sure the C Users Group can provide even more. Various commercial libraries also offer interactive input editors. But as you learned from trying to write the code yourself, it is not trivial to do well. All I can suggest is that you look at some of the sources I mentioned to see if you can save yourself some tedious work. — pjp
Dear C Users Journal:
While I enjoy Plauger's articles, I do not like the changes in the magazine since he took over.
Last year I cancelled Dr. Dobb's for turning their magazine into an outlet for frustrated ACM Journal contributors. I am a real-life programmer, but I do not have the patience or time for esoteric academic studies in computer science. I read magazines like Dr. Dobb's and The C Users Journal for real-world solutions, and just for fun — an escape from the five other dry trade journals I must read weekly or bi-weekly.
You seem to be trying to turn The C Users Journal into the same academic elitist rag that Dr. Dobb's has become. I implore you to come to your senses. There is a much larger market for a magazine that's a fun escape for programmers. Remember the old Dr. Dobb's? The old Byte? Creative Computing? Micro-80 Journal? They're all gone. No more hobbyist publications — it's all business now.
Give us a break. Don't take the last holdout down the drain!
Let me know if you change your format. I'll take another look. In the meantime, I'm sorry to say — goodbye.
Jon Baker
8511 Shadywood Dr
Tulsa, OK 74131
Well if Dr. Dobb's Journal of Computers, Calisthenics, and Orthodontia (running light without overbyte) is an elitist rag then so are we. All those other mags you listed changed or went out of business because they couldn't sell ad space (or even enough subscriptions) in their older forms. I too miss some of the lightness of days gone by, but I didn't set out to murder it. It just went away as the field matured. Goodbye. — pjp
P.J. Plauger:
I would just like to say that I concur wholeheartedly with your recent editorial in The C User's Journal. I, too, have suffered through this never-ending New England winter and have also been overwhelmed by the amount of software technology which seems to be out there. As hardware advancements make possible faster and bigger software applications, the growing layers and complexity can often be mind numbing. (Not to mention the documentation!) I believe this is not just a problem in the computer-science software industry but a problem of the 20th century in general. We are surrounded by technology which we depend upon but know little about. The electricity which is powering my terminal, or the gas which heats the house, are all out of my technological grasp if something were to malfunction.
The same can be said for software. When you have many layers of complex software running and something goes amiss, your ability to comprehend and fix the problem depends on your specialization. It is almost impossible to be an expert from the hardware level to the application software development level. You have probably heard of James Burke, the writer. He has an excellent book called Connections, which touches upon technological questions such as these.
I enjoy your magazine immensely, and keep up the good work.
Thanks,
Thomas Lohman
Research Specialist/Software Engineer
MIT Building 36 - Room 297
Cambridge, MA. 02139
e-mail: thomasl@mtl.mit.edu
Phone: 617-258-6485
Thank you. — pjp
Dear Bill,
They Shall Not Parse!
Thoroughly mull over each C User issue and seldom fail to find some life-changing nugget. Your reports on the ISO/ANSI C peregrinations are especially appreciated. The semantic complexities of this relatively simple artificial language serve as a damper to Russell Suereth's naive optimism ("A Natural Language Processor," CUJ, April, 1993). His article reminds me of the giddy 1950s when we thought that NLP could be "done" by extracting words and looking them up in a dictionary file. Alas, even when the dictionary was replaced by an increasingly complex thesaurus, the dream flickered and died. Russell talks about tagging words in his dictionary as noun, verb, et al. This is the AI version of "belling-the-cat"! It assumes that NL "words" carry predeclared data types. If only! The polysemic truth is that even the ubiquitous street-walking, hill-climbing "Bill" can suddenly show up as a verb, impersonal noun (or, horror, as a second William appearing in the story) leading to the big, intractable crux known as "contextual disambiguation." The equally naive tagging of agents and actions as "human," "inanimate," etc., hits the same brick wall. Compare Russell's "Only humans can drive" with "The car drove me mad."
Russell's answer to extending his Sesame Street examples to "reading a book" is reminiscent of the earlier NLP hopefuls: all we need is more syntactic parsing and bigger dictionaries. AI history is full of similar "exciting potential extrapolations." Chomsky has compared this "convergent optimism" with the idea that by adding 6" to the long-jump record, Man is well on the way to flying. Current trends in NLP involve the creation of huge databases of human knowledge just to cope with narrative "comprehension" in very limited domains. The results, like Russell's, may be useful for, say, DB queries with small vocabularies, but they hardly merit the "natural" prefix in NLP.
Stan Kelly-Bootle
Author: "The Devil's DP Dictionary"
Contributing Editor, UNIX Review
PS: My choice of "Sesame Street" is a tad misplaced: that show proves that even at the lowest grades, children's language exhibits complex patterns of humor and irony far beyond "if (name == subject)" analysis.
PPS: If Russell does succeed and produces a formal ANSI specification of English, will you handle the Defect Reports?
Stan Kelly-Bootle has long written his own exotic brand of English for Computer Language, UNIX Review, and other publications in our field. He is always worth trying to understand. — pjp
Dear Mr. Plauger,
In the July 1989 issue of The C Users Journal, you published some routines for modifying the MS-DOS master environment. I found those routines very useful. With the advent of Desqview and Windows I had to modify them a little so they only modified the parent shell's environment rather than the master.
It came to my attention recently that the original master environment routines were no longer working under DOS 5.0. I dug into it a little, and believe that the following change to the m_findenv function will fix the problem.
Change:

SHELLpsp = MK_FP(FP_SEG(CONFIGmcb) + CONFIGmcb->len + 2, 0);
To:

SHELLpsp = MK_FP(CONFIGmcb->owner_PSP, 0);
/* locate the PSP for the initial  */
/* command.com which should be the */
/* first one where the par_seg     */
/* points back to the same PSP     */
while (SHELLpsp->par_seg != FP_SEG(SHELLpsp))
    {
    CONFIGmcb = MK_FP(FP_SEG(CONFIGmcb) + CONFIGmcb->len + 1, 0);
    SHELLpsp = MK_FP(CONFIGmcb->owner_PSP, 0);
    }
Making this change causes the function to locate the first MCB with a PSP that points back to itself. I have only tested this on DOS 3.31 and 5.0, but it works fine there.
I really enjoy your magazine and can't count the times that you published a program that was exactly what I needed at the time.
Thanks,
Don Brown
don%bumpo@motoman.pnl.gov
Thanks. — pjp