June 1995/We Have Mail

Departments

We Have Mail

Dear C/C++ User's Journal,
Just a quick addition to the article by Philip J. Erdelsky, Ph.D., on "Portable Byte Ordering in C++" (CUJ, January, 1995). As a Macintosh owner and programmer, I have needed to read and write DOS binary files. I use the functions shown in Listing 1 frequently.
I am pretty sure these functions are portable and more efficient than performing bit shifting.
Phil S. Bolduc
Kamloops, BC Canada
They're reasonably portable, and they do solve the specific problem of converting between PC and Mac integer representations. — pjp
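Listing 1 is not reproduced in this text, so the following is only a minimal sketch of the general idea: reversing the bytes of a value in place so that data written in PC (little-endian) order can be used on a Mac (big-endian), or the reverse. The name swap_bytes is hypothetical and is not taken from the letter or from the original article.

```c
#include <stddef.h>

/* Hypothetical helper: reverse the bytes of an object in place.
   Reversing converts between little-endian and big-endian layouts
   of the same integer. */
void swap_bytes(void *obj, size_t size)
{
    unsigned char *p = (unsigned char *)obj;
    size_t i, j;

    if (size < 2)
        return;                 /* nothing to swap */

    for (i = 0, j = size - 1; i < j; ++i, --j) {
        unsigned char tmp = p[i];
        p[i] = p[j];
        p[j] = tmp;
    }
}
```

A caller would read the raw bytes of a field from the file and call swap_bytes(&field, sizeof field) only when the file's byte order differs from the host's. The sketch assumes the field has the same size on both systems.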
Dear Mr. Plauger:
Here's a quick question. Given all the systems capable of compiling C code (for convenience, let's restrict the domain to PCs in the continental US), what percentage are loaded with the software to compile C++ code? Also, at what rate are the C++ capable systems growing? (I guess that makes two questions.)
My concern is of the "shareability" of C++ vs. that of C. Are there a sufficient number of systems to warrant a programmer to write most code intended for public distribution in C++? (Well, perhaps three questions. However, I realize the third is rather ambiguous and dependent on the circumstances.)
As long as I'm writing, kudos for the magazine. Keep up the good work.
Thomas Phllips, CCP
72713.2100@compuserve.com
These days, I suspect the majority of PCs have a C++ compiler if they have a C compiler. It's much easier for most vendors to package two such complementary products together than to sell them separately. Shareability is more than a matter of widespread access to compilers, however. Standard C is a highly stable language, while C++ is still changing rapidly as a result of the invention occurring in the standardization process. If you want excitement and innovation, C++ is where it's at. If you want portability and stability, stick with Standard C for now. — pjp
To whom it may concern:
I have been engaged in a heated argument with IBM concerning the XL C v1.3 compiler supplied with their AIX v3.2.5 operating system. The problem stems from their very narrow interpretation of the ANSI Standard concerning the types char, signed char, and unsigned char, and more importantly pointers to these objects. I provide the following code to demonstrate the problem when strict ANSI was turned on.
#include <string.h>
int main( void ) {

   char buf[25];
   unsigned char ubuf[25];

   /* line below is fine */
   strcpy( buf, "Hello World" );
   /* line below: compiler error */
   strcpy( ubuf, "Hello World" );
   return 0;
}
The prototype for strcpy in <string.h> is:
char *strcpy( char *, const char * );
When strict ANSI is turned on, the unsigned character array is considered to be a distinctly different type than just plain char. I then consulted the C compiler manual and found a compiler option, -qchars=unsigned, which led me to believe this would alter the behavior of the compiler to treat plain char as unsigned char. Yet, this did not work either.
I contacted IBM and pointed out this seemed to be a bug (for reasons I will later state). This was their response:
"In XL C v1.2.1, a variable declared as type char was considered to be the same as unsigned char or signed char. XL C v1.3 recognizes char, unsigned char, and signed char as distinct types. This is in conformance with the ANSI standard. This also means a pointer to any of these types will not be compatible with pointers of either of the other two types."
I then pointed out my use of the -qchars=unsigned compiler option. As strange as things seem up to this point, their reply is incredible:
"The XL C v1.3 -qchars=unsigned flag is working correctly. (I am assuming that your customer is using the XL C v1.3 compiler.) By using the -qchars=unsigned flag, the char is now considered to be a type of unsigned char which is distinct from a true unsigned char."
I have never heard of two types of unsigned char before! I began consulting my sources concerning ANSI/ISO C (I have several). In Plauger and Brodie's ANSI and ISO Standard C, I found that type char can be the same as either unsigned char or signed char (page 45). Since IBM said they were following the ANSI Standard I bought a copy. (Cost me $40.00 too!) I have been poring over the text of the Standard, and although I cannot find a statement as definitive as the one in Plauger/Brodie, I also cannot find a statement which declares the char types as distinctly different. In fact, the standard strongly implies the char types are compatible and thus pointers to these objects are compatible.
I suppose I am seeking some vindication from someone who is more of an authority about Standard C than me. Normally I would not care, but our shop standards state that all production code will pass the ANSI compiler without warnings or errors. Also, we are using a pre-compiler which generates unsigned character arrays. This has forced us to back away from the ANSI compiler. This is a situation where it seems the C compiler has evolved into a Pascal compiler (Shudder...).
Anyone want to comment?
Thanks,
Stan Milam
Midlothian, TX
milam@metronet.com
IBM is quite correct, although the wording of their response leaves something to be desired. So too does the C Standard in this area, and I guess Plauger & Brodie as well, since you were able to misread it. The trick is to distinguish between representation and type. The three character types are indeed distinct, but they have only two different representations. Remember that int can have the same representation as either short or long but it is still a distinct type. It's much the same with the signedness of the representation of plain char. What adds to the confusion, as IBM points out, is that earlier compilers often failed to make all three character types distinct. Some C++ compilers still fail in this area. — pjp
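One common way to live with three distinct character types is to make the conversion explicit at the call. The following is a minimal sketch, not something proposed in the letter or the reply. The wrapper name ustrcpy is hypothetical. The cast is well defined because any object may be accessed through a pointer to a character type.

```c
#include <string.h>

/* Hypothetical wrapper: copy a string into an unsigned char buffer.
   The explicit cast tells a strict compiler that converting between
   pointers to distinct character types is intended. */
unsigned char *ustrcpy(unsigned char *dst, const char *src)
{
    strcpy((char *)dst, src);
    return dst;
}
```

With a wrapper like this, ustrcpy(ubuf, "Hello World") should compile cleanly in strict mode, since the only pointer conversion is the explicit cast inside the wrapper.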
To Stan Milam,
Sorry for troubling you again, but I'm fairly new to C and haven't had much experience yet. I finally found your routines for converting dates (CUJ, MONTH, YEAR) and seem to be having a problem with the date 31/12/2000. This converts fine using mkdate and returns a value of 730485. If I pass this value back into localdate I receive a date back of 00/00/2001 with -1 in tm_yday.
I think the offending line in localdate is the following
year = (unsigned) ((day * 400L) / 146097L + 1L);
This returns 2001 into the year field.
Hope you can help.
jason@easysoft.com
Stan Milam replies:
Yes, you have found a bug in my code :( ! I have only had time to verify the bug, and it is as you say: localdate is computing the wrong year for Dec 31, 2000. I believe this has to do with the fact that 2000 is a leap year where century years are not normally leap years. I am already into the process of finding a fix and when I have one I will mail it to you. By the way, I am sending a courtesy copy of this letter to The C/C++ Users Journal, and I will have to notify them of the fix too. Of course, I will give credit to you for identifying the bug. Thanks for your use of the date routines and for your courtesy.
Regards,
Stan Milam
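The numbers in the letter are consistent with a day count in which day 1 is January 1 of year 1 on the Gregorian calendar, since exactly 730485 days fall in years 1 through 2000. Under that assumption, the quoted formula lands exactly on a 400-year boundary (730485 * 400 / 146097 is exactly 2000) and then adds 1. The following is a minimal sketch of one possible repair, not Stan Milam's fix: estimate the year, then correct the estimate against an exact count. The function names are hypothetical.

```c
/* Days in all years before year y (y >= 1), Gregorian rules:
   every 4th year is a leap year, except century years,
   unless the century year is divisible by 400. */
long days_before_year(long y)
{
    long p = y - 1;
    return p * 365L + p / 4L - p / 100L + p / 400L;
}

/* Hypothetical replacement for the year computation.
   Assumes day 1 is January 1 of year 1. */
long year_from_day_number(long day)
{
    /* Estimate, using day - 1 so an exact boundary stays in its year. */
    long year = ((day - 1L) * 400L) / 146097L + 1L;

    /* The estimate can be off by one, so adjust it. */
    while (days_before_year(year) >= day)
        --year;
    while (days_before_year(year + 1L) < day)
        ++year;
    return year;
}
```

The day within the year is then day - days_before_year(year), which gives 366 for day number 730485, the last day of leap year 2000.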
Editor:
I have been a reader of The C/C++ Users Journal for about four years now and I am learning more and more about C, even though I do not have much time to do development in it. I also run a BBS specifically for C Programmers (Numan's — A Programming Forum (404) 498-7905).
In that light I would like to put the C Standard on-line for people to download and access. Is this available via ftp? If so what is the site?
Thank you for a wonderful magazine. Keep up the good work.
Michael Rowe
Team OS/2
Soon to be OS/2 Developer
ANSI and ISO are still struggling with the issues surrounding on-line availability of standards. Until they work out the proper checks and balances, we are not at liberty to make the C Standard available on the net. — pjp
Dear Bill,
I was just reading Chuck Allison's article in the February 1995 issue (Vol. 13, No. 2) and have a few comments to make.
While the code in Listing 1 may well work for WORD equal to unsigned int, the NBITS macro is based on the premise that the number of bits in an object of some type equals CHAR_BIT * sizeof(object). This is not required by Standard C, nor is it promised.
Specifically, not all the bits allocated to an object need be used to represent that object's value. For example, some implementations on IEEE-based machines map long double onto the 80-bit extended representation. Borland's compiler does this and says that sizeof(long double) = 10, as you might expect. However, Metaware's compiler says the size is 12 because it pads out the object to a size that is a multiple of 4, for efficiency reasons. Similarly, Cray's compiler says sizeof(short) is 8 even though only 32 bits are used (24 on older systems).
Chuck correctly states that size_t is either unsigned int or unsigned long. (By the way, he doesn't say, but offsetof also has type size_t.) However, having established that size_t is an abstract type, he goes on to promise printf that size_t is int, and that is a bad idea. (In his Listing 1, * is given an expression of type size_t and in Listing 2 - Listing 4, a conversion specifier %d is used. Both expect an int argument.) This is a very common error. Unfortunately, on many systems, this "little white lie" goes undetected. However, it's a bug that just might rise up to grab you someday. (Consider a 16-bit machine that supports objects larger than 64KB. ints will likely be 16 bits and size_t will likely be 32 bits.)
Here's the solution to the %d problem:
printf("%lu", (unsigned long)sizeof(x));
size_t is either unsigned int or unsigned long, so the cast is always safe. Regardless of which type size_t has, once printf is called, the argument being passed will be of the type promised.
Note, however, this won't work for the * precision case, which expects an int and there's nothing you can do to change that. To make this work you can use the following:
printf("%*d", (int)sizeof(x), i);
It's very unlikely you'll want a precision greater than INT_MAX (which is at least 32767)!
These solutions might not be pretty but the resulting code is portable. And if you really care about portability, you have to care about this kind of detail.
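Both casts can be packaged as small helpers. This is only a minimal sketch. The names format_size and format_padded are hypothetical, and sprintf is used in place of printf so the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t value: cast to unsigned long to match %lu. */
int format_size(char *out, size_t n)
{
    return sprintf(out, "%lu", (unsigned long)n);
}

/* Print value in a field whose width is held in a size_t.
   The '*' consumes an int argument, so cast the width to int. */
int format_padded(char *out, size_t width, int value)
{
    return sprintf(out, "%*d", (int)width, value);
}
```

The caller must supply a buffer large enough for the result. C99 later added a z length modifier (as in %zu) for size_t arguments, but the casts above also work with older libraries.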
Along these lines, Listing 1 could be enhanced slightly to correctly handle WORD being any integer type, including unsigned long. I'll leave the implementation as a reader exercise. (I'll send a copy of my new Standard C reference card to the first reader who sends me a complete and elegant solution.)
Rex Jaeschke
Chair ANSI C Committee
rex@aussie.com
The C Standards committees have become more sensitive to the issue of unused bits lately, thanks to a finicky Defect Report or two. Rex is, of course, correct. But you can blame me for letting the slight inaccuracy get into print. I chose not to have Chuck Allison add such subtleties to an otherwise clear explanation. — pjp
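One way to avoid relying on CHAR_BIT * sizeof is to count the bits that actually take part in the value. The following is a minimal sketch for two unsigned types. It is not Chuck Allison's listing and not a solution to Rex's exercise, and the function names are hypothetical.

```c
#include <limits.h>

/* Count value bits in unsigned int by shifting its maximum
   value right until nothing is left. Padding bits, if any,
   are never counted. */
unsigned value_bits_uint(void)
{
    unsigned max = UINT_MAX;
    unsigned bits = 0;

    while (max != 0) {
        ++bits;
        max >>= 1;
    }
    return bits;
}

/* The same count for unsigned long. */
unsigned value_bits_ulong(void)
{
    unsigned long max = ULONG_MAX;
    unsigned bits = 0;

    while (max != 0) {
        ++bits;
        max >>= 1;
    }
    return bits;
}
```

On most implementations the results equal CHAR_BIT * sizeof for these types. The two can differ only when the object representation contains unused bits.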
Editor:
Quick note to express thanks and kudos for the recent seminar. I managed to hit up management at just the right time — that, and the fact that I live in Topeka (so no travel charges) allowed me to attend. I was impressed by the level of information presented, and by the caliber of speakers. Good Job!
On another note: Additional kudos for your columns in CUJ. I read them each month, whether I can follow them or not, just for the technical info I can soak up.
You and your cohorts at CUJ/R&D/whatever are really filling a need, and doing it well. Keep it up!
Kurt Duncan
Wolf Creek Nuclear
kudunca@wcnoc.com
Thanks. — pjp
Editor:
While programming an autorange function for drawing graphs, I was reminded of some basic computer science that we all try to ignore. (See Listing 2.)
Joe McCarty
Thanks — pjp
Editor:
This is Goldstar Software Ltd. in Korea. I'm working on numeric calculation using C. My problem looks like this:
long double a = 0.123456789012345;
double b = a;
Real memory:
b => 0.12345678901234999...
Is this a bug in the C Compiler? I hope to know about this problem as soon as possible. Thanks.
jazz@star.gsw.re.kr
or ysw@oass.gsw.re.kr
No, it's a result of the finite precision of floating-point representations. See previous letter. — pjp
Mr. Plauger,
I was sorry to learn that the March issue of CUJ contains Ken Pugh's last Q&A column. I'll miss it. My route through your magazine for the past several years has been your Editor's Forum first, the Q&A column and "We Have Mail" second and third in random order, followed by your Standard C/C++, Chuck Allison's Code Capsules, and Dan Saks's C++ column in no particular order. I always check the new CUG releases and I usually find time to read about three of the articles. You have a fine magazine and I'll be looking forward to the replacement for the Q&A column.
Guess I better send in my subscription fee for another couple of years.
Cheers,
Harry
hphilips@epix.net
Glad you're staying around. — pjp
PJ,
I'm wondering why it is considered acceptable and "standard" to generate C++ header files with an extension of .h, like Standard C header files. I can understand if the header file is readable by either compiler but not if it is a C++ header file only.
Also, I know that other extensions are acceptable, as in .H, .h++, .hpp, .hxx, and .HH — but they are rarely used. Not using these extensions can be confusing in determining whether a header file can be included in a C program. It requires actually looking in the header file.
Shouldn't this be mentioned as a convention in the ANSI Standard for C++? Not a fixed rule or anything, just identify it as a convention which will usually get people to start using it more regularly. Currently, we have GUI builders that generate C++ code with header files that have the extension of .h, which means that we are proliferating a bad thing.
Thanks for your interest and time. Also, thanks for all the effort you put into making the CUJ such an interesting and informative magazine.
Chris Carlson
Well... actually, the draft C++ Standard has chosen to add yet another convention. Absent any widespread agreement on how to name C++ headers, the Library Working Group has chosen to give all standard C++ headers names with no suffix. I don't yet know a good style rule to recommend for user-defined C++ headers. — pjp
Dear Editor,
I have a question about the offsetof macro. Is this a run-time or compile-time macro, like the sizeof function? The only documentation I can find on it seems to imply that it can be used at compile time to get the byte offset of a part of a structure. I would appreciate any description you can give.
Thank you for providing us with such a fine magazine.
Robert Williams
RobertW809@aol.com
offsetof is a macro that yields a constant integer expression, as does the sizeof operator. Hence, you can use both to initialize static data, determine array sizes, etc. — pjp
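A minimal sketch of those compile-time uses follows. The structure and the object names are hypothetical and exist only to show where a constant expression is required.

```c
#include <stddef.h>

/* Hypothetical record layout, for illustration only. */
struct record {
    char tag;
    long value;
    char name[16];
};

/* Initializing static data: the initializer must be a constant. */
const size_t value_offset = offsetof(struct record, value);

/* Determining an array size: the bound must be a constant. */
char prefix_buffer[offsetof(struct record, name)];

/* An enumeration constant also requires a constant expression. */
enum { NAME_OFFSET = offsetof(struct record, name) };
```

None of these declarations would compile if offsetof had to be evaluated at run time, which is a quick way to confirm its behavior with a given compiler.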
To the Editor:
Let's say you want to write a Windows DLL in C++. When you declare a function as pascal in C++, the function name becomes upper case and the calling convention is adjusted to that of Pascal, but the name is still mangled. I've tested this with Borland C++ v4.02, Microsoft Visual C++ v1.5, and Symantec C++ v6.1. Of course, each compiler mangles the name differently.
This is all well and good if you are only going to use the DLL with C++ programs compiled with the same compiler, but it is not good if you are going to use the DLL with Visual BASIC or with a program compiled with another C++ compiler, or even with a program compiled with the same compiler in C mode. It seems to me that you have to figure out the name mangle for your compiler (e.g., use the -S flag in Borland C++) and then use an assembly-language thunk from a reasonable name to the mangled name. Do you have a more elegant solution? Is there a solution for people who don't have a stand-alone assembler or are assembler challenged?
Thank you very much.
Sincerely yours,
Joel M. Rubin
Not that I know of. Unfortunately, there are no standards — de facto or otherwise — for name mangling, even when all compilers run on the same operating system. — pjp