Dr. Dobb's Journal September 2002
Dear DDJ,
Michael Mitzenmacher's use of multiple hash functions ("Good Hash Tables & Multiple Hash Functions," DDJ, May 2002) was interesting. However, I must take exception to his cavalier suggestion that discarding data be considered a legitimate approach to handling bucket overflow.
First, hashing algorithms should expect bucket overflow and handle it gracefully. The cost of decent handling is quite small. A simple and robust approach is to move the key to the next not-yet-full bucket (mod m). The lookup function then searches buckets accordingly.
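The overflow rule described above can be sketched in C as follows. This is only an illustration under stated assumptions, not code from the article: the names Table, Bucket, home_bucket, table_insert, and table_contains, the table size, and the bucket capacity are all hypothetical, and the modulo hash is a placeholder. It also assumes keys are never deleted, which is what lets the lookup stop at the first bucket that still has room.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BUCKETS 8   /* m: hypothetical number of buckets */
#define BUCKET_CAP  2   /* hypothetical number of slots per bucket */

typedef struct {
    unsigned keys[BUCKET_CAP];
    size_t   count;              /* slots currently in use */
} Bucket;

typedef struct {
    Bucket buckets[NUM_BUCKETS]; /* must start zero-initialized */
} Table;

/* Placeholder hash: any function mapping a key to 0..m-1 would do. */
static size_t home_bucket(unsigned key)
{
    return key % NUM_BUCKETS;
}

/* Insert key into its home bucket, or into the next not-yet-full
   bucket (mod m). Returns false only when every bucket is full. */
static bool table_insert(Table *t, unsigned key)
{
    assert(t != NULL);
    size_t h = home_bucket(key);
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        Bucket *b = &t->buckets[(h + i) % NUM_BUCKETS];
        if (b->count < BUCKET_CAP) {
            b->keys[b->count++] = key;
            return true;
        }
    }
    return false;
}

/* Search in the same order the insert used. Without deletions, a key
   that overflowed past a bucket implies that bucket is still full, so
   the first bucket with free space ends the search. */
static bool table_contains(const Table *t, unsigned key)
{
    assert(t != NULL);
    size_t h = home_bucket(key);
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        const Bucket *b = &t->buckets[(h + i) % NUM_BUCKETS];
        for (size_t j = 0; j < b->count; j++) {
            if (b->keys[j] == key)
                return true;
        }
        if (b->count < BUCKET_CAP)
            return false;
    }
    return false;
}
```

With a table declared as Table t = {0}, inserting three keys that share a home bucket fills that bucket and places the third key in the following one, and table_contains still finds it there. Supporting deletion would need extra bookkeeping (for example, tombstones), which this sketch leaves out.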
Second, does anyone really believe IP addresses arriving at a router form a uniform distribution? Certainly not. And with error retries due to dropped packets, actual traffic belies any such assumption. The failure probability should be considered "very small," but the order of magnitude is questionable. As a practical matter, minuscule odds represent events that occur during acceptance testing or when the stakes are very large.
Third, built-in unreliability in embedded systems is likely to be silent; designers are not going to tell their supervisor the system may fail. And such information would never make it past marketing. Users are the victims of the bad design. There is no cost tradeoff presented; this appears to be strictly a question of good design and professionalism versus the alternatives.
Richard Mickelsen
rich.mickelsen@ctg.com
Dear DDJ,
The April 2002 issue of DDJ is one of my favorites since I started reading the magazine. The articles on Java, particularly Jonathan Amsterdam's one on the new operator and Max Poliashenko and Chip Andrews's article on ASP.NET, were great. One thing I figured I'd mention is a minor misconception about ASP: Within the first couple of paragraphs, Max and Chip mention that with each context switch the server has to turn off the ASP engine, which is stated to be costly. Our good friends at 4guys ran some tests recently to prove that it is actually faster to interweave ASP with raw HTML. It would be more efficient to use <p>hello world at <%= now()%></p>, rather than the more common response.write "<p>hello world at " & now() & "</p>". Here's a link to the tests: http://4guysfromrolla.com/webtech/021302-1.shtml. Keep up the great magazine.
Justin Perkins
jperkins007@yahoo.com
Dear DDJ,
In his article "The D Programming Language" (DDJ, February 2002), Walter Bright refers to the need in C to make two versions of each file (header and source). I've always found this convention to be rather weird because there is absolutely no need for it. Take the following two examples.
/* Example 1; C written as Fortran */
extern int foo(int n);   /* forward declaration; foo is defined after main */
int main()
{
    return(foo(5));
}
int foo(int n)
{
    return(n);
}
/* end of C as Fortran */
/* Example 2; C written as Algol60 */
int foo(int n)           /* defined before use, so no declaration is needed */
{
    return(n);
}
int main()
{
    return(foo(5));
}
/* end C written as Algol60 */
See what I mean? In the second case, there is less code and less chance of confusion, so why do it the Fortran way? What makes this really bizarre is that C is derived from Algol60 (probably via Pascal, but I wouldn't bet on that), so why the insistence on following Fortran convention? Of course, the anti-American explanation would be that Fortran was invented by Yanks and Algol by a bunch of Europeans, but surely not.
Actually, I suspect that this reflects some deep cultural divide between lovers of complexity (most hackers) and those of us who just prefer clean code that doesn't waste mental effort.
Tom Groves
TEG451013@FreeUK.com
Dear DDJ,
The Code Red virus has shown once again that computer security is stuck in a primitive rut. While virus scanners do good business hobbling along checking for last week's viruses, most gurus only say users should be educated not to start "programs" (in the broadest sense) they are sent. As so often, the users' wishes are reasonable, and Linux, Microsoft Windows, and other operating systems and mailers should satisfy them, in spite of what experts say.
If someone sends me what might be an amusing program, I want to try it out, and I should be able to do so in safety. What is needed is a way of starting any program in a sort of padded cell, where it can show me things, ask me questions, and have a little space to work in. It should, however, not be allowed to change anything on my computer, nor to mail or print, without my permission, which could be asked as the need arises or in advance. In practice, most programs are allowed to do anything they want, and Microsoft personal systems give users enough rope to hang themselves at any time.
In fact, the padded cell is an old idea, and any computer that runs Java properly could provide a reasonable one, though asking permission is less common. My advice is: By all means click on all your attachments, but first get a proper e-mail system. A few Linux users will be able to make themselves one; most of the rest are at the mercy of Microsoft or the Open-Source movement.
You would admittedly not lock an unknown plumber in a padded cell, but you would normally keep an eye on him; most systems give him the keys to the safe and turn their backs.
Patrick Traill
Patrick.Traill@soz.pinkroccade.nl
Dear DDJ,
I must say I was a bit surprised at the level of response I got to my recently published article "A C++ Socket Library for Linux" (DDJ, June 2002). Many thanks to all the interested readers. In response, I have set up a web site for the SocketCC library (http://www.ctie.monash.edu.au/SocketCC). There is also a new version of SocketCC that fixes a couple of minor bugs; improves the implementation of the IPAddress class (lazy evaluation of the reverse DNS lookup leads to speed improvements of up to five seconds for each IPAddress assignment); and makes SocketBase more thread safe and more suitable for use in multithreaded applications.
This latest version of SocketCC is available on the web site along with other information about the project. Again, many thanks to DDJ readers for their feedback.
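For readers unfamiliar with the idea, lazy evaluation of a reverse lookup can be sketched in C roughly as follows. This is not SocketCC code: LazyAddress, ResolveFn, lazy_address_set, lazy_address_host, and stub_resolve are hypothetical names, and the stub merely stands in for a real resolver (a real program might call something like getnameinfo there). The point is only that assignment stores the address and defers the slow lookup until the host name is first requested, after which the result is cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical resolver signature: writes a host name for addr into out. */
typedef void (*ResolveFn)(const char *addr, char *out, size_t out_len);

typedef struct {
    char      addr[46];    /* textual address */
    char      host[256];   /* cached host name, valid only if host_valid */
    bool      host_valid;
    ResolveFn resolve;
} LazyAddress;

/* Assignment: store the address but do NOT perform the lookup. */
static void lazy_address_set(LazyAddress *a, const char *addr, ResolveFn resolve)
{
    assert(a != NULL && addr != NULL && resolve != NULL);
    snprintf(a->addr, sizeof a->addr, "%s", addr);
    a->host_valid = false;
    a->resolve = resolve;
}

/* First call performs the lookup; later calls return the cached name. */
static const char *lazy_address_host(LazyAddress *a)
{
    assert(a != NULL);
    if (!a->host_valid) {
        a->resolve(a->addr, a->host, sizeof a->host);
        a->host_valid = true;
    }
    return a->host;
}

/* Stand-in resolver for illustration; counts how often it is invoked. */
static int stub_calls = 0;

static void stub_resolve(const char *addr, char *out, size_t out_len)
{
    stub_calls++;
    snprintf(out, out_len, "host-for-%s", addr);
}
```

Under this scheme, code that assigns many addresses but never asks for their host names pays nothing for the lookups it does not use.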
Jason But
jason.but@eng.monash.edu.au
DDJ