Letters

Robotics

Dear DDJ,

Since DDJ has its roots in hobby computing, it's great to see DDJ providing coverage of the rapidly growing field of hobby robotics. In his April 1999 editorial, Jonathan Erickson credited Dave Baum with creating RCX Command Center. Dave wrote NQC (Not Quite C), while RCX-CC was created by Mark Overmars. I purchased one RIS in early October for my 11-year-old son. By November, "we" had a second RCX. By December, we had expanded our robotics inventory to include a Handy Board by Fred Martin of MIT. And now, in early March, my 11-year-old is building a hexapod walker out of A/C plywood and a BASIC Stamp 2 brain, with an IR obstacle detector and two touch sensors. The LEGO RIS makes a fantastic introduction to mechatronics. It would be nice to see an entire issue of DDJ devoted to hobby robotics.

Nick Taylor

ntaylor@iname.com

Thread Communication and Parallel Algorithms

Dear DDJ,

In his April 1999 DDJ article "Thread Communication in Parallel Algorithms," Lalit Pant gives an algorithm for a breadth-first multiway tree search using a queue. He then says that using a stack would have resulted in a depth-first traversal. That is not quite correct. Using the simplest case of one root and one child, switching out the queue for the stack would not change that the first element processed would be the root node. In a depth-first traversal, the child should be processed first. Creating a nonrecursive depth-first traversal (using either a queue or a stack) is sufficiently more complicated that I will not detail it here. Perhaps I'm chickening out, but it's certainly more work than the simple replacement suggests.

Robert Konigsberg

rikonigsberg@sai2000.com

Lalit replies: Thanks for your note, Rob. In general, the term "depth-first traversal" (or "depth-first search") is used in connection with graphs. For a tree, which of course is a special case of a graph, depth-first search is equivalent to preorder traversal (see Algorithms in C++, Third Edition, by Robert Sedgewick, Section 5.8; Introduction to Algorithms, by Cormen, Leiserson, and Rivest, Section 23.3; and Principles of Compiler Design, by Aho and Ullman, Section 13.3). In depth-first search/traversal, the node representing the data structure you want to traverse is visited first. Nodes connected to this node are visited subsequently. The order of these subsequent visits has a very specific nature, of course. But the key point here is that the node representing the data structure to be traversed (the root in the case of a tree) is visited first. So, the statement that forms the basis of your argument -- "In a depth-first traversal, the child should be processed first" -- is incorrect.

Analyzing Algorithms

Dear DDJ,

I enjoyed Jon Bentley's April 1999 "Analysis of Algorithms" article, as it took a well-known problem and applied some well-explained mathematical analysis, along with a couple of those "neat tricks" that come in handy when you least expect them. My tendency is usually to go straight for the language- and platform-specific performance enhancements. Jon's article highlighted the importance of also ensuring the algorithm is optimal.

Late at night I decided to play with the algorithm and the exercises Jon suggested. While rereading the article I remembered Jon had profiled the time spent in each function. I pondered the swap function (which swaps two integer elements of an array); it also took up quite a bit of time and, more importantly, was called often.

Initially I thought this was not a candidate for algorithmic optimization -- "it's just swapping two integers." But then, while inspecting his algorithm, I had a flicker of recognition: sometimes swap would be asked to exchange an element with itself. A quick bit of instrumentation code verified that this was so.

Why bother swapping an element with itself? In fact, why bother calling the swap function at all in that case? Some of my language-specific performance disciplines kicked in to help: saving a function call at the cost of an extra condition. Still, I think the following is an algorithm improvement more than any other kind of performance tweak.

For brevity, here is the code adjustment for version 4 of the algorithm:

for (i = m-1; i >= 0; i--)
{
#if DONTSWAPEVEN
    if (i != m-1)
    {
#endif
        swap(i, m-1);
        search4(m-1, sum + d(p[m-1], p[m]));
        swap(i, m-1);
#if DONTSWAPEVEN
    }
    else
        search4(m-1, sum + d(p[m-1], p[m]));
#endif
}

So now swap is called only when the two indices are different. Of course, the search is called in every case. I'm not sure how much this theoretically speeds up the code, because I think it will depend on the relative time spent in the swap function versus the other functions. I counted the number of swaps and the number removed by this optimization, but haven't analyzed the pattern yet:

n=3: 4 swaps total, 2 not needed

n=4: 18 swaps total, 8 not needed

n=5: 80 swaps total, 34 not needed

Small tests I've run on low values of n show a reduction in total run time of about "1/n"th using Visual C++ 6 on Windows NT 4, Intel Pentium Pro.

Joshua Graham

jagraham@computer.org

Dear DDJ,

I've enjoyed reading Jon Bentley's columns since they first appeared in the August 1983 issue of Communications of the ACM, and wish to use this opportunity to thank him for his articles. I am delighted to see his work published in DDJ.

Jon's attention is focused in April and May 1999 on the Traveling Salesman Problem (TSP). Several valuable lessons are drawn as to where to look these days in attempts to speed up calculations. However, one point Jon has perhaps missed can be briefly stated: "Don't calculate unnecessarily."

In the function geomdist(), d = sqrt(a^2 + b^2) is used to calculate the distance between two points. Since sqrt() is responsible for 64.6 percent of the execution time of the original program, the question to be asked is: How can we dump sqrt()?

In the TSP, the distances between points are summed to produce an overall "cost" for a route. But the exact distance between points is not required; rather, what is required is the relationship of sums of distances between points. It is easily seen that that relationship is preserved when we use the square of the distance, rather than its exact value. That is, square both sides of the distance equation to obtain d^2 = a^2 + b^2, and use d^2 as the "comparison value." Of course, the squaring operation has taken us back to the Pythagorean theorem, whence the distance equation was derived.

Because the relationship is preserved, we can discard sqrt(), and geomdist() becomes:

Dist geomdist(int i, int j) {
    return (Dist)(sqr(c[i].x - c[j].x) + sqr(c[i].y - c[j].y));
}

Voilà, 64.6 percent faster, give or take. This clearly illustrates that a change of algorithm brings great improvement to our programs. For this reason, I continue to anticipate Jon's columns eagerly.

Peter Roth

peteroth@erols.com

Command-Line Arguments

Dear DDJ,

I've enjoyed Al Stevens's column for years. A couple of notes on the doorknob discussion in his May 1999 column.

1. Perhaps now you understand a bit better why people like me who started programming on UNIX found DOS, written nearly 10 years later, to be so primitive. Subnote: In 1980, we got a Motorola 68000 desktop workstation with 2 MB of memory and a 100-KB UNIX kernel (and a $10K price tag -- not many were built). We sometimes had up to four people simultaneously editing files or whatnot. One MB (as with ATs) should be enough for a stripped-down, one-user, few-process system with a good shell (and all the standard utilities on disk). So the limited-memory excuse does not completely wash for Microsoft.

2. Contrary to your assertion ("[C compiler builders] choose not to expand file specifications"), Borland did provide this as an option at least as early as Turbo C 2, so I never had to write a routine such as yours. The real question, if you are otherwise correct, is why Microsoft did not do so.

Terry J. Reedy

tjreedy@udel.edu

Al responds: They did, as many readers pointed out. See the "C Programming" column in this issue for more information.

Simulated Recursion

Dear DDJ,

The "simulated recursion" technique Earl Augusta described in the April 1999 issue of DDJ is useful not only in languages that do not support recursion, but also as an optimization technique in languages that do support recursion. Many recursive procedures, though not quite tail recursive, are finished with most of their local variables by the time they reach the recursive call. Thus, pushing an entire stack frame, with all of its local variables, is wasteful; one can optimize by explicitly stacking only the recursion's control parameters rather than actually making a recursive function call. The impact of this optimization varies widely -- on an architecture with many registers and a good optimizing compiler it may have little effect -- but I've often seen it dramatically speed up recursive routines, even on RISC machines with (allegedly) good compilers.

Joe Ganley

joe@ganley.org

Y2K Bug?

Dear DDJ,

There's a novel Y2K bug in the advertisement for Objectivity (April 1999, page 37). The ad gives 2000 BC as the date for Archimedes, who actually flourished around 250 BC. It looks like someone is treating negative dates as 2000.

Doug McIlroy

doug@tahoe.cs.dartmouth.edu

DDJ


Copyright © 1999, Dr. Dobb's Journal