Dear DDJ,
I'd like to thank Ken Kashmarek ("Letters," DDJ, February 1991) for taking the time to write concerning my article, "Optimal Determination of Object Extents" (DDJ, October 1990). I'm glad to hear he enjoyed it. I completely agree with Ken about the column order reversal. Speedup, however, is defined as a positive quantity; otherwise the "up" portion of the word wouldn't make much sense. When I wrote the article, I thought computing percentages would be obvious, so I didn't spell out the details. In any case, what really matters is speed improvement: anybody can come up with a slower algorithm; it's the faster ones that are difficult.
For the most part, Ken's comments concerned "absolute performance." My article, however, was about "relative performance" -- that is, one algorithm relative to another. Whatever hardware, compiler, switches, or floating-point element size you use, as long as the comparisons are made consistently the results will always be about the same: the MIN/MAX algorithm will be faster, since it performs less work to reach the same conclusion, as long as you must compare items to find the min and max!
Also notice that the data sample used makes a difference. For consistency, I used the same data sample over all machines. All times were user CPU time, since that is how long the computer actually took to execute the program. I believe I used an array of 1000 items and ran a loop finding the min and max of these items 5000 times (the count varied from one machine to another to get an accurate time value). For timing, I used a program I wrote that's similar to the Unix time command.
As for Ken's comments about the Cray, remember that when you spend over $10 million on a computer, you get one of the best parallelizing compilers available (otherwise the hardware is kind of hard to use). This compiler completely unfolds the loops and runs the code straight. But even the Cray vector processor can't avoid doing comparisons; therefore it, too, will show relative performance improvement. Yes, the Cray does matrix math faster, but that's because it can do an addition and a multiplication in one cycle (just like Ken's IBM 6000), and because it runs on a 250-MHz clock (vs. the 25-MHz IBM 6000).
As for Ken's other comments:
Indianapolis, Indiana
Dear DDJ,
This is a reply to Bruce Tonkin's response (DDJ, March 1991) to my original letter concerning his review of Power Basic:
Where Bruce uses numbers in his reply, the numbers are essentially correct, but the conclusions he draws from them are not. The main problem seems to be that Bruce assumes: 1. that programmers will be willing to convert their (possibly) large volume of source code to another format and syntax just to gain some small advantage, when it is not known how long that particular version of the language will be supported, whereas MS-DOS Basic will likely dominate on PCs for years to come; and 2. that other programmers will mostly use the same (inferior) coding methods that Bruce describes in his reply. To compare a few reply points:
a. Basic 7.x allows multiple 64-KByte string heaps.
b. It is much faster to sort in a pre-allocated string (memory) buffer and not in string arrays.
c. A few large string buffers can be "concatenated" using simple algorithms to simulate a very large memory block, much as one would do in C.
Dale Thorn
Cleveland, Tennessee
Dear DDJ,
In the October 1990 issue of DDJ, I read a letter by Michael P. Lindner. It was titled "Still Going in Circles" and was feedback on Tim Paterson's article, "Circles and the Differential Analyzer" (July 1990). I was impressed by Paterson's article. There are a few suggestions that I would like to make.
I implemented the algorithm under DOS as given in Lindner's Example 3. It is duplicated here as my Example 1.
x = e = 0;
y = r;
while (x <= y)
{
    plot8(x, y);
    e += (x << 1) + 1;
    if (e > 0)
    {
        e -= (y << 1) - 1;
        y--;
    }
    x++;
}
The resulting curve had a blip, as shown in my Figure 1, which is extremely noticeable on large circles.
The reason is that whenever the curve passes through a pixel, the "e" for that point is positive, so the pixel below it is chosen. The result is that pixels are plotted just inside the circle, not on it. The first pixel is the starting point and lies on the circle -- hence the blip.
In this algorithm the "e" for all the points would be zero or negative. A better algorithm would make sure that "e" is kept around zero. A measure of this "closeness" could be the sum of "e" for all the points. I'll call it E. My Table 1 shows E for the fragment in Example 1 for various radii.
Radius        E
   150   -14705
   175   -19942
   200   -27301
I was trying to write a small fast version in assembler, so adding more variables or complicating any expression was out of the question. By changing the condition in the if statement, a better result can be obtained. I tried two other simple conditions (e>x) and (e>y). The corresponding E1 and E2 are in Table 2 for the same radii.
Radius       E1       E2
   150   -10998      689
   175   -14895     -558
   200   -17244     1090
It's obvious from Table 2 that E2 is the best bet; that is, the condition on line 7 of Example 1 should read if (e > y), which keeps E around zero.
I computed E for radii from 1 to 1000 using this new condition. E was positive for 724 of the cases and negative for the remaining 276. The new condition also eliminates the blip; the resulting circle is closer to the true circle, and it is smooth and continuous. Just for statistics, I'd like to mention that in the assembly language version I kept "e," "x," and "y" in registers. Making this change reduced the code size by 1 byte, and the resulting code would be just a wee bit faster, as the immediate operand (0 in this case) need not be fetched.
V. Venkataraman
Pune, India
Dear DDJ,
As many others probably already have, I regret to inform you of a misconclusion in Edward Allburn's article, "Graph Decomposition: Imposing Order on Chaos," in your January 1991 issue. Mr. Allburn claims -- because it is apparent -- that his algorithm is O(n^2) in the worst case. Without going into details of graph theory, a graph G with n vertices has at most e = n(n-1)/2 edges; this is common knowledge in graph theory, and a graph with exactly that many edges is called a complete graph. Understanding this shows that the algorithm Mr. Allburn claims is O(n^2) is actually O(n^3), because a complete graph is the worst possible case for his algorithm. Given a complete graph with n vertices and its e edges presented to the algorithm in the order (1,2), (1,3), ..., (1,n), (2,3), ..., (n-1,n), the algorithm must loop through step #1 of Mr. Allburn's algorithm n(n-1)/2 times. This step alone is O(n^2). The loop in step #4.d.1 (see Allburn's Example 2) adds another degree of complexity, making the algorithm run in O(n^3).
I tested the algorithm to see hard results. I implemented it in Turbo C++ (see Listing One, page 14) and timed its execution on complete graphs from N = 4 to 1000 in steps of 4. I then took the time T(N) the program needed for N vertices, computed its ratio to N^2, N^3, and N^4, and plotted the ratios (see Figure 2). The graph shows how the algorithm grows with respect to the three growth functions N^2, N^3, and N^4. The curve for the ratio of N^2 to T(N) is decreasing, which says that T(N) grows faster than N^2. Likewise, the curve for the ratio of N^4 to T(N) is increasing, which says that T(N) does not grow as fast as N^4. Finally, the curve for the ratio of N^3 to T(N) is nearly constant (slightly increasing), which says that T(N) grows as fast as, but no faster than, N^3 -- which is the definition of Big-Oh.
There are many ways in which general graph-theory algorithms can be tailored to individual problems to speed up execution. Knowing more than the general algorithm requires is always an asset. I did enjoy reading Mr. Allburn's article and liked the development of the algorithm using a new data structure. But I also wonder whether Mr. Allburn really tested his algorithm thoroughly before making his claim, and whether his understanding of Big-Oh notation is sufficient to support it.
Martin Schlapfer
Scotts Valley, California
Edward responds:
In his test, Mr. Schlapfer used a "complete" graph. For a directed graph that translates to N * (N - 1) edges, and half that number for an undirected graph (where N = number of vertices). The graph I used in my article was a "sparse" graph, which contained the minimum number of edges needed to ensure that all of the vertices are connected. This works out to be N - 1 edges for an undirected graph.
In Big-O notation, "N" is intended to represent the amount of input to the algorithm. For a graph, the units of input are traditionally either edges or vertices. When the number of edges is about the same as the number of vertices (as was the case in my example graph), "N" can safely represent either. When faced with "complete" graphs, however, the number of edges is far greater than the number of vertices, so "N" must then represent the number of edges.
After reviewing my test results, I realized that this should be the case even when the number of edges is less than the number of vertices. For example, when a graph is comprised of 1 million vertices and only a handful of edges, the GAD algorithm is finished as soon as the last edge has been read. Because GAD's performance is tied to the number of edges rather than the vertices, "N" should represent edges in the analysis of this algorithm's performance. Thus, GAD is still O(N^2) in the worst case. I thank Mr. Schlapfer for bringing this to my attention.
Dear DDJ,
I am the representative of the Consulting and Engineering Agency, a new private Rumanian company of consulting engineers, and I am writing to ask your readers for help.
After the Rumanian revolution in December 1989, we realized that it was vital for Rumania to try and turn its economy around from the bankrupt socialist system into a market-driven economy. One area which needs desperate and particular attention is computers and computing techniques.
Twenty of our best software and hardware engineers from Constanta -- drawn from computer centers, universities, and other enterprises -- set up this agency. The first major problem we face is a severe lack of scientific books, manuals, magazines, and catalogs.
We realize that one of the most effective means of ridding our country of the terrible memory of Communism is not simply by appealing for clothes and food, but by asking for help in improving our education. In this way we can improve our standards in industry and commerce, and help our people to make the things they need themselves.
We would be very grateful if any of your readers could donate books and software, which are particularly expensive for us, or perhaps even donate a subscription to their favorite magazine, or send us old issues that they no longer need.
Donations should be addressed to either of the following:
Aurel Cartu, Manager Consulting & Engineering Agency
Aleea Brindujelor Nr 2,
Bloc L9, Sc C,
Ap 45 RO-8700
Constanta
Rumania
Guilain Devillers
A.I.D.E.
P.O. Box 54
L-8001 Strassen Luxembourg
Copyright © 1991, Dr. Dobb's Journal