September 2000/We Have Mail

Departments

We Have Mail

Letters to the editor may be sent via email to cujed@cmp.com, or via the postal service to Letters to the Editor, C/C++ Users Journal, 1601 W. 23rd St., Ste 200, Lawrence, KS 66046-2700.

David Berry, author of the article, "Combining Boyer-Moore String Search with Regular Expressions" (CUJ, June 2000), has responded to several bug reports from readers. He has contributed an updated source listing, which we have uploaded to our ftp site and website (See ftp://ftp.cuj.com/pub/2000/cujjun2000.zip or www.cuj.com/code/updates.hmtl). Dave writes:

All,

Thank you for your contributions to the CRegEx class. I have been told that you can't prove that an algorithm works, only that it doesn't work. Well, each one of you have found something that did not work. As you all know, the algorithm works by breaking the regular expression into sub-expressions, then matching the individual sub-expressions. The problem with the original implementation was that I was looking for the sub-expressions in the wrong place. I needed to be looking for the sub-expressions within the Boyer-Moore loop rather than outside it.

Listing 1 shows the modified Boyer-Moore loop that correctly handles the matching of sub-expressions. I have also included the latest code for the algorithm. The code also includes a test program that can be used to regression test the algorithm.

Once again, thanks for all the help.

Dave Berry

Dear CUJ,

Just a note from a reader to say that I fully agree with the spirit of your editorial in July's CUJ.

And while I'm at it I'll also mention that I'm pleased with the CUJ's general content these days, speaking as someone who's been a subscriber on and off for the last nine years (The "off" years were prompted by the observation that I wasn't reading it, not by any active dissatisfaction with it.) Among other things I'm pleased to see that you've decided to have a regular column on Java. I think increasing numbers of C++ programmers including myself need to know Java, while remaining mostly C++ programmers, and Java is part of the C family.

Sean Furlong

Thanks for the compliments, Sean. It's also good to hear from someone who isn't violently allergic to Java. (In case anyone was wondering, though, we have no plans to add a C# column.) — mb

Marc,

I thoroughly enjoyed your Editor's Forum in the July CUJ, along with your neat "Simple Unit Tests in C++" template. Let me share my personal experience of when the light came on regarding pointers.

My introduction to pointers was in college. We learned and primarily used Pascal. It is a great language for learning, and pretty good for doing real work. Anyway, in all of the text books, pointers were always shown as little boxes with arrows coming out of them. Also, pointers were supported only when used with dynamic memory allocation, another abstract concept. Pointers themselves were "abstract." They were little boxes. They contained arrows. They could point to something, or nothing (called NIL). But you sometimes had to delete them, and sometimes re-assign them, and you had to be very carful on how and when you created/used/deleted them.

I also had taken assembly language my junior year. I understood what an address was. I understood what it meant to dereference an address in an operation.

Then, at my first job, we used C, not Pascal. I learned how to take the address of something and store it in a pointer. No dynamic memory. No little boxes with arrows in them. Ding, the light bulb came on! Pointers were addresses! Dynamic memory allocation requires the use of pointers, but not vice versa!

After just a few days, I could write C statements out the wazoo to reference some variable 14 levels deep of nesting, knowing when to use . (dot) * (dereference) -> or [index]. Statements like:
char c = **ppChar;
did not faze me. I know some C programmers who say, "I can't get this to compile, I keep adding *s in the front, but that doesn't seem to help." A little scary, huh?

Yes today, when I draw trees and linked lists, I use arrows. I don't write addresses down — I understand the abstraction of pointers. But, the light did not come on until 3+ years after I was introduced to the concept of pointers and someone showed me how they are implemented at the processor level!

Luckily, when I was first introduced to references in C++, I was told they are basically pointers without the pointer syntax. So I understood references pretty quickly.

Regarding your testing template, I hope I get to try it out soon. We're about to start a rather large project dealing with some embedded programming. I'd like to test some of my routines utilizing the BaseTestxx templates. Our professional discipline needs more ideas like this to promote developer testing. Even though testing is usually very specific to the problem at hand, your BaseTestxx templates can be applied across many, many routines.

Regards,

Mark Franjione
Sr. Software Engineer
Host Engineering, Inc.

In my electrical engineering days we used to have an inside joke about how to handle a circuit that was too noisy. We would say, "put a capacitor on it." It was sort of the electronic equivalent of your scary C programmers adding asterisks to declarations. Amazingly, adding a capacitor sometimes worked, but when we did that we sure weren't doing engineering. There's no substitute for knowing how something really works. — mb

Marc,

I enjoyed your article on unit testing in the July 2000 issue of CUJ. I've been wrestling with a variety of implementations of such a framework and haven't come up with one quite so elegant.

I write software used to design highway bridge structures. As such, I use a lot of floating-point numbers and my unit tests must compare these numbers. Using operator== usually isn't a good idea because of small round-off errors (1.00000001 is equal to 1.0 for my purposes but == evaluates to false).

A nice refinement to your test framework would be the use of a predicate object to determine equivalence. Much the way the STL uses predicate objects.

Richard Brice, PE
Software Applications Engineer
Washington State Department of Transportation,
Bridge and Structures Office

Yes, I also soon discovered that my framework wasn't so hot for testing routines that produced floating-point outputs, for the very reason you cite. There is a workaround, but with too many workarounds such a tool becomes more trouble than it's worth. I should also point out that if you supply non-primitive types as template parameters, you'll probably have to supply overloaded versions of operator<< to print out their values.

I like your idea of using STL-style predicate objects, but have not had time to try it out. I imagine there are several ways the framework could be improved. In fact, not long after the article was published, a reader sent in an idea that seemed simpler in most respects; we may end up publishing it as an article in a future issue. Also, if you haven't already done so, check out Chuck Allison's article, "The Simplest Automated Unit Test Framework That Could Possibly Work," elsewhere in this issue. He takes a somewhat different approach. (He also wins this month's award for "The Longest Article Title That Could Possibly Get Past an Editor.") Thanks for writing. — mb

Dear CUJ,

Michael Bramley presents a useful algorithm for "Data-Based Axis Determination" (CUJ, July 2000, p. 20). But it could be improved.

The algorithm should be scale-invariant, in the sense that multiplying the inputs by any power of 10 should multiply the outputs by the same power of 10. For example, (1,10) as inputs gives (1,10,1) as outputs, so any input of the form (10n, 10(n+1)) should give (10n, 10(n+1), 10n). Table 1 in the article includes several examples in this pattern:
Data   Data   Scale  Scale  Scale
min    max    min    max    inc
---------------------------------
 0.01    0.1   0.01    0.1   0.01
 1      10     1      10     1
10     100    10     100    10
But that invariance is broken by the fixed value 1e-10 (I'm assuming the test on fabs(Test_min) is activated.) Suppose you're measuring a fast reaction in picoseconds, and you want values from 1e-12 to 10e-12. The scale minimum gets reset from 1e-12 to zero. A simple fix is to multiply the tiny cutoff value by the range so it scales with the data.

Another problem is that the roundoff can cause an additional decrement loop, resulting in a smaller minimum than necessary. For example, for data from 10 to 11, the calculated scale minimum is 9.9 instead of 10. That can be fixed by moving the test inside the loop and comparing against Min rather than zero:
do {
   ++i ;
   Test_min -= Test_inc ;
   if( fabs(Test_min - *Min)
          < 1E-10 * Range )
      Test_min = *Min ;
   }
while( Test_min > *Min ) ;
A sampling of problem cases appears in Figure 1.

Incidentally, the algorithm does not guarantee "at least six major tick marks." For example, data from 0 to 11 gets only four ticks:
Data  Data  Scale  Scale  Scale
min   max   min    max    inc
-------------------------------
 0    11     0     15      5
This could be fixed by looping on the ( i < 6 ) fixup.

Finally, pow10(x) is not a standard function; it could be pow(10.0, x).

Peter Hollinger
peterh@intex.com

Dear CUJ,

Thanks for Micheal Bramley's interesting and insightful article "Data-Based Axis Determination" (CUJ, July 2000). It was especially interesting to me, since I had come up with almost the exact same algorithm while creating a ruler control for a windowing package for my employer. Given the number of pixels available between the whole units and the minimum space between tick marks, the number of tick marks between whole units can be calculated thus:
// Minimum pixels allowed between
// tick marks
int minPixPerTick = 10

// 2.0 for inches, 10.0 for cm & mm
int baseMode = 10.0

// Use resolution and zoom
// to determine
int pixelsPerUnit = 100;

int ticksPerUnit =
   pow(baseMode,
      floor(log(pixelsPerUnit/
         minPixPerTick)/
            log(baseMode)));
Regards,

Scott Miller
Lectra Systems, Inc.
s.miller@lectra.com

Michael Bramley replies:

Scott:

Thank you for your kind and generous comments regarding my article! In hindsight, I am not surprised that you too had discovered this elegant, yet simple, little algorithm to make dealing with graphics easier. However, I now wonder how many other programmers are out there that read my article and looked at their code, thinking, "Gee, I've done this..."

It's interesting to note that this concept does generalize nicely to other "graphics," such as rulers. This only goes to demonstrate that it is one of those algorithms with many uses — all we have to do is find the problem it solves!

Sincerely,

Michael Bramley

Cha Geil!