March 2000/Finding Neat Scales for Plotting

Finding Neat Scales for Plotting

Antonio Gómiz Bas

Labeling axes is easy for people. For a computer, it's rather less obvious.

Introduction

Function plotting is a very common operation included in many applications. Plotting consists of several tasks, one of them being the selection of an adequate scale for both axes. A neat, legible scale may improve graph interpretation considerably.

For the sake of illustration, consider that a possible x scale for displaying f(x) = x^3 - x in an interval around zero is -3.1416, -1.5708, 0.0000, 1.5708 and 3.1416. Although π is a very familiar number to most of us, these plotting positions look artificial. The step 1.5708 is simply not "neat." Of course, this situation would be very different if f(x) was a trigonometric function such as sin(x), cos(x), etc. Then it would seem more natural to divide the x axis into simple fractions of π.

This article presents a function that attempts to work out a nice scale for a single plotting axis, given both interval bounds and the number of plotting positions. Of course, terms such as "neat" and "nice" are not very rigorous. I don't try to define them here in terms of formal mathematical properties. Rather, a neat scale is simply one that looks good to the eye. This is a subjective property, but hopefully one that is universal enough to be useful, even if it is hard to define.

The first time I faced the neat scale problem, about twelve years ago, I preferred to make use of solutions that were already available instead of writing my own function. I considered two FORTRAN subroutines: CHLON, included in [2] and SCALE, described in [1]. SCALE is more elaborate than CHLON, and it uses a longer list of neat values.

During all these years my C version of SCALE worked fine with only one, very infrequent, minor problem: the function generated some very small value, say, 9.724e-13, instead of an obvious, round 0. Since the step size was usually much bigger than this strange scale minimum, the rest of the scale consisted of the normal neat value series. However, a much more serious problem appeared recently. I found a set of initial values that made the C version of SCALE enter an infinite loop. Since translating from FORTRAN into C may have introduced subtle errors (and could very well again), I decided to write my own function from scratch while using the same approach.

Algorithm Description

The function that computes a neat step size is called Scale and is shown in Listing 1. It takes two doubles, the lower and upper bounds of the plotting interval of interest, plus an integer N, the number of points to plot. It produces two doubles: SMin, the minimum value of the graph scale that is needed to make everything work out, and Step, the computed neat step size.

The first problem is to define what a neat step is, at least in terms that will make sense to a program. A plausible starting place is to limit neat steps to numbers composed of two significant digits. Nelder and Stirling, the authors of SCALE, selected the steps 10, 12, 15, 16, 20, 25, 30, 40, 50, 60, and 80. I have included 75 to have some extra flexibility.

The second problem is to select a suitable step size. The first task is to compute an initial step, lfIniStep. This occurs just inside the outer for loop in Listing 1. If XMin, XMax and N are the interval lower bound, upper bound, and number of plotting positions respectively, then the initial step is simply:
lfIniStep = (XMax - XMin) / (N - 1)
In most cases this computed value of lfIniStep will not match any of the predetermined neat steps above. For example, using the numbers from the first example in this article, XMin = -3.1416, XMax = 3.1416, and N = 5, which results in:
lfIniStep = (3.1416 - (-3.1416))/(5 - 1)
          = 1.5708
Therefore, it is necessary to rescale lfIniStep. The neat step list spans the range from 10 almost up to 100. (100 is not included in the range because it has three digits.) So it will suffice to multiply or divide the initial step by 10 until the rescaled step, lfSclStep, is within the half-open range [10, 100).

Using lfIniStep from the above example:
lfSclStep = lfIniStep * 10 = 15.708
A sequential search in the neat step list gives 16 as the smallest neat step that is greater than or equal to the rescaled step. This neat step is not the value that Scale passes back to its caller, but it is used in subsequent calculations. The function remembers this value via the computed index i.
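
As an illustration of the steps so far, the following C sketch computes the initial step, rescales it into [10, 100), and searches the neat step list. It is not Listing 1: the names initial_step, rescale_step, neat_index, and kNeatSteps are hypothetical, and returning the list length when the rescaled step is above 80 is an assumption made here so that the caller can detect that no neat step fits.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Neat steps: the Nelder and Stirling list quoted in the article, plus 75. */
static const double kNeatSteps[] = {
    10, 12, 15, 16, 20, 25, 30, 40, 50, 60, 75, 80
};
static const size_t kNeatCount = sizeof kNeatSteps / sizeof kNeatSteps[0];

/* Raw step for n plotting positions on [xmin, xmax]. */
double initial_step(double xmin, double xmax, int n)
{
    assert(n >= 2 && xmax > xmin);
    return (xmax - xmin) / (n - 1);
}

/* Multiply or divide by 10 until the value lies in [10, 100). */
double rescale_step(double ini_step)
{
    double scl = ini_step;
    assert(ini_step > 0.0 && isfinite(ini_step));
    while (scl < 10.0)   scl *= 10.0;
    while (scl >= 100.0) scl /= 10.0;
    return scl;
}

/* Index of the smallest neat step >= scl_step.
   Returns kNeatCount when no entry qualifies (assumed convention). */
size_t neat_index(double scl_step)
{
    size_t i = 0;
    while (i < kNeatCount && kNeatSteps[i] < scl_step)
        ++i;
    return i;
}
```

With the example values, initial_step(-3.1416, 3.1416, 5) gives about 1.5708, rescale_step turns that into about 15.708, and neat_index selects the list entry 16.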

The third problem is to work out a suitable scale minimum value, SMin. There is a condition to fulfill: the original interval [XMin, XMax] must be included in the new interval [SMin, SMin + (N - 1) * Step], where Step is the final calculated neat step. Step will not necessarily be one of those in the neat step list, but it will be equal to one of them multiplied by a power of ten. In other words, the statements
SMin <= XMin
and
XMax <= SMin + (N - 1) * Step
must hold.

This is accomplished within the do-while loop shown in Listing 1. A tentative step is first calculated as:
Step = lfSclFct * Steps [i]; (Eq. 1)
where Steps[i] is the smallest step from the step list greater than or equal to lfSclStep computed above, and lfSclFct = lfIniStep/lfSclStep is the factor that undoes the rescaling (in the example, lfSclFct = 1.5708/15.708 = 0.1). The corresponding scale minimum is given by:
SMin = floor (XMin/Step) * Step
Note that SMin is a multiple of Step, which means that zero is a plotting position whenever it falls within the range [SMin, SMin + (N - 1) * Step]. This algorithm may not generate the tightest possible scale, but I have not found it to be a problem to date.

Having obtained SMin and Step, the function checks the upper limit: if XMax is less than or equal to SMin + (N - 1) * Step, then the function returns. If not, the function selects the next value Steps[i] from the step list and goes through the do-while loop again, starting with the recalculation of Step in Equation 1 above. The do-while loop repeats until either a) the conditions listed above are satisfied, in which case the function returns 1 (meaning success); or b) the function runs out of values in the neat step list.

If the function exhausts the neat step list without satisfying the conditions for success, this is a signal that the number of plotting positions was too small. The variable iNm1 = N - 1 is the number of intervals in the scale; function Scale doubles this value and runs through another iteration of the outer for loop, recomputing lfIniStep, lfSclStep, etc. and checking XMax again. This outer for loop repeats three times at most.

I have tested Scale with Microsoft Visual C++ 5.0, generating around half a billion random cases. For each of them XMin and XMax were taken from the interval [-15000, 15000] and N was assigned an integer value from 2 to 10, both included.

Scale never returned zero; that is, it never reported a failure.

Acknowledgement

Many thanks to Sandrine Levieux who revised this English composition.

Source Code Copyright Notice

This code may be freely used provided that the names of the original authors, J.A. Nelder, W. Douglas Stirling, and Antonio Gómiz, are included within the comments.

References

[1] P. Griffiths and I.D. Hill. Applied Statistics Algorithms (Ellis Horwood Limited, 1985).

[2] L. Lebart, A. Morineau, and J. P. Fénelon. Traitement des données statistiques (Dunod, 1979).

Antonio Gómiz received his B.Sc. in Mathematics from Universidad de Murcia, Spain. He usually earns his living as a software engineer, frequently as a data analyst. He has over seven years of C++ programming experience (ten years with C). He is interested in the design and development of computational libraries in C++ for areas such as numerical analysis, statistical analysis, operations research, and artificial intelligence. He is a member of the ACM, ASA, IEEE, and SIAM. He can be reached at a.gomiz@acm.org.