Columns


Standard C

A Math Sampler

P.J. Plauger


P.J. Plauger is senior editor of The C Users Journal. He is secretary of the ANSI C standards committee, X3J11, and convenor of the ISO C standards committee, WG14. His latest book is The Standard C Library, published by Prentice-Hall. You can reach him at PJP%plauger@uunet.uu.net or uunet!plauger!pjp.

Introduction

For the past two months, I have been discussing the functions declared in <math.h>. I began with a brief history of the evolution of the C math library. (See "Math Functions," Standard C, CUJ, July 1991.) I followed with a guided tour of the seminumerical functions. These illustrate the underlying primitive functions that manipulate floating-point values portably, rapidly, and safely. (See "Math Primitives," Standard C, CUJ, August 1991.)

My aim this month is to show you just a handful of the math functions that constitute the real meat of <math.h>. There are well over a dozen such functions (depending upon how you count pairs with similar requirements). It would take far too much space to show them all. Thus, I focus on a representative few.

I don't want to dwell on this topic at great length for other reasons as well. It takes considerable mathematical sophistication to appreciate, or even care about, many aspects of the math functions. It is not the purpose of this column to indulge such sophisticates. Rather, the purpose of Standard C is to shed light on many interesting aspects of the C language as it has been standardized by ANSI and ISO. That provides information of use to the vast majority of C programmers.

To the extent that the math library is an important component of the C language, you should know something about it. The C Standard contains a few subtleties regarding error reporting and support for IEEE 754 floating-point arithmetic. Sometimes, it is easier to show you examples than to just talk about such subtleties. That was part of my motivation in selecting the particular examples I show here. Beyond that, many of you may have little need or desire to inspect <math.h> at close range.

For those of you who want to see more, you're now in luck. You will find the complete code for <math.h>, and all the rest of the Standard C library, in my new book. The Standard C Library is published by Prentice-Hall, and the code disk is available from the C Users Group. (See the ad near the center of this magazine.)

Inside that book you will also find quite a bit of material from earlier Standard C columns. I have been building toward this opus for some time. If you find a sudden urge to refer back to an earlier column on the Standard C library, you now have an alternative to digging through your old issues of CUJ.

In fact, this particular column represents a kind of watershed. It is the first of a series I intend to base on material from the book instead of the other way around. (No, I don't just cut and paste words between files. The needs and perspective of a magazine article differ in several ways from a technical reference book.) We've barely come halfway through my promised tour of the library. (For the original prospectus, see "With Gun and Reel," Standard C, CUJ, September 1990.)

Function sqrt

Listing 1 shows the function sqrt, which computes the square root of its argument x, or x^(1/2). It is in many ways the easiest function to understand, because it is based on the simple but elegant Newton's method. (Make a guess, divide that into the argument, and average guess and quotient to get a better guess.) Still, that approach only works for arguments greater than zero. The function must report an error for negative arguments and handle zero as a special case.

IEEE 754 introduces several other special cases as well. Like the code I showed last month, sqrt must also test for various infinities and not-a-number codes. The actual algorithm almost gets lost in the added complexity.

sqrt partitions a positive, finite x, using _Dunscale, into an exponent e and a fraction f. (I showed the code for _Dunscale last month.) The argument value is f * 2^e, where f is in the interval [0.5, 1.0). The square root is then f^(1/2) * 2^(e/2).

The function first computes a quadratic least-squares polynomial fit to f^(1/2). It then applies Newton's method three times to obtain the needed precision. Note how the function combines the last two iterations of the algorithm to improve performance slightly. _Dscale (also shown last month) puts the pieces back together. That's all there is to it.
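The core iteration is easy to sketch. Here is a minimal, hypothetical version of the guess-divide-average loop, without the _Dunscale/_Dscale partitioning, the polynomial starting fit, or any of the error handling the real function needs:

```c
#include <math.h>

/* Newton's method for the square root, as described above: make
   a guess, divide it into x, and average guess and quotient.  A
   bare sketch for x > 0 only; the real sqrt scales the argument
   first and must also handle zero, negatives, and the IEEE
   special codes. */
double newton_sqrt(double x)
{
    double guess = 0.5 * (1.0 + x);   /* first Newton step from 1.0 */
    double prev;

    do {                    /* iterates decrease toward sqrt(x) */
        prev = guess;
        guess = 0.5 * (guess + x / guess);
    } while (guess < prev); /* stop at the rounding fixpoint */
    return prev;
}
```

Because the crude starting guess can be far from the root, this sketch may loop many times; the real function's least-squares fit is what lets it get away with just three iterations.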

Functions cos And sin

Listing 2 shows the function _Sin. It computes sin(x) if qoff is zero and cos(x) if qoff is one. Thus, these two functions become themselves trivial. They are, in fact, also implemented as masking macros in <math.h> to eliminate one level of function call.

Using such a "quadrant offset" for cosine avoids the loss of precision that occurs in adding π/2 to the argument instead. I developed the polynomial approximations from truncated Taylor series by "economizing" them using Chebyshev polynomials. (If you don't know what that means, don't worry.)

Reducing the argument to the range [-π/4, π/4] must be done carefully. It is easy enough to determine how many times π/2 should be subtracted from the argument. That determines quad, the quadrant (centered on one of the four axes) in which the angle lies. You need the low-order two bits of quad + qoff to determine whether to compute the cosine or sine and whether to negate the result. Note the way the signed quadrant is converted to an unsigned value so that negative arguments get treated consistently on all computer architectures.

What you'd like to do at this point is compute quad*π/2 to arbitrary precision. You want to subtract this value from the argument and still have full double precision after the most-significant bits cancel. Given the wide range that floating-point values can assume, that's a tall order. It's also a bit silly. The circular functions become progressively grainier the larger the magnitude of the argument. Beyond some magnitude, all values are indistinguishable from exact multiples of π/2. Some people argue that this is an error condition, but the C Standard doesn't say so. The circular functions must return some sensible value, and report no error, for all finite argument values.

I chose to split the difference. Adapting the approach used by Cody and Waite (see the July installment) in several places, I represent π/2 to "one-and-a-half" times double precision. The header "xmath.h" defines the macro HUGE_RAD as:

#define  HUGE_RAD  3.14e30
You can divide an argument up to this magnitude by π/2 and still get a value that you can convert to a long with no fear of overflow. The constant c1 represents the most-significant bits of π/2 as a double whose least-significant 32 fraction bits are assuredly zero. (The constant c2 supplies a full double's worth of additional precision.)

That means you can multiply c1 by an arbitrary long (converted to double) and get an exact result. Thus, so long as the magnitude of the argument is less than HUGE_RAD, you can develop the reduced argument to full double precision. That's what happens in the expression:

g = (x - g * c1) - g * c2;
For arguments larger in magnitude than HUGE_RAD, the function simply slashes off a multiple of 2*π. Note the use of the primitive _Dint to isolate the integer part of a double. Put another way, once the argument goes around about a billion times, sin and cos suddenly stop trying so hard. I felt it was not worth the extra effort needed to extend smooth behavior to larger arguments.
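The split itself can be illustrated by manufacturing c1 and c2 directly from the bits of a double. This is my own construction for illustration, not the code from the book, and it assumes IEEE 754 doubles:

```c
#include <math.h>
#include <string.h>

/* Split pi/2 into c1, whose low 32 fraction bits are zero (so
   c1 times a modest long multiplies exactly), and c2, which
   carries the remaining low-order bits.  Assumes IEEE 754. */
static void split_pi2(double *c1, double *c2)
{
    double pi2 = 1.5707963267948966;   /* pi/2 to double precision */
    unsigned long long bits;

    memcpy(&bits, &pi2, sizeof bits);
    bits &= ~0xFFFFFFFFULL;            /* clear low 32 fraction bits */
    memcpy(c1, &bits, sizeof *c1);
    *c2 = pi2 - *c1;                   /* exact: fits in a double */
}

/* remove quad quadrants to "one-and-a-half" precision */
double reduce(double x, long quad, double c1, double c2)
{
    return (x - quad * c1) - quad * c2;
}
```

Because quad * c1 is exact, the leading bits cancel in the first subtraction without contaminating the result; c2 then restores the precision those leading bits hid.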

The rest of the function _Sin is straightforward. If the reduced angle g is sufficiently small, evaluating a polynomial approximation is a waste of time. It also runs the risk of generating an underflow when computing the squared argument g * g if the reduced angle is really small. Here, "sufficiently small" occurs when g * g is less than DBL_EPSILON, defined in <float.h>. Note the use of the double constant _Rteps._D to speed this test. (I described _Rteps last month.)

The function _Sin uses _Poly to evaluate a polynomial by Horner's Rule. It's a simple function, so I won't bother to show it here.
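Horner's rule itself takes only a few lines. A plausible sketch of such a helper follows; the name, argument order, and calling convention are my assumptions, and the book's _Poly may differ in detail:

```c
/* Evaluate tab[0]*x^n + tab[1]*x^(n-1) + ... + tab[n] by
   Horner's rule: fold in one coefficient per multiply-add,
   from the highest power down. */
double poly(double x, const double *tab, int n)
{
    double y = *tab;

    while (0 <= --n)
        y = y * x + *++tab;
    return y;
}
```

For example, poly(2.0, (double[]){1.0, 0.0, -1.0}, 2) evaluates x^2 - 1 at x = 2, giving 3.0.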

Functions exp And Friends

Another group of functions consists of exp, cosh, and sinh. In principle, you can write the latter two in terms of the first. That leads to all sorts of nasty intermediate overflows, however. What you want instead for these is a function that computes 0.5*e^x safely. Thus, I introduced the function _Exp(x, n). It computes 2^n * e^x for x zero or finite. Scaling safely by a power of two is easy given _Dscale.

Listing 3 shows the function exp. All it has to do is check for the special IEEE 754 codes, then call _Exp. I don't have space to show you cosh, but it's only a bit more complicated. sinh is more complex still because you have to use a special approximation for small arguments.

Listing 4 shows _Exp. The header "xmath.h" defines the macro HUGE_EXP as the carefully contrived value:

#define HUGE_EXP (int)(_DMAX * 900L / 1000)
This value is large enough to cause certain overflow on all known floating-point representations. It is also small enough not to cause integer overflow in the computations that follow. Thus, HUGE_EXP offers a coarse filter for truly silly arguments to _Exp.

The trick here is to divide x by ln(2) and raise 2 to that power. You can pick off the integer part and compute 2^g, for g in the interval [-0.5, 0.5]. You add in the integer part (plus eoff) at the end with _Dscale. That function also handles any overflow or underflow safely.

Reducing the argument this way has many of the same problems as reducing the arguments to _Sin described above. The one advantage here is that you can choose extended-precision constants c1 and c2 to represent 1/ln(2) adequately for all reasonable argument values.

As usual, the reduced argument is compared against _Rteps._D to avoid underflow and unnecessary computation. The ratio of polynomials approximation is taken from Cody and Waite. The approximation actually computes 2^(g/2), thus the correction to xexp.

Summary

That should give you a taste of what it's like to write portable and accurate math functions for Standard C. These functions indeed lose no more than two bits of precision in all the tests they have undergone to date. As I warned you earlier, however, you must bring to bear quite a lot of machinery from numerical analysis to do a proper job. I didn't hesitate to rely on several references by people much more expert than I on this arcane topic.

The lesson I hope to convey is that the actual computation of each math function doesn't take all that much code. (The nastiest is pow, whose special cases puff it up to over a hundred lines.) The code must be carefully contrived, to be sure, but it is reasonably straightforward. Where the complexity comes in is handling all the error cases properly. IEEE 754 may have made floating-point arithmetic more robust, but it has certainly made this aspect of programming much harder.