It's All In the Numbers

C/C++ Users Journal August, 2004

I spend a certain percentage of each workday reviewing a dozen or so newsgroups. Mostly, these concern the C and C++ programming languages, how they are used, and how people perceive their respective Standards. I'll have more to say on newsgroups in a future installment, but since this is a special issue on numerical programming, I'll narrow my focus for now.

I recently completed a protraced interchange about, of all things, how to compute the sine function (and cosine, and tangent). You'd think this would be a hard topic to mine for any controversy. After all, programming languages have been offering sine functions for over half a century now. The function is astonishingly well behaved, varying between -1 and +1 for all finite arguments. No infinities or poles to dance about. Moreover, once you reduce the argument magnitude to less than /4 radians, or 45 degrees, it's an easy function to approximate with a polynomial resembling a truncated Taylor series. What could be simpler?

The trouble all comes with the business of argument reduction. You have to effectively add or subtract /2 repeatedly until you get the argument into that golden range small enough for easy approximation. You can forget about any multiple of 2* you add or subtract—that corresponds to whole turns around the circle. The low two bits of the multiplier tell you what quadrant contains the argument angle, which is all you need to know to determine the final answer from the sine of the reduced angle. Still pretty simple.

The difficulty stems from the fact that /2 is an irrational number. You can't just compute /2 to machine precision, multiply it by the number of adds or subtracts you want to do, and add it to the argument. The actual product requires an infinite number of fraction bits to represent correctly. More to the point, once you cancel some of those high-order bits, you need more bits than you can store in a one-word product to compute the sum correctly enough. You need a full word of bits to the right of the binary point in addition to however many bits you need to cancel in the argument to the left of the binary point.

What this means is, the bigger the argument the harder you have to work to reduce it correctly. An IEEE-754 double value has 53 bits of precision and can have roughly 2¹⁰²⁴ bits to the left of the (floating) binary point. So you have to do arithmetic to upwards of 1024+53, or 1077 bits. That's 21-word precision! Now consider the 128-bit long double supported by a few machines. We're talking 113 bits of precision and 2^16,384 maximum value—146-word precision, no less. Just storing /2 to that precision can take over 2 KB. Yuck.

It is tempting to believe that all this work isn't really worth the bother. After all, the bigger the argument, the coarser the steps between representable angles. Surely the sine is not worth computing to full precision if the angles you can represent aren't all that closely packed. If nothing else, consider what happens when the steps between representable values become greater than 2*. Surely the sine is meaningless if a 1-bit error in the argument can jump more than one revolution.

Nice theory, but it's not true. In point of fact, every finite floating-point value corresponds to an exact angle. The mere fact that it's hard to compute that angle precisely enough to determine its sine doesn't really let the implementor off the hook. And the coarseness argument doesn't hold water either. The exponential function gets just about as coarse as the sine as arguments get larger. But it has two saving graces: 1. Only a small range of arguments have representable exponentials, since the function grows so fast; and 2. it's still easy to compute even for large function values.

If you see a program that computes sin(753.2), it's probably ill designed, and the result is probably not very meaningful. But the problem we implementors face is that neither of these things is certain. So we have to work ever harder to compute a good sine function, even as the odds of our work being meaningful steadily decline. Nasty validation suites will go out of their way to test these extremes, and nasty users will complain if all this extra work slows down the obvious sine computations or makes the code excessively large. I'd love it if the C Standard let us off the hook for large arguments to sine, but it doesn't.

So I keep trying to explain the "sine problem" on the newsgroups. And here. I'm sure I've failed with many readers once again. But I'll keep trying.

P.J. Plauger
Senior Contributing Editor
pjp@plauger.com