February 1992/Curve Fitting By Chebyshef And Other Methods

Features

Curve Fitting By Chebyshef And Other Methods

Dr. Timothy Prince

Tim Prince has a B.A. in physics from Harvard and a Ph.D. in mechanical engineering from the University of Cincinnati. He has 25 years of experience in aerodynamic design and computation. He can be contacted at 39 Harbor Hill Dr., Grosse Pointe Farms, MI 48236.
The problem of fitting a curve to data as a convenient means for interpolating values has almost as many different solutions as there are people trying to find solutions. While there are accepted methods (such as Fourier transforms) for certain specialized applications, the field is wide open for many common applications.
Some of the most common methods fall in the regression category, where you pick a form of curve, perhaps a polynomial of degree N, and fit it to the data by least squares. Typically, you may have little idea whether the family of curves you selected is appropriate for the application. Choosing too high an order of polynomial makes the shape of the curve too sensitive to uncertainties in the data points. To avoid the problem you can fit successively higher order polynomials, using statistical tests to decide where to stop. Least squares methods cannot fit a curve with much more than half the precision of the underlying computer arithmetic operations.
Vendors of CAD systems have found their own ways to fit curves and surfaces to geometrical data. Their problem is complicated by the need to deal three dimensions, but the methods are generally based on two dimensional curves. The B(ezier) splines are popular in this field, but require a certain art in finding ways to develop and apply the software. No doubt the experts see the artistic content as an advantage. Obviously, satisfactory surfaces can be drawn by interactive graphic methods. On the other hand, blindly shoving data in is likely to give poor results.

Cubic Splines
Typical engineering software uses linear or spline curve interpolation between data points. The standard spline methods fit a curve with continuous slope and curvature passing exactly through each data point (referred to as knots). Appropriate conditions must be chosen for the first and final data points, and this is often difficult to do. General purpose cubic spline functions, such as the published de Boor method, offer a choice of three types of end condition:

specified slope

specified curvature

"not-a-knot" smooth ends
The specified curvature condition is often misused. Zero curvature at the end is called the "natural" condition, with reference to the physical spline analogy, but that doesn't mean you should use it. If there is no way to know what the curvature should be, the "not-a-knot" end condition is usually better. This adjusts the end curvature and slope to give the smoothest possible curve. In general, the third derivative of the spline (analogous to shear stress in a beam) changes abruptly at each knot. With the "not-a-knot" end condition, this discontinuity is eliminated at the knot next to the end, analogous physically to putting torque on the end of the spline so that it does not need restraint to pass through the knot next to the end.

Spline Software Issues
Even with intelligent use of splines, problems remain. When all the end condition options are included, it may be difficult to figure out how to structure code. Much published software is less efficient than it could be. Consider the spline evaluation code from Numerical Recipes In C (Press, Flannery, Teukolsky, and Vetterling

a= (xa[khi]-x)/(h=xa[khi]-xa[klo]); b= (x-xa[klo] )/h; *y= a*ya[klo]+b*ya[khi]+((a*a*a-a)*y2a[klo] + b*b*b-b)*y2a[khi])*h*h/6;
and the algebraic equivalent

b= x-xa[klo]; a= xa[khi]-x; h= xa[khi]-xa[klo]; *y= ((ya[klo]-y2a[klo]/6*b*(a+h))*a+(ya[khi] -y2a[khi]/6*a*(b+h))*b)/h;
The second version saves four operations and the corresponding roundoff errors.
The spline fit function in Numerical Recipes In C has a problem with a potential aliasing of array elements which can force the compiler to generate less efficient code. The problem can be alleviated simply by changing the order of lines in the code. The prototype reads

void spline(float x[] float y[],int n,float y2[])
Only the array y2[] is to be changed by spline(). But the C compiler does not know that the array arguments do not overlap, so the compiler must assume that any change to y2[] could also affect x[] or y[], preventing it from reusing intermediate values. Therefore, the assignment to y2[i] should occur last in the loop body. This change allows the compiler to perform normal optimizations, short of taking advantage of loop unrolling. Such aliasing is an issue for any C compiler trying to combine common subexpressions.
Intermediate results can be saved where they will be used again on the next step. Since the algorithm is recursive anyway, the additional recurrences will not pose an obstacle to any known compilers, as they might in parallelizable code. The improvement in speed is not as large as might be expected; sequential processors do not have enough registers and must use memory storage of these temporaries, and pipeline processors can handle many of the duplicate floating point operations by increasing parallelism.

Use Of Splines
You may wish to use the fitted curve to find the slope of the data. Averaging the slope of the cubic spline curve and the slope of the linear interpolant gives a good estimate at points half way between knots, but this slope does not match the slope of any smooth curve through the data. As end slope is determined by your choice of end condition, slopes closer to the end points are as unreliable as extrapolating the curve beyond the end points.
Integration of the spline fit to find the area under a curve gives slightly over half the error of Simpson's rule integration using the same data points. This improvement is hardly worth the extra effort unless it is impossible to get the data spaced appropriately for Simpson's rule.
A cubic spline curve exaggerates errors or uncertainty in the data points. Usual ways of dealing with this are to use the "spline in tension" method, or to put "springs" between the curve and the knots and allow the curve to adjust to greater smoothness by deviating slightly from the input data. Tension or stiffness of the springs is an arbitrary parameter, requiring human interaction to get a visually satisfactory result. Repeated smoothing keeps on changing the data; unlike the Chebyshef smoothing, the adjusted splines never produce a curve which doesn't want to be adjusted further.
In practice, there is a fairly low limit (around 15) to the number of points which can be spline fit reliably without provision for tension or data adjustment. Demanding geometric problems, like fitting the contour of an airplane wing, can be handled by fitting splines in several segments.
A tactic which may be useful for geometric curve fitting by splines or any other method is to parameterize x, y, and z as a function of stair step distance. Ideally, the arc length along the curve would be a parameter; the stairstep distance is much easier to obtain. Fitting two or three functions of one independent variable is not much more time-consuming.

Chebyshef Polynomials
Chebyshef polynomials provide a well founded way of fitting a curve which approximates a function within a finite interval. The method seems almost too magical to trust, requiring less arbitrary input and often giving more satisfactory results than alternative methods. Dropping a term from a Chebyshef fitted polynomial introduces errors which are nowhere larger than the associated coefficient, so you can easily see which terms to keep to reach the desired accuracy of curve fit. When additional terms don't yield further convergence, it is good evidence that the background noise in the data or the limited accuracy of the arithmetic is swamping the remaining terms.
The theory is covered well in the textbooks, and a reliable set of C functions are presented in Numerical Recipes In C. I will concentrate on ways to use Chebyshef fits, and some implementation issues which should be addressed for best results.
Determination of Chebyshef coefficients requires the function values to be supplied at points dictated by the method, where the polynomial will match the specified values. Using specific points may seem awkward when the control points consist of arbitrary data at arbitrary locations, but Chebyshef fitting has been applied successfully by using linear interpolated values. Placement of the input points is less critical than for splines, although, since the function evaluations are concentrated toward the ends of the interval, the data are weighted more heavily there. The number of function evaluations needed for Chebyshef fitting is the minimum required to specify a given number of polynomial terms, so this method is as economical as any for functions which are expensive to evaluate. A dozen Chebyshef terms are likely to be sufficient for close approximation to any smooth curve.
The authors of Numerical Recipes in C advise against converting the Chebyshef polynomial into an ordinary polynomial before evaluation. This advice makes sense when there will be few evaluations of the polynomial, or when little is known about its behavior. Existing software often violates this advice, possibly because Clenshaw's algorithm was not well known until recently. Clenshaw's algorithm is 30 percent faster than evaluation by more obvious methods on sequential processors; even with careful coding for best performance, the difference drops to 20 percent on typical pipeline processors, but the small advantage in accuracy remains.

Chebyshef Performance Tuning
The routine for calculating the Chebyshef coefficients consumes the most time and generates most of the roundoff error when all arithmetic operations are done at the same precision. cos() is evaluated with arguments up to n*PI for a Chebyshef fit of order n. If the arguments have a relative error of order DBL_EPSILON, the results may be in error by order n*DBL_EPSILON. Summing up the results introduces smaller errors, which are still significant if accurate results from cos() are obtained. If these roundoff errors can be controlled, the overall roundoff errors in the Chebyshef fit and evaluation can be held to about DBL_EPSILON. Unfortunately, most compilers fail to take advantage of the ability of 8087 or 68881 processors to hold the errors generated in the inner loop to the order of n*DBL_EPSILON. Even without extra precision, the accuracy of Chebyshef fitting is far better than least squares methods.
On co-processor architectures which have a built-in cosine function, you should avoid going through subroutine calls that narrow arguments and results to double precision. The greatest accuracy improvement comes from use of a long double value for PI. If the system does not support true long double constants, you may be able to generate a long double value which can be obtained by tricks such as

#define PI (sqrt(3.125*3.125)+.01659265358979323846)
Putting a function call in the expression prevents most compilers from performing the arithmetic and narrowing the result. Using a long double value for PI reduces the roundoff errors, typically by a factor of two compared to errors generated by using a normal double value for PI. Roundoff error may be reduced by an additional factor of three by taking the assembly code output from the C compiler and changing the cos() calls to direct numeric processor instructions. Sun's inline function compilation option doesn't do the job, since it still requires "arguments" and "returns" narrowed to double and pushed on the stack. With full use of register variables, the 8087 and 68881 families are excellent for Chebyshef analysis.
Pipelined processors should have a pipelined version of cos(), since it should be possible to calculate a group of four cosines as fast as two individual cosines. An intermediate level of performance may be obtained by supplying source code for cos() and using an inlining compiler option. While cosf() can be handled by a macro, the greater number of terms in cos() makes such an approach impractical. The subject of group math library functions could fill another article.
The recursive algorithms for evaluation of Chebyshef polynomials can be speeded up on superscalar architectures by determining which array element is needed first, and writing the code to fetch that element prior to the loop iteration in which it is needed. This is particularly true of the Chebyshef derivative coefficient evaluation. As presented in Numerical Recipes In C, this function again suffers from aliasing, since the compiler cannot be sure that the read-only input array c[] will not be overwritten. In order to keep the pipeline running, the potentially aliased array element has be fetched before the results of the current iteration are stored. In addition, all but two iterations of the second loop can be moved into the first loop where they will fill otherwise empty pipeline slots. There are enough registers even on an 8087 to eliminate all redundant memory references, further reducing the aliasing problem. Listing 1 shows the details.
These changes are detrimental to readability but can triple the performance and reduce the size of the generated code. Overlapping of stages of a sequential calculation from different loop iterations is reminiscent of CDC 6600 style code. In accordance with the old saw, a FORTRAN programmer can write FORTRAN in any language.
The aliasing problem could be eliminated by storing results in a temporary array, then copying this array to the destination. As long as a loop does not involve reading and writing different external arrays, with one array accessed via a pointer, there is no aliasing problem. The temporary vector solution is common for vectorizable code, but in simple problems involving recurrence, better results can be obtained with pump priming methods as shown previously.

Chebyshef Economized Polynomials
A common end use of Chebyshef polynomials is in computational approximations for math library functions. In this application, the goal is to fit a polynomial with the minimum number of terms needed to give the required precision. Typically, the base interval is chosen to be symmetric about zero, so that there are only odd power terms. The interval will be one in which the Taylor series converges well, and a Chebyshef economized polynomial converges even better. The caution about possible inaccuracy of polynomial evaluation does not apply.
In order to get a series of sufficient accuracy to use in a library function, the whole Chebyshef fitting process has to be carried out in a higher precision. long double precision should be sufficient to calculate coefficients for double precision function approximations.
If a Chebyshef series is fitted directly to a function which has zeros, there will be errors near the zeros which are large relative to the local value of the function, even though they are less than the errors at the end of the interval. One way of getting around this is to subtract off the linear term of a series expansion around the zero, and make the Chebyshef fit to the remainder. Similar tactics might be applied where the function is known to be approximated by a more complicated function.
A better method for the standard library functions is to divide out the zero, so that the Chebyshef polynomial is fit to a function such as sin(x)/x. The Chebyshef polynomial will have only even terms, with the size of the odd terms calculated being an indication of numerical error. The end result will be an approximation which has less relative error over most of the interval than a minimax polynomial obtained by laborious iteration to minimize the maximum relative errors. The general strategy is to approximate a function by a known dominant term times a Chebyshef polynomial which takes care of the fine adjustments.
Most Chebyshef economized polynomials can be determined by straightforward programming using a system with long double arithmetic or a multiple precision package. In some cases, as with log((1+x)/(1-x)), it is better to derive a Taylor series which matches the form of the planned approximation and fit the Chebyshef polynomial to this series. Working out the cancelling terms algebraically rather than numerically avoids much of the roundoff error.
Normally, Chebyshef economized polynomials would be used only as a software development tool, to provide a best fit, most economical polynomial for quick evaluation of a function. If the number of evaluations of a curve is not great, or the convergence properties of a polynomial are unknown, it is better to use the Clenshaw algorithms for evaluation. Still, an economized polynomial might well be better than a least squares fit regression, and one might want to find such a polynomial for comparison.

Rational Polynomials
Rational polynomials form a better fit to some types of functions. A clue that this may be the case is an inversion symmetry, as in

exp(x) = 1/exp(-x)
or

tan(PI/2-x) = 1/tan(x)
Both of these functions are best fit by rational polynomials. Unfortunately, there is no accepted method for making such numerical approximations. You can usually guess the form of the series; since

tan(x) = -tan(-x)
the rational polynomial approximation will have odd power terms in the numerator and even powers in the denominator. An approximation for exp() should be of the form

(p(x)+q(x))/(p(x)-q(x))
with p having only even power terms and q having only odd power terms, in order to exhibit the required symmetry. This is equivalent to an approximation for tanh(x) of the form

q(2x)/p(2x)
which is a way to evaluate tanh() for small x without losing accuracy.
You could arrive at similar results by deriving continued fraction approximations and converting to rational polynomials, a process which might be educational but wouldn't help in terms of practical results. The coefficients obtained analytically for rational polynomials (for unbounded intervals) are likely to be far from optimum for a closed interval.
People who play with these approximations have their favorite iterative numerical methods for adjusting a reasonable guess at the coefficients to greater accuracy. These calculations take almost twice as many bits of precision as the desired final accuracy; you would be lucky to get an approximation accurate to 12 decimal places on an 8087. Even IBM 3081 quad precision is marginal for obtaining a rational approximation for tan(). The author has had satisfactory results from a method acquired from Ruzinsky, which runs 10 to 20 seconds on a VAX 8550. With a multiple precision software package, higher accuracy fits should be feasible but time-consuming.

Conclusion
Several popular methods for curve fitting have applications for which they are well suited; some have been used where they work less well. The simplicity of the popular spline methods is gained at a cost in reliability. Spline methods work well when the end conditions can be specified accurately, and the data points are accurate. Rational polynomials are best, in the author's experience, only when they model a known symmetry. The Chebyshef methods are effective and deserve wider use. Chebyshef methods can be used for smoothing and interpolating discrete data which are moderately uncertain, as well as for evaluation of fits to known functions. Recursive algorithms for accurate evaluation of Chebyshef polynomials can be organized for superscalar performance on pipelined architectures.

References
Deboor, C. A Practical Guide to Splines, Springer-Verlag, 1978.
Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., Numerical Recipes In C, Cambridge University Press, 1988.
Ruzinsky, Steven A., "A Simple Minimax Algorithm," Dr. Dobb's Journal, #93, pp. 84-91, July 1984.

Sidebar: "Fundamentals Of Curve Fitting" by Steve Graham