Mike Stimpson has a B.S. in mathematics and physics from the University of Utah. He has seven years experience programming in C and Pascal. He can be contacted at 6108 S. Longmore Dr., Salt Lake City, UT, 84118.
Integration (finding the area under a curve) is one of the fundamental mathematical operations used in science and engineering. So when scientists and engineers start using computers, one of the basic tools that they need is numerical integration, also called quadrature.
The most basic approach to finding the area under a curve, that of approximating the function with a series of straight lines, then computing the area of the resulting trapezoids results in some error wherever the function is not a straight line. In addition, the error is worse where the function has a higher curvature (i.e., where the second derivative is bigger).
Improving the situation by using more, narrower regions (usually called steps) in the approximation, however, has two problems.
First, increasing the number of steps, while it gives more accuracy where needed, also increases the accuracy where it is not needed in the places where the function has less curvature. This increases the number of function evaluations required to calculate the integral. Some of that increase is sheer waste. Depending on the function, this may be very expensive. With some functions, such as the Riemann zeta function, that are defined in terms of infinite series, it is nearly impossible.
Second, and worse, this simple approach doesn't tell how many steps are needed to give a desired level of accuracy. You can improve the accuracy of the calculation by using more steps, but that doesn't really tell how much more accurate you have made things! For serious work, this is fatal. (You can determine the error bound if you know an upper limit on the second derivative for the interval of integration, but that may be very hard to determine for some functions. For other functions, it may be infinite.)
There are some fairly elaborate schemes to get around the second problem (e.g., Romberg's method, which uses two different step lengths, and compares the answers to get an error estimate). However, there is an easier way.
A solution to both of the problems is a way to measure the inaccuracy of the approximations on a single step. This says not only how accurate the calculation was, but also where to improve the accuracy by spotting steps where the error was too large and subdividing only those steps.
This sounds like a pipe dream (to know how inaccurate your approximation is, don't you have to know the right answer?), but it really can be done. The solution is to use two different approximations in the same region, and compare the answers.
The code in Listing 1 implements an adaptive quadrature[1]. It adapts to the function being integrated, subdividing steps where the curvature of the function is higher. It uses a three-point and a five-point Simpson approximation. While both are more accurate than the trapezoidal rule, the important point is that, for a small interval, the five-point approximation has at most one-sixteenth the error of the three-point approximation. (Small in this context is defined by the scale on which the fourth derivative of the function changes. Error in this context is really the absolute value of the error.) This makes the difference in the two approximations at least 15/16ths of the error in the three-point approximation. But the five-point approximation has at most one-sixteenth the error of the three-point approximation, so the error in the five-point approximation is at most the difference between the two approximations, divided by 15. That is, if V is the true value of the integral, V3 is the value produced by the three-point approximation, V5 is the value produced by the five-point approximation, E3 is the (absolute value of the) error produced by the three-point approximation, and E5 is the error produced by the five-point approximation, then
E3 = |V - V3| E5 = |V - V5| E5 E3 / 16so
|E3 - E5| >= E3 - E3/16or
|E3 - E5| >= E3 * (15/16)But if
E5 E3 / 16then
E3 >= E5 * 16so
|E3 - E5| >= E3 * (15/16) >= (E5*16) * (15/16)or
|E3 - E5| >= ES * 15That is,
E5 |E3 - E5| / 15This provides a way to place an upper limit on the error of an approximation to an integral over a step. If the error is too large on a step, you can cut that step in half to get a better approximation, without wasting effort dividing steps where the error is acceptable.The function adaptive (Listing 1) is called with a left endpoint, a right endpoint, a pointer to the function to be integrated, and a pointer to the allowable error. This is a pointer so that the function can return the value of the integral (as the function return value) and also return a limit on the actual error (through the pointer).
The function computes the two approximations and forms an error limit. If the error is acceptable, it writes the error to the error pointer and returns the value of the 5-point approximation (since that approximation is more accurate). If the error is too large, the function cuts the interval in half, calls itself twice (once with each sub-interval, and both times with half the original allowable error). It then adds the returned errors for the two sub-intervals, writes the error sum to the error pointer, adds the returned integrals of the sub-intervals, and returns that value.
Note that the left endpoint, right endpoint, and center point are being evaluated twice once when the interval is evaluated; and once in the recursive calls, when the half-intervals are evaluated. At the expense of some clutter in the code, Listing 2 fixes this. The function that does all the work is called adaptive2. It passes the evaluated values to itself, saving the repeated evaluations. The adaptive function is really a convenience, presenting a cleaner interface to the caller.
Note that this (or any other) numerical integration algorithm fails if the function being integrated is infinite at any point of the region of integration.
Reference
1. Ralston, A., and Rabinowitz, P. "A First Course in Numerical Analysis," McGraw Hill, 1978, section 4.11.