Over the last few years, we have been migrating a small library of numerical code originally written in Fortran, first to C, and then to C++. Like most people who undertake this journey, at first we didn't take full advantage of C++'s features. Although we wrote the occasional class, the overall style would have been familiar to a Fortran 77 programmer.
Recently, we have written differential equation integrators that truly demonstrate C++'s benefits in numerical computing. In this article, we present C++ classes to organize and protect data. The code uses inheritance to group similar data structures and procedures into families that share some elements and that are, in appropriate contexts, interchangeable. It also uses templates to express type-independent algorithms, minimizing the number of special cases to be independently coded. The result is tight, readable code that is easy to maintain and extend.
A differential equation is a relationship between a variable and its derivatives. The very simplest differential equations (first-order ordinary differential equations) tell you how fast one thing changes when another is changed:
dx/dt=f(x,t)
Here, the dependent variable x might be the mass of algae in a lake, a vector of concentrations of chemicals in a beaker, or a matrix of particle positions and velocities arising from Newton's laws of motion. The independent variable t might be time in the aforementioned applications, but in other contexts it might be a position, a voltage, or indeed any other scalar quantity with respect to which it makes sense to measure a change in x. The integrators we describe in this article solve what is known as the "initial-value problem," which is to find a function x(t) that both satisfies the differential equation and passes through a given initial value of x. We only deal with first-order ordinary differential equations (equations involving only the ordinary first derivative) since other kinds of differential equations are commonly reduced to this case for numerical solution.
The solutions of most differential equations cannot be expressed as simple, algebraic formulas. The ability to obtain approximate numerical solutions is therefore quite important. Routines that carry out the appropriate numerical algorithms are called "integrators." A scientists' toolbox must contain a number of different integrators since no single numerical method is appropriate to all differential equations. Inheritance from a well-designed base class makes it possible to write a suite of integrators with a common interface. Furthermore, because x can be a simple variable or a container with arithmetic operators (a valarray or matrix, for example), the integration algorithms are best expressed using templates.
A differential equation defines an evolution of continuously varying quantities. Unfortunately, digital computers operate on data in discrete steps. Therefore, algorithms for integrating differential equations must increment the time (or equivalent independent variable) in a series of small steps. In other words, the continuous variable t is replaced by a sequence of discrete values t0, t1, t2, and so on. The difference between any two subsequent values of t is the step size h. In general, the smaller the step size, the more accurate the numerical solution. For each ti, there is a corresponding value of the dependent variable xi. In the initial value problem, you are given an initial point (x 0,t0).
Listing 1 defines an Integrator base class. This base class describes the common features of differential equation integrators. Following common practice in physics, the current value of the dependent variable (x in the aforementioned differential equation) is called statevector. Since the statevector could be either a single value or a set of values stored in a container, we use a template parameter C to avoid forcing a particular representation for statevector and related quantities. Although the independent variable t (and the step size h) generally are stored as a double, in some applications long double or float types might be appropriate, so we used a template parameter T for the type of the independent variable as well.
Users must supply a function that returns the value of f(x,t). The base class stores a pointer (return_derivs) to this function. return_derivs's arguments are values of the dependent and independent variables and a reference to an object in which the value of f(x,t) is stored. It might be more natural if return_derivs actually returned the value of f, but sometimes in numerical code aesthetics must be sacrificed for efficiency. In this case, the assignment of a returned object would require at least one copy operation per integration step. If statevector is a container, copying the contents can be expensive. The alternative method we use here lets return_derivs write elements directly into a preallocated container. This means that the container that holds the derivatives' values must be allocated (sized and, in the case of a matrix, shaped) before the first call to return_derivs.
Each class derived from Integrator supplies a function (step), which performs a single integration step. Making step a pure virtual function of the base class creates a common mechanism by which programs call the derived integrators. Similarly, the class provides the virtual function set_step_size to reset the value of h. A trivial definition of set_step_size is appropriate for some integrators, but not for others.
The base class also provides a pair of public utility functions, current_time and get_state, for reading out the state of the system. These functions and the constructor have straightforward implementations.
Euler's method, the simplest integrator in our suite, assumes that the rate of change of x is constant over the interval from ti to ti+1. As a result of this simple assumption, Euler's method tends to require small step sizes to give reasonable accuracy. Nevertheless, there are applications in which more sophisticated methods are less efficient. In Euler's method, the predicted value of xi+1 is calculated directly from the following formula:
xi+1=xi+hf(xi,ti)
The Euler derived class is easy to implement. Listing 2 shows both the class definition and the implementation of the Euler integrator. The class uses a private object (dxdt) to store the value of f(x,t). This object must be the same size and shape as statevector. Purely as an optimization, we keep dxdt around as a member object even though its value isn't "state" to be stored from step to step. This avoids the overhead of creating and destroying a temporary dxdt in every Euler::step, which can significantly degrade the performance of a program since statevector will often be an expensive-to-create container such as a valarray or matrix.
The inclusion of dxdt in class Euler creates a small problem: How do you construct this object? You can't default construct it because it must be the right size and shape to receive derivative information in Euler::step. However, C can represent different types. For instance, C could be double, valarray<complex<double>>, matrix<float> [1], or just about any other class with arithmetic operators. A double has no dimensions; a valarray has a single dimension, while a matrix has at least two. Short of requiring users to provide a shape transfer function between two objects of class C, we see just one solution to this problem copy construction of dxdt from x0 necessarily gives dxdt the size and shape of statevector. The particular values contained in x0 are not relevant, but since construction of Euler objects occurs much less frequently than invocation of Euler::step, the unnecessary copy operations have essentially no effect on performance.
Class Euler introduces an extra template parameter (B) not present in the Integrator base class template. If C is a container class, B represents the type stored in the container. If C is not a container, B and C are identical. B is used in Euler::step to cast h, which is of type T, to type B. Listing 3 shows an application in which this cast is necessary. The complex-valued equations given in this example arise in quantum mechanics [2]. The problem is that the valarray<B> template class only defines arithmetic operations with scalars of type B and sometimes you want mixed calculations, specifically in:
statevector+=B(h)*dxdt;
Since template classes require exact matching of argument types, operations between types T and valarray<B> are not defined if T and B are different. In this example, B is complex<double> and T is double. The compiler will not automatically convert h to complex<double>, making the explicit cast necessary.
While in principle it is always possible to use Euler's method; in practice, there are some problems for which it fails unless the step size is extremely small, making the run time exceedingly long. A different integration method is then needed. This section describes the implementation of a group of integration methods known collectively as "Gear methods" [3]. All the Gear methods are essentially the same, except for a single equation, which appears in an inner loop. For example, the rule for the second-order Gear method is:
xi+1=(4xi-xi-1+2hf(xi+1,ti+1))/3
The implementation of an integrator based on this formula raises at least two new issues: First note that the next value of the state vector, xi+1, appears on both sides of the equation. To obtain xi+1, this equation has to be solved either by the Newton-Raphson method [4] or by simple iteration. For simplicity, our code uses the latter method, although both approaches have their advantages [5].
Additionally, we need xi-1 (the previous value of the state vector) as well as xi (the current value) to calculate xi+1. This creates a bootstrapping problem since, in the first integration step, we only have x0. Calculating x1 by the second-order Gear method would require x-1, but there is no such point. Fortunately, the first-order Gear method doesn't require any previous points. If users request a second-order Gear method, the first-order method is used to generate the first point, and the second-order method can be used thereafter. Higher-order Gear methods require more past state vectors. These methods are similarly bootstrapped by working up from the lower-order methods. Ideally, all of this would be invisible to users.
Listing 4 shows the definition of class Gear. The past states are stored in a deque called past_state. Gear::step (Listing 5), in addition to updating statevector, manages the deque as well as a housekeeping variable (curr_order), which helps keep track of the Gear method to be used in the next integration step.
The Gear methods all expect that the previously stored state vectors are equally spaced in time. This potentially creates a data-coherency problem if the user wants to change the step size. This problem, which is common to many integration methods, is why we included the virtual function set_step_size in the base class (Listing 1). Gear::set_step_size (Listing 5) overrides Integrator::set_step_size to guarantee that bootstrapping issues are automatically addressed. Users don't have to worry about the details of the bootstrapping procedure because the constructor, Gear::step and Gear::set_step_size all work together to ensure that the integrator's internal data will remain in a consistent state.
The functions Gear1xnext, Gear2xnext, Gear3xnext, and Gear4xnext implement the Gear formulas of orders one through four. These functions are accessed in Gear::step through an array of pointers (calc_xnext) defined in the class template (Listing 4) with the help of a typedef. Although the syntax is not particularly intuitive, this technique has several advantages over some of the alternatives. In our original design of this class, there were separate step functions for each order of Gear's method. Most of the code in each of these functions was identical except for the equation iterated. Using a single step function that selects among the Gear formulas greatly reduced the amount of code, reducing maintenance headaches. Because the functions accessed through the calc_xnext array are member functions, they can access class data members directly, saving some copy operations. Finally, the use of an array of pointers to class member functions eliminates the need for an awkward switch or if...else statement in our integrator.
Listing 5 shows the code for two of the Gear formulas (functions Gear1xnext and Gear2xnext). These functions were hand-optimized to avoid the creation of temporary objects. The resulting code is not as readable as it might be, a necessary concession to efficiency. For instance, the obvious implementation of the first-order Gear formula:
xnext = statevector + B(h)*dxdt;
has the meaning:
xnext = operator+(statevector, operator*(B(h),dxdt));
which creates two temporary objects of class C one to hold the result of the multiplication, and one to hold the result of the addition. Our hand-optimized code for the first-order Gear formula (function Gear1xnext) requires no temporaries and runs about twice as fast as the obvious implementation in some of our tests [6]. It is not possible to completely avoid temporary objects while evaluating the higher-order Gear formulas (such as the second-order formula contained in Gear2xnext), but speedups by factors of two to three are still obtained by minimizing the number of temporary objects created.
Class Gear requires users to provide a function (pointed to by the class member approx_equal), which performs a comparison between elements of type C. This function determines the convergence of the iterative process used to solve the Gear equation. Convergence criteria are problem specific, so it is not desirable to build a universal comparison method into the integrator. Listing 6 gives an example of a convergence test corresponding to the differential equations in Listing 3. If the iterations do not converge, Gear::step (Listing 5) throws an exception. The calling program may then call Gear::set_step_size with a smaller value of h.
The user-supplied functions return_derivs and approx_equal are crucial to the performance of the program since they can be called millions of times during the solution of a problem. Their execution can therefore dominate the program's run time. Accordingly, they must be written with considerable care. For instance, it is sometimes preferable to operate on valarrays element-by-element (as we do in the convergence testing function in Listing 6) if the alternative requires the creation and copying of temporary valarray objects.
From the user's point of view, the integrators mostly differ in their constructors. Since they are all derived from a common base class, a program can select an integrator at run time, as in Listing 7. The rest of the program's code does not depend on the method selected. Dynamic binding of the virtual functions lets dSchrodinger->step() be called polymorphically.
Generic programming with templates lets users choose the most appropriate representation for their data. In addition to leading to more transparent code, this enables optimizations that might not otherwise be possible. For instance, traditional differential equation integrators are written as functions that force the dependent variable to be stored in an array. If the problem to be solved consists of a single differential equation in a single scalar variable (a one-element array, for instance), array-oriented code would waste some processor time managing unnecessary loop counters. In contrast, our code lets class C be anything from a simple double to a container class. Since the compiler sees the types, it can produce efficient code for any given case.
The integrators presented here demonstrate that C++'s features make it highly suitable for writing numerical code. We can imagine many extensions to our integrator suite. For one thing, none of our current integrator classes perform any error control. Our design makes it easy to add this functionality by deriving classes from Euler or Gear. Because these new classes would be derived from our Integrator base class, albeit indirectly, using them would require only trivial changes to programs that use our current integrators.
During the development of this code, we experimented with a number of variations on the themes presented here. At every step of the process, we considered three factors: efficiency, source code readability, and ease-of-use of the resulting libraries. We might tolerate compromises in the latter two areas if significant performance gains result, but we would be reluctant to make execution time our only criterion for judging a design. CPU time is a precious resource, programmer time more so. Writing code with acceptable performance while meeting our other design goals takes a good deal of experimentation. The effort is repaid when applications can be rapidly built from reliable, efficient software components.
[1] The C++ Standard Library does not include matrix templates. However, the Matrix Template Library (http://www.lsc.nd.edu/research/mtl/) provides a variety of matrix types.
[2] The differential equations in Listing 3 are a crude spatial discretization of the time-dependent Schrödinger equation for a particle in a square well. Each component of the valarray psi represents the value of the wave function at a particular point in space.
[3] Gear methods are particularly appropriate to differential equations that describe systems in which processes occur on vastly different time scales. Problems of this sort are common in chemistry. The sample code (available at http://www.cuj.com/code/) includes a classical model of an oscillating chemical reaction called the "Oregonator," which has this property.
[4] Michael L. Perry. "A Reusable Nonlinear System Solver, Part 1," C/C++ Users Journal, June 2000; "Part 2," August 2000.
[5] Practical Numerical Algorithms for Chaotic Systems, T.S. Parker and L.O. Chua. Springer-Verlag, 1989.
[6] A similar hand optimization of the Euler method (not shown) eliminates the creation of one temporary object, significantly improving the performance of this integrator.