June 2000/A Reusable Nonlinear System Solver, Part 1

Algorithms

A Reusable Nonlinear System Solver, Part 1

Michael L. Perry

C++ lets you encapsulate the Newton-Raphson method for solving equations without obscuring the underlying physical problem to be solved.

Modeling a physical system is difficult. Typically, an engineer gives the developer equations that relate some attributes of the system to others.

"So you want the program to calculate x; this equation says I need y and z. How do I calculate them?"

"Here are two more equations."

"But these equation both depend upon x. How am I supposed to find it?"

"Use the first equation. Oh, and this only refers to one object. The system will contain several that interact."

"How do they interact?"

"That's defined at run-time. Good luck."

Too often, the developer first tries a simple iterative attack, in which the system is evaluated at an initial guess, and the results imply the next guess. The assumption is that such methods converge on a solution. On test data they sometimes do — provided that the initial guess is close enough. However, when this method is applied to real data, it often fails, leading to frustrated customers and long debugging sessions.

Instead of taking the simple approach and hoping for a solution, I prefer to do a little research and a little math to find a proven solution. Research, in this case, leads to Numerical Recipes in C, Second Edition, by Press et al [1]. In Part 1 of this two-part series on mathematical modeling, I will discuss the Newton-Raphson method for solving systems of nonlinear equations. The book explains the math in great detail, so I give only an overview here. I focus on wrapping the algorithm in a set of C++ classes that can be reused across problem domains.

Requirements for a Solvable System

As Press explains, not every set of nonlinear equations has a solution. You must carefully examine the mathematical model to determine whether a solution exists and is easily found. A full discussion of proving solvability is beyond the scope of this article, but you can apply a few criteria to be relatively sure that a solution exists.

The first test you should perform is to determine if there is a practical reason that a solution should exist. This is easy, because the customer usually wants the solution to some physical system. If the customer didn't reasonably expect a solution to exist, she probably wouldn't have asked us to find one in the first place. If we are modeling a hydraulic pipeline system, for example, we know that fluids will flow under normal conditions. If, however, the physical system we are modeling has some slack or ambiguity, the equations are not likely to solve.

The second test for solvability is to look for discontinuities. Look at the equations for conditions that would cause a denominator to go to zero, or would cause you to take the log or square root of a negative number. Make sure that valid inputs stay far away from these places. You may even wish to go so far as to plot the equations, to look for either sharp changes in slope, or oscillations. The most solvable equations are those that always increase or decrease, never change direction or flatten out.

The third test is to make sure there are exactly as many unknowns as equations. An unknown is a quantity that reacts to other quantities in the system. All of the quantities that you wish to find are unknowns. In addition, all intermediate quantities that change with the system are unknowns. Values that remain fixed are not unknowns; they are parameters. Their values are given before the solver even starts its work.

The Newton-Raphson Method

If a system of nonlinear equations passes these three tests, it is likely that Newton-Raphson will find a solution. You must first transform each equation so that all that remains on the right-hand side of the equals sign is zero. The easiest way to do this is to simply subtract the right-hand side from both sides.

Newton-Raphson represents the set of unknowns with a vector (a single column of a matrix) called x. It similarly represents the set of functions evaluated at x with another vector y. To approximate the non- linear system, the algorithm chooses a linear system that has the same slope at the initial guess. Taking the partial derivatives of each function with respect to each unknown, we can build a matrix that satisfies this condition. This is called the Jacobian matrix, J. If you deviate from the initial guess (x) by a small vector (Dx), the nonlinear system (y) will respond in exactly the same way as the following linear system:

JDx = y

Figure 1 illustrates this situation for a system of just one equation. Since the goal is to find x such that all elements of y are zero, Newton-Raphson chooses Dy = -y as the initial guess. It then solves the linear system:

JDx = -y

for Dx, and from there calculates the new guess for x, which is old x + Dx.

This solution to the linear system is not the solution to the nonlinear system, but it is probably closer. Newton-Raphson repeatedly finds the next guess in this manner, and thus iterates to a solution.

Line Searching and LU Decomposition

Even if Dx does not get us sufficiently closer to the solution, it always starts out by getting better. Remember that the linear system behaves just like the nonlinear system for a small deviation. Therefore, even if the full step defined by Dx does not get us closer, we can be sure that it at least points in the right direction.

To take advantage of this fact, Newton-Raphson performs a line search. It first tries the full step. Failing that, it tries smaller fractions of Dx. When it locates a sufficiently better guess, the algorithm stops line searching and uses the new guess in the next iteration. The method used to determine each fraction of Dx to try is well covered in Press, page 384.

To quickly solve the linear system, the engine uses LU decomposition. This technique replaces one square matrix with two triangular matrices. (A triangular matrix is a matrix that contains all zeroes either above or below the diagonal.)

It is easy to solve the two systems of equations that result:

Because the matrices are triangular, each successive row adds only one coefficient. Therefore, using the values you've already found, you can easily calculate the next. A fast method for decomposing a matrix into its lower and upper triangular components is found in Crout's algorithm. For the details, please see Press, page 44.

The Solver Engine

The engine uses Newton-Raphson to solve an application-defined system of equations and unknowns. The application packages the equations and unknowns in a way that the solver engine understands. Several methods of packaging could be considered, but the right one will make the solver engine more reusable.

One option is to store the values of all unknowns in a vector and pass it as a parameter into a function. The function evaluates all of the equations, and returns their values through another vector. The application provides a function pointer, as in the following code fragment:
bool Solve( CVector (*pFunction)( CVector x ), CVector &x );
CVector EvaluateSystem( CVector x );

void main()
{
   CVector x(nNumUnknowns);
   Solve( &EvaluateSystem, x );
}
This method has three significant problems. First, the identity of the unknowns is lost. The evaluation function must extract the values of the unknowns from a vector by index. Second, context information is lost. Because the evaluation function is passed using a function pointer, extra information such as parameters must be stored in global variables. Third, the function can provide no extra information about the system. For example, it is difficult for a function pointer to identify the unknowns upon which each equation depends.

Instead of packaging the equations and unknowns in this manner, the solver engine packages them as interfaces. The application implements these interfaces with objects and passes pointers to them into the solver. This method lets an unknown retain its identity, since the application provides a pointer to its own data. This method also lets the system retain context information because objects are behind the interfaces, not just functions. Finally, this method lets the system provide more information, because the interfaces can expose more than one member function.

The declarations of the two interfaces appear below:
class IEqUnknown
{
public:
   virtual double GetValue() = 0;
   virtual void SetValue(double dValue) = 0;
};

class IEqEquation
{
public:
   virtual double CalculateY() = 0;
   virtual bool DependsUpon(IEqUnknown *pUnknown) = 0;
};
The application implements these interfaces and adds unknowns and equations to the system. It then calls the Solve member function, which applies Newton-Raphson to the provided system to achieve a solution. If a solution is found, the solver leaves the values of the unknowns in the provided objects.

To solve the system of nonlinear equations, the engine must first solve a system of linear equations (as the Newton-Raphson method states). Three components — the matrix, vector, and LU decomposition classes — together represent a linear system of equations. The matrix is a square container of doubles, while the vector is a vertical column of doubles.

The LU decomposition class represents the decomposition of a matrix. Given one vector y, it finds the vector x that solves the linear system Mx = y.

Optimizing the Solver

The matrix and vector classes make use of operators to improve code readability. However, the heavy use of operators means that these matrix and vector objects are often returned from functions. Each time this occurs C++ calls the copy constructor. If the result is stored in a local variable, this process is also accompanied by a call to the assignment operator. For a large matrix or vector, frequent copies can seriously hurt performance. To improve performance without affecting readability, the matrix and vector classes use copy on write, or lazy copy.

The actual data for either a matrix or vector is stored in a separate class. The data class keeps a reference count so that many objects can use the same copy of data. When an object stores the pointer to data, it increments its reference count. When it no longer needs the pointer, it decrements the reference count. Before an object makes a change to the data, it checks the reference count to be sure it is not shared. If it is, the object makes a copy first. Using these rules, the temporary objects created by C++ to accommodate the operators exchange a data pointer, instead of copying a large array of doubles.

The LU decomposition object stores both the lower and the upper triangular matrices in one square matrix. This is possible because the diagonal of one of the matrices is known to contain only ones. The LU decomposition object also stores the row permutation used to pivot the original matrix. It calculates these members in its constructor when it decomposes a matrix, and uses them to solve the system for a specific result vector.

A Typical Problem

To demonstrate the use of the solver engine, I present the solution to the triangular lattice of springs shown in Figure 2. The system, composed of springs connected to cords anchored at fixed points, follows a few simple rules. The change in the length of a spring is proportional to the force applied to it. The length of a cord is constant. The forces at opposite sides of the spring are exactly equal in magnitude and opposite in direction. The force on a spring or a cord pulls along its length. The entire lattice rests on a tabletop, so motion is permitted only in two dimensions and gravity is not a factor.

The following equations capture these rules:

1) p1-a1 = (F1-F3)/|F1-F3| L1
2) p2-a2 = (F2-F1)/|F2-F1| L2
3) p3-a3 = (F3-F2)/|F3-F2| L3
4) p2-p1 = F1/|F1| (R1 + k1|F1|)
5) p3-p2 = F2/|F2| (R2 + k2|F2|)
6) p1-p3 = F3/|F3| (R3 + k3|F3|)

Equations 1), 2), and 3) represent the three cords, and equations 4), 5), and 6) represent the three springs.

The first criterion that the system must meet is that we can reasonably expect the physical system that it models to have a unique solution. Provided that the cords are taut, the physical system will quickly reach equilibrium. If, however, the vertices are moved too close together or the cords are too long, the system will have some slack, and it will not reach a unique solution.

The second criterion is that the unknowns stay far away from any discontinuities in the equations. Close examination of the equations reveals that division by zero occurs when the magnitude of a force is zero. Again, this indicates that a system with slack will not solve.

The third criterion is that the system have as many equations as unknowns. In the above six equations, the three positions (p1, p2, p3) and three forces (F1, F2, F3) are unknown. Since each force or position is a two-dimensional vector, the system has twelve unknowns. Similarly, each of the above equations relates two-dimensional vectors, so the system actually has twelve equations. The solver engine can easily find the positions and forces of the springs in this system, provided that the parameters are sufficient to remove all slack.

A word of caution, you may be tempted to multiply both sides of each equation by the magnitude of the force, thus eliminating the possibility of divide by zero. Notice, however, that by doing so, you create a new solution for each equation, namely where the force is zero. In trying to eliminate the singularity, you've actually turned it into a solution. Depending on the initial guess, the solver engine is just as likely to find this false solution as it is to find the true one.

Representing the Problem in Code

One composite object represents the entire lattice. Its members include three spring objects, and three cord objects. Examining the equations above reveals the unknowns that any single object relates. A spring object connects two points and one force, as represented by p1, p2, and F1 in Equation 4. A cord object connects one point and two forces, as represented by p1, F1, and F3 in Equation 1. Since the objects share the positions and forces, these unknowns are also members of the lattice object. The object instance diagram appears in Figure 3.

The CProbVector class represents the two-dimensional vectors used for positions and forces. Please don't confuse this vector with the multidimensional vector used in the solver engine. A problem vector represents two unknowns, x and y. The CProbVector class adds two unknowns to the system, so it implements the IEqUnknown interface in two different ways; it must inherit the IEqUnknown base class twice. According to the C++ specifications, a class cannot appear more than once in the direct base list. Therefore CProbVector must inherit indirectly from IEqUnknown. It does this through a pair of intermediate classes, IEqUnknownX and IEqUnknownY.

CProbVector cannot unambiguously implement GetValue and SetValue directly. If it tried to, the compiler would attribute the same implementations to both IEqUnknownX and IEqUnknownY. To disambiguate these methods, the intermediate interface classes inherit IEqUnknown and redirect its methods to new pure virtual functions. Figure 4 shows the class diagram for CProbVector and related interfaces.

The declarations of the intermediate interface classes and the CProbVector class appear in Listing 1. CProbVector passes a pointer to each of these interfaces into the system in its AddToSystem method, shown in Listing 2.

The CProbCord and CProbSpring classes add two equations each to the system. They use the same technique as CProbVector to implement two interpretations of the IEqEquation interface. To pull the entire system together, the lattice gives references of the points and forces to the spring and cord objects through their constructors. This way, all equations refer to the same unknowns; identity is preserved. The main function simply creates a solver engine and the sample lattice illustrated in Figure 5. It tells the lattice to add itself to the solver, then tells the solver to go. If a solution is found, it tells the lattice to print it out.

The sample code includes four sets of initial guesses, each further from the solution than the one before. When testing a new nonlinear model, it is best to generate a problem with a known solution, then try progressively worse initial guesses. This first lets you confirm that the equations are calculated properly, then lets you get a feel for how close the initial guess must be for the system to solve. For most systems, you will find that this solver engine works well even when the initial guess is bad.

Conclusion

Most developers, when given a system like the lattice to solve, write code specific to the problem and tweak it until it works. Instead of working the problem, I worked a solution. I created a general solver engine that can be reused across problem domains.

The solver engine is reusable because it is based on sound mathematics, and because it uses interfaces to represent equations and unknowns. The problem object, however, is specific to a triangular lattice. It is presented here only to demonstrate the use of the solver engine. In Part 2, I will present a more general pattern that you can apply to almost any network of objects. Not only can it be used to model a network of cords and springs, but also electrical circuits, hydraulic pipeline systems, and traffic simulations. No matter which problem domain you model with the network pattern, it uses the same general solver engine presented here.

Reference

[1] William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery. Numerical Recipes in C, Second Edition (Cambridge University Press, 1993). ISBN 0521431085.

Michael L. Perry is a software developer with six years of industry experience. As the founder of Mallard Software Designs, Inc., Michael strives to make sound development practices more commonplace in the software community. For more information, please visit the Mallard website at http:\\www.mallardsoft.com.