Features


Simulated Annealing for Optimization Problems

Michael Percy


Michael Percy is a Senior Systems Analyst with Information Systems Development at Clemson University. He has been programming in C for five years. He is currently finishing his thesis for an MS in Computer Science.

Combinatoric problems are typically easy to describe, but effectively impossible to solve exactly once they involve more than a very small number of elements. For real-world problems, the potential number of solutions, any of which could be the optimal solution, is huge. In most cases, however, it is not necessary to determine the optimal solution; a near-optimal solution will suffice if it can be found quickly. This article illustrates how a near-optimal solution can be found using a statistical mechanics technique called simulated annealing.

Simulated Annealing

In the early 1980s, Kirkpatrick, Gelatt, and Vecchi [1] brought statistical mechanics techniques normally applied to condensed matter physics to bear on the problem of combinatoric optimization. Statistical mechanics is a body of methods for analyzing the overall properties of large numbers of atoms found in samples of liquid or solid matter. One of the interesting questions addressed by this field is "what happens when a sample is melted and then cooled?" Will the material solidify, and if it does, will it form a crystalline solid or a glass?

This solidification process is called annealing and involves slowly lowering the temperature and then holding the temperature near freezing for a long time, giving the atoms adequate opportunity to align themselves in low energy configurations. If the system is cooled too quickly, little or no alignment occurs, and the result may be a crystal with many defects or a glass with no crystalline structure and only locally optimal structures.

The Metropolis procedure [2] used by Kirkpatrick, et al., dates back to the early 1950s and is a simple algorithm that can provide a simulation of a collection of atoms. At each step, the position of an atom is randomly changed by a small amount. The resulting change in the energy of the system is computed. If the new energy state is not greater than that of the previous one, it is accepted, since the goal is to minimize the energy of the system. If the new configuration is at a higher energy than the previous one, there is still a probabilistic chance that it may be retained anyway. This probability, P(DE), is exp(-DE/(kB * T)), where DE is the change in the energy level, kB is Boltzmann's constant, and T is the temperature. T will always be many orders of magnitude larger than DE, and kB serves to bring the ratio of DE/T into a more reasonable range. The exact value of kB is unimportant for this discussion. Because P(DE) is close to 1.0 at high temperatures, atoms may move freely. As the temperature drops, P(DE) approaches 0.0 and the system is much more constrained.

Repeating this step many times will effectively simulate the thermal motion of the atoms. Once the atoms reach a point of equilibrium at the current temperature, the temperature is lowered and the process is repeated. Once the temperature reaches the freezing point, all the atoms should be in their optimal (lowest energy), or nearly optimal, positions.

It is the probabilistic aspect of the algorithm that allows the atoms to move to their appropriate places, even though that motion may require the energy of the system to go up slightly. Notice that very large increases in energy are almost always rejected, but small increases will likely be accepted. As the temperature decreases to the freezing point, even small increases are not likely to be accepted. If the system has come to a metastable state at each temperature, and the energy level has decreased at each point, then as the temperature reaches freezing all, or nearly all, of the atoms are in place, and any increases should be rejected. It is interesting to note that if T = 0, no increases are allowed and the algorithm is equivalent to incremental improvement.

The analogy between a physical system and an optimization problem is not complex. To use the Metropolis algorithm and simulated annealing to solve combinatoric optimization problems, replace the energy function with a cost function and the product kB * T with a control parameter. There are a few requirements that are usually easy to meet. There must be:

- a concise representation for a solution (a configuration of the system),
- a way to generate a new solution by making a small random change to an existing one, and
- a cost function that can evaluate any solution.

Additionally, a replacement for the physical "annealing schedule" must be contrived; multiplying the current temperature by 0.95 to get the next lower temperature works well enough. Finally, the program must be able to determine when the system has reached a metastable state; counting successful and unsuccessful permutations seems to work quite nicely.
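The schedule and the metastability test described above amount to very little code. The following sketch uses my own names and threshold values (ALPHA, MAX_ACCEPTS, MAX_REJECTS); they are placeholders for the control variables discussed later, not values taken from Listing 1.

```c
#define ALPHA       0.95  /* annealing schedule: next T is ALPHA * T    */
#define MAX_ACCEPTS 100   /* accepted moves allowed at one temperature  */
#define MAX_REJECTS 500   /* rejected moves allowed at one temperature  */

/* One step of the contrived annealing schedule: multiply the
   current temperature by ALPHA to get the next lower temperature. */
double next_temperature(double T)
{
    return ALPHA * T;
}

/* Treat the system as metastable at the current temperature once
   either the accept or the reject counter crosses its threshold. */
int metastable(int accepts, int rejects)
{
    return accepts >= MAX_ACCEPTS || rejects >= MAX_REJECTS;
}
```

The outer loop of the program then repeats: run Metropolis steps until metastable() fires, reset the counters, and cool with next_temperature() until no further improvements appear over several temperature decreases.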

Implementation

A classic problem drawn from operations research, called quadratic assignment or the placement of facilities problem, involves locating a set of plants at a set of sites, in a way that minimizes total transportation costs, given delivery requirements from plant to plant and unit transportation costs between sites. There is a canonical data set for this problem found in Nugent, et al. [3], with sizes from five plants to thirty plants. A number of heuristics (see accompanying sidebar) have been applied to this data, and for the smallest problems exact solutions are known.

To apply simulated annealing to this problem, create an array that represents the n sites (numbered 0 through n-1). The n plants are also numbered 0 through n-1, and you can represent a solution by placing the numbers 0 through n-1 in the array. That is, sites[i] == p implies that plant p is located at site i. Note that any permutation is a valid solution, so you can generate a new solution from an existing one by swapping two plants at random. Once a solution is generated, compute the cost of all deliveries that must be made between the various plants (at unit costs that depend on the plants' sites). To determine whether the system is metastable, keep track of how many solutions are accepted and how many are rejected. If either number goes above a given threshold, consider the system metastable. Once no more improvements are made over a few decreases in temperature, the system is frozen.
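The representation, the random swap, and the cost function can be sketched for a toy instance. The flow and dist matrices below are made-up three-plant data for illustration, not the Nugent data set, and the function names are mine rather than those in Listing 1.

```c
#include <stdlib.h>

#define N 3

/* Delivery requirements between plants (flow[p][q] is the amount
   shipped between plants p and q). Illustrative data only. */
static const int flow[N][N] = {
    {0, 2, 1},
    {2, 0, 3},
    {1, 3, 0}
};

/* Unit transportation costs between sites. Illustrative data only. */
static const int dist[N][N] = {
    {0, 1, 2},
    {1, 0, 1},
    {2, 1, 0}
};

/* Total transportation cost of a placement, where sites[i] == p
   means plant p is located at site i: for each pair of sites, the
   flow between the plants placed there times the unit cost. */
int cost(const int sites[N])
{
    int i, j, total = 0;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            total += flow[sites[i]][sites[j]] * dist[i][j];
    return total;
}

/* Generate a neighboring solution by swapping two plants at random;
   the array remains a permutation, hence a valid solution. */
void random_swap(int sites[N])
{
    int i = rand() % N;
    int j = rand() % N;
    int t = sites[i];
    sites[i] = sites[j];
    sites[j] = t;
}
```

Annealing then consists of calling random_swap() on a copy of the current solution, computing the change in cost(), and feeding that change to the Metropolis acceptance test in place of DE.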

The maximum accept and reject thresholds are best left as control variables along with the temperature control variable, T, and the annealing schedule variable, a. Finding appropriate values for these parameters, as well as the initial temperature, can be tricky. If the initial temperature is too high, the system wastes time by allowing any random change to be accepted. If a is too small, the system cools too quickly and effectively behaves as an incremental improvement algorithm. If a is too close to 1, redundant changes may be attempted. The accept and reject counts cause similar behavior. If they are too high, the system spins its wheels trying redundant combinations. If they are too low, the system cools too quickly. Try different combinations to get a feel for choosing them.

Figure 1 contains the pseudo-code for using simulated annealing. I chose to retain some of the physical system nomenclature to illustrate the connections between the physical system and the combinatoric problem.

Conclusions

Listing 1 contains the actual code. Table 1 lists the original results from Nugent, et al. [3], the previous best known results, and the results from simulated annealing. Figure 2 and Figure 3 show a sample of input and output for the program. The input data set comes from [3].

While simulated annealing did not always produce excellent results, only a small number of runs were needed to equal or exceed the previous best known results. In fact, it took only two runs for the n=30 problem to achieve the result 3062. Even on runs with worse results, the solutions were only a few percent off the best known solutions.

Simulated annealing has been shown to be very effective at quickly producing optimal or near-optimal solutions for a large number of problems: location of facilities, traveling salesman, physical design of computers (chip and wire placements), graph coloring, image processing, register allocation, CPU load balancing, and many other problem areas. It is a simple technique that can be applied to optimization problems easily and with high expectations about the quality of the solutions generated.

References

[1] Kirkpatrick, S., Gelatt, C.D. Jr., and Vecchi, M.P. 1983. "Optimization by Simulated Annealing." Science 220 (May): 671-680.

[2] Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., and Teller, E. 1953. "Equation-of-State Calculations by Fast Computing Machines." Journal of Chemical Physics 21: 1087-1092.

[3] Nugent, C.E., Vollmann, T.E., and Ruml, J. 1968. "An Experimental Comparison of Techniques for the Assignment of Facilities to Locations." Operations Research 16 (January): 150-173.