Dr. Dobb's Journal August 1999
Graphs are commonly used to represent data and their relationships. They occur in many forms, including directory trees, flow charts, PERT charts, organization diagrams, and networks such as the World Wide Web. A graph consists of a set of nodes, which are used to represent the data items, and a set of edges, which are used to represent the relationships between items. Manually drawing a graph for a large data set, so that the graph looks good and properly accounts for the structure of the data, is an arduous process.
Automatic visualization tools generate comprehensible graphs from data, enabling users to see important relationships in the data. However, optimally placing the nodes (assigning coordinates to them, for example) and routing the edges between them (generating a line that connects the nodes) is a difficult problem. There are many criteria for judging a good drawing, including:
Each one of these criteria can be mathematically formulated as an optimization problem. However, computer science theory shows that finding an optimal solution for some of these problems is generally not feasible, and this is much more so for this combined optimization problem. Therefore, it is sufficient in practice to find good solutions instead. Finding good solutions rather than optimal ones is the objective of a whole research area called "graph drawing."
In this article, we describe a variant of a graph drawing algorithm called "spring-embedder," which models the graph as a physical system of particles that move under the influence of forces until the system reaches equilibrium. The result is a good, visually useful graph.
Spring-embedder is a force-directed layout. Its name results from mass-spring models in physics, in which a set of steel rings (nodes) is connected by springs (edges), and the system is allowed to reach equilibrium. This graph-drawing algorithm is particularly useful for undirected graphs, where the directions of the edges are not important. Typical applications of undirected graphs are the display of telecommunication network topologies and the Web. P. Eades published the first paper on force-directed layout algorithms in "A Heuristic for Graph Drawing" (Congressus Numerantium, 42, 1984). The algorithm discussed in this article was first described in Arne Frick, Andreas Ludwig, and Heiko Mehldau's "A Fast Adaptive Layout Algorithm for Undirected Graphs" (Proceedings of Graph Drawing '94, Volume 894 of Lecture Notes in Computer Science, edited by Roberto Tamassia and Ioannis Tollis, Springer-Verlag, 1995).
Eades modified the physical model with attractive spring forces between adjacent nodes by assigning each pair of nodes a virtual repulsive force. This helps guarantee that topologically near nodes are placed in the same vicinity, and far nodes are placed far from each other. The attractive force between two nodes, u and v, is defined in Figure 1, where -- uv is the distance between the nodes u and v. The term (v-u)/ -- uv is the normalized distance vector between the points u and v, meaning that it has a length of 1. c1 and c2 are constants. c1 allows us to adjust the attractive force between two nodes. c2 equals the desired edge length.
The repulsive force between u and v is defined in Figure 2, where c3 should be approximately the square of the desired edge length (that is, c22). Increasing c3 iteratively helps reduce the number of overlapping nodes.
The sum of the force vectors determines which direction the node should move. We could define a constant c4 for the step width, which would determine how far a node could move in a single step. Using a constant step width works well for small (30-50) node graphs, but performance quickly degrades as graphs grow in size. The algorithm moves nodes that are far out of place to the final position more quickly, but can also miss good positions by jumping over them, possibly leading to divergence and/or oscillations. There is no guarantee that the system reaches equilibrium at all. Because of this, we do not use a constant step width. Instead, we use the notion of temperatures.
One way to improve the spring-embedder system is to introduce temperatures that are better able to control the algorithm's performance. In their paper, "Graph Drawing by Force-Directed Placement" (Software: Practice and Experience, 21(11), 1991), T.M.J. Fruchterman and E.M. Reingold introduced a global temperature that controls the step width of node movements and the algorithm's termination. The step width is proportional to the temperature, so if the temperature is hot, the nodes move faster (that is, a larger distance c4 in each single step). The temperature is the same for all nodes, and cools down at each iteration. Once the nodes stop moving, the system terminates.
Although using a global temperature will create an overall satisfying layout, there will be deficiencies in some local areas of the graph. To improve the situation, we will remodel the system using the "adaptive temperature scheme" based on local temperatures:
I(v)=F(v)/norm(F(v))
- In each step, the node moves in the direction of the impulse by the distance of the local temperature; the new position P(v) is calculated from the old position by:
P(v)=P(v)+T(v)·I(v)
Detecting Oscillation and Rotation
The next improvement is a heuristic that detects certain undesired system behavior that prevents termination, such as never-ending cycles that do not converge to any single state. Two such phenomena are oscillations, which are back-and-forth movements of a particular node in subsequent steps, and rotations, which are continuous angular movements of portions of the graph, effectively preventing convergence.
The Frick/Ludwig/Mehldau paper presented a novel technique to detect and recover from this unstable, cyclical behavior that compared the current impulse with the previous impulse. If a node is moving in the same general direction, we assume it's moving to a stable position, and we increase the temperature to accelerate its movement. A node moving in opposite directions between iterations suggests oscillation, so we decrease the temperature.
Figure 3 shows the current and previous impulse of a node. If the angle alpha between the old and new impulse I(v) is between -45 and 45 degrees, we increase the temperature to accelerate the movement. If the angle is between 135 and 225 degrees, we assume that the node is beginning to oscillate, and we decrease the temperature. The cosine function reflects our desired behavior -- positive in the acceleration range, and negative in the oscillation range -- so the temperature equation used is:
T(v)=T(v)+c5·cos(a)·T(v),
where c5 is an oscillation constant. If you run a layout and it doesn't terminate or nodes do not reach their optimal position, oscillations could be the cause. By increasing c5, you increase the likelihood of nodes avoiding oscillations and reaching their optimal position. However, the larger c5 is, the longer layout takes.
To detect rotation, we maintain a counter for each node called S(v). If the node has moved to the right after an iteration (a is between 45 and 135 degrees), S(v) is increased; if the node has moved to the left (a is between 225 and 315 degrees), S(v) is decreased. The value of sin(a) has this desired behavior.
If the node movement is approximately straight, there are as many left turns as right turns, and S(v) will be zero. However, after many turns in the same direction, the absolute value of S(v) will be large. The temperature T(v) decreases by a value relative to the absolute value of S(v):
S(v)=S(v)+c6·sin(a)
T(v)=T(v)-c7·abs|S(v)|,
where c6 is a skew constant and c7 is a rotation constant. c6 measures how sensitive the algorithm is to detecting rotations. A larger c6 means more sensitivity. c7 affects the speed of rotation termination. A larger c7 means the temperature will decrease by a large amount, so the algorithm should converge faster.
Sometimes, the layout converges to a quasistable local optimum that is still far from the globally optimal layout. Consider a ball that is balanced on top of a pole. Even though all of the forces are in equilibrium, any tiny movement of the ball would cause the ball to fall down. In a layout, the same can happen if no nodes are moving anymore, but any tiny stimulation of movement would cause the system to converge toward a better layout. To eliminate degenerate node positions and force this better equilibrium, we add a random force, similar to a slight shake when sieving sand. The random force should be small compared to the other forces.
Frand(v)=small random vector
In some cases, an additional gravitational force towards the barycenter (the center of gravity) of the drawing helps keep several loosely connected clusters of nodes together. The barycenter is the sum of the positions of all nodes divided by the number of nodes (the average position of all nodes). The additional gravitational force is expressed in Figure 4, where c8 is a gravitational constant. In this formula, the mass of a node is defined as (1+degree(v)/3). The more edges connected to the node, the more important the node, and hence, the higher the mass.
Listing One is a Java implementation of our spring-embedder variant. The SpringEmbedder class uses constants to set values such as the maximum number of iterations, rotation, skew, gravitational, and attractive force. The values of these constants depend on experience and practice. We've defined arrays for nodes, the node temperatures, the old impulse of nodes, and the node skew values. SpringEmbedder's key methods are described in Table 1. The nodes, edges, and graph are instances of classes in the Graph Layout Toolkit (from Tom Sawyer, the company we work for).
Node is an instance of class TSDNode, which represents a node (vertex) in a graph. Each node knows about the edges that are connected to it, stored in an incoming and outgoing edge list. SpringEmbedder uses TSDNode methods setCenter, getCenter, and inEdges for calculating node positions and iterating through adjacent edges to calculate forces.
Edge is an instance of class TSDEdge, which represents an edge (link) in a graph. SpringEmbedder uses TSDEdge methods getSource and getTarget to determine what nodes are attached to an edge. Each edge connects a source node to a target node.
Graph is an instance of class TSDGraph, which manages a collection of nodes and edges. In our example code, we use getCenter to find the barycenter or the graph to calculate the gravitational force.
In each round, a node is selected, all forces are calculated to determine the new impulse, and, depending on the old impulse, the new temperature is calculated. The node is then moved in the direction of its new impulse with a step width proportional to the node temperature. The algorithm terminates when either the maximal number of rounds is exceeded, or the overall temperature of the nodes falls below a limit.
It turns out that the algorithm is more stable if the order in which nodes are selected differs in each round. Another approach would be to calculate the new positions of all the nodes first, then move the nodes by rounds. In comparing the single-step algorithm to move-by-rounds, we've found advantages and disadvantages. The single-step algorithm described here tends to give the best results in terms of stability.
The layouts produced by our algorithm have a high degree of symmetry, a homogeneous distribution of nodes, and edges of uniform lengths. However, since the edges are straight, they sometimes will cross some nodes. To correct this, we use the following approach. For each edge that crosses a node, we introduce three bend points by splitting the edge into four edges and three dummy nodes. Next, we run the same spring embedder as described before, except we only select dummy nodes for movement so the real nodes aren't moved. Because the real nodes contribute to the repulsive force of the dummy nodes, the final position of the dummy nodes will most likely not overlap real nodes. Hence, the routing of an edge over the three dummy nodes will bypass, not cross, real nodes.
Figure 5 shows a graph in different stages of the iteration. Starting from the random position in Figure 5(a), the movements based on the ordering forces unwind the graph, resulting in Figure 5(e). The last picture in the sequence, Figure 5(f), shows the final fine-tuning step that bends the six edges that cross nodes.
Graph layout is becoming more important as applications require visualization of complex data and relationships. The spring-embedded algorithm is universal and can be used to visualize any kind of undirected network. It easily exposes symmetries in the network, drawing each edge with a uniform length. For more information on graph layout, see http://www.tomsawyer.com/.
DDJ
import tomsawyer.util.*;
import tomsawyer.graph.*;
import tomsawyer.util.compatibility.*;
import java.util.*;
class SpringEmbedder
{
/** The main Iteration method doing the spring embedder.
* The nodes are stored in a node array (a vector) and have an index.
* We use three additional arrays:
* temperature[nodeIndex] is the temperature of the node T(v).
* oldImpulse.elementAt(nodeIndex) is the old impulse of the node I(v),
* which is a 2-dimensional vector of type Vector2D.
* skew[nodeIndex] is the skew value S(v) that indicates how often the
* node rotated into the same direction. */
public void mainIteration()
{
int iteration = 0;
while(( iteration < MAXITERATION) &&
(this.temperatureSum() < MAX_TEMPERATURE_SUM))
{
iteration = iteration + 1;
List li = this.createRandomNodeNumberList();
Iterator iter = li.iterator();
// catch every node once
while(iter.hasNext())
{
// get the node number in the node array and node
int nodeIndex = ((Integer) iter.next()).intValue();
TSDNode node = (TSDNode) (this.nodeArray.elementAt(nodeIndex));
// compute the Impulse of the node
Vector2D impulse = this.computeNodeImpulse(node);
this.adjustTemperatureAndSkew(nodeIndex,Vector2D.angle(
(Vector2D)this.oldImpulse.elementAt(nodeIndex),impulse));
// adjust local position
node.moveBy(
STEP_CONSTANT_C_4 *
impulse.getX() * this.temperature[nodeIndex],
STEP_CONSTANT_C_4 *
impulse.getY () * this.temperature[nodeIndex]);
((Vector2D) this.oldImpulse.elementAt(nodeIndex)).set(impulse);
}
}
}
/** This method adjust the temperature of the node according to the
* old temperature and the the old impulse */
public void adjustTemperatureAndSkew(int nodeIndex, double angle)
{
// Note that angle is a positive value between 0 and approx. 6.283
// i.e. 2 pi. Hence, 45 degree is 0.707 and -45 degree is
// 6.283 - 0.707 = 5.576
if (angle < 0.707 || angle > 5.576 ||
(angle > 2.434 && angle < 3.848))
{
// between -0.707 and 0.707 (-45 and 45 degree), there is
// acceleration, between 2.434 and 3.848 (145 and 225 degree)
// there is oscillation, described by this temperature scheme:
this.temperature[nodeIndex] =
this.temperature[nodeIndex] *
(1 + OSCI_CONSTANT_C_5 * Math.cos(angle));
}
else
{
// in the other ranges of the angle, there is rotation:
this.skew[nodeIndex] =
this.skew[nodeIndex] + SKEW_CONSTANT_C_6 * Math.sin(angle);
this.temperature[nodeIndex] = this.temperature[nodeIndex] -
ROTA_CONSTANT_C_7 * Math.abs(this.skew[nodeIndex]);
}
}
/** This method computes the normalized impulse of the node by
* the sum of all force vectors. */
public Vector2D computeNodeImpulse(TSDNode node)
{
Vector2D resultForce = new Vector2D();
// Iterate through all adjacent edges to calculate attractive forces
Iterator adjacentEdgesIter = node.inEdges().iterator();
while (adjacentEdgesIter.hasNext())
{
TSDEdge edge = (TSDEdge) adjacentEdgesIter.next();
// compute the other adjacent node
TSDNode otherNode = (TSDNode) edge.getTargetNode();
if (otherNode == node)
{
// it has to be the other node we do not allow self loops
otherNode = (TSDNode) edge.getSourceNode();
}
// add all attractive forces together
resultForce = resultForce.
add(this.attractiveForce(node, otherNode));
}
// Iterate through all other nodes to calculate repulsive forces.
Iterator nodeIterator = graph.nodes().iterator();
while (nodeIterator.hasNext())
{
TSDNode otherNode = (TSDNode) nodeIterator.next();
if (otherNode != node)
{
resultForce = resultForce.
sub(this.repulsiveForce(node, otherNode));
}
}
// Finally add the random force and the gravitaional force
resultForce = resultForce.
add(this.gravitionalForce(node)).
add(this.randomForce());
// we only need the impulse
return (this.impulse(resultForce));
}
/** This method returns the impulse vector of a force vector. */
public Vector2D impulse (Vector2D vector)
{
return (new Vector2D(vector).div(vector.norm()));
}
/** This method computes the attractive force between two nodes */
public Vector2D attractiveForce(TSDNode node1, TSDNode node2)
{
// We need the squared distance of the two nodes
double distance = node1.getCenter().distance(node2.getCenter());
// create Vectors out of the node positions
Vector2D node1Vector = new Vector2D(node1.getCenter());
Vector2D node2Vector = new Vector2D(node1.getCenter());
// (c_1 * (node1 - node2) / distance * (log( distance / c_2)))
// c_2 should be approx. the desired edge length.
return (node1Vector.sub(node2Vector).
div(distance).
mul(Math.log(distance / ATTR_CONSTANT_C_2)).
mul(ATTR_CONSTANT_C_1));
}
/** This method computes the repulsive force between two nodes */
public Vector2D repulsiveForce(TSDNode node1, TSDNode node2)
{
Vector2D resultForce = new Vector2D();
// We need the squared distance of the two nodes
double distance = node1.getCenter().distance(node2.getCenter());
if (distance != 0.0)
{
double tripleDistance = distance * distance * distance;
// create Vectors out of the nodes
Vector2D node1Vector = new Vector2D(node1.getCenter());
Vector2D node2Vector = new Vector2D(node1.getCenter());
// (node1 - node2) * c_3 / distance ^ 3
// c_3 should be approx. the square of the desired edge length.
resultForce =
node1Vector.sub(node2Vector).
mul(REPL_CONSTANT_C_3).
div(tripleDistance);
}
return(resultForce);
}
/** This method computes the random force. */
public Vector2D randomForce()
{
// We need a random impulse [-RAND_CONSTANT, ..., +RAND_CONSTANT]
// Range should be approx. [-1/4 ... +1/4] of desired edge length.
double maxRandom = RAND_CONSTANT;
return new Vector2D(
(this.random.nextDouble() * maxRandom * 2) - maxRandom,
(this.random.nextDouble() * maxRandom * 2) - maxRandom);
}
/** This method computes the gravitional force between a node and
* the barycenter of the drawing. */
public Vector2D gravitionalForce(TSDNode node)
{
// compute the mass with the node degree
double phi = 1 + node.degree() / 3;
// create vectors from position of node and center of graph
Vector2D barycenter = new Vector2D(this.graph.getCenter());
Vector2D nodeVector = new Vector2D(node.getCenter());
// GRAV_CONSTANT_C_8 * ( barycenter - nodeVector) * phi
return(barycenter.sub(nodeVector)).mul(phi).mul(GRAV_CONSTANT_C_8);
}
/** This method returns the sum of all temperatures */
public double temperatureSum()
{
double result = 0;
for( int i = 0; i < this.graph.nodes().size(); i++)
{
result = result + this.temperature[i];
}
return (result);
}
/** This method returns the list of nodes in a random sequence. */
public List createRandomNodeNumberList()
{
// ... to be filled in ... Creating good random sequences is beyond
// the scope of this paper and can be a separate paper.
return (null);
}
//------------------------------------------------------------
public static int MAXITERATION = 10000;
public static double ATTR_CONSTANT_C_1 = 1.0;
public static double ATTR_CONSTANT_C_2 = 8.0;
public static double REPL_CONSTANT_C_3 = 64.0;
public static double STEP_CONSTANT_C_4 = 1.0;
public static double OSCI_CONSTANT_C_5 = 1.0;
public static double SKEW_CONSTANT_C_6 = 1.0;
public static double ROTA_CONSTANT_C_7 = 1.0;
public static double GRAV_CONSTANT_C_8 = 1.0;
public static double RAND_CONSTANT = 2.0;
public static double MAX_TEMPERATURE_SUM = 1.0;
public static int MAX_NODES = 1000;
public Random random = new Random();
public TSDGraph graph = new TSDGraph();
Vector oldImpulse = new Vector(MAX_NODES);
Vector nodeArray = new Vector(MAX_NODES);
double temperature[] = new double[MAX_NODES];
double skew[] = new double[MAX_NODES];
}