Automatic Programming

Automatic programming relies on middleware to make decisions for programmers. One of the most difficult decisions concerns how to distribute software across machines in a network. Once instrumentation has determined how much communication goes on, the key to successful distribution is to partition the code in a way that disrupts only the least important links. This is the mathematical problem known as finding the least cut set in a graph, which unfortunately is NP complete for more than two machines. However, we've found a set of heuristics that, in practice, find an optimal solution in close to linear time.

Suppose we have a graph that looks like Figure 2, where nodes represent machines and edges represent communication links. The edges are weighted to reflect the amount of communication between the nodes: The higher the number, the more the edges communicate. Any cut that involves cut a can be improved by using cut b, because the sum of the weights cut by b is less than the sum of the weights on a. In other words, the two nodes connected by the edge of weight 100 will end up on the same machine in an ideal cut; otherwise, the cut wouldn't be ideal and could be improved. If we collapse that edge before we start cutting, any solution to the problem with the edge collapsed yields an optimal solution to the original problem. And the new problem is smaller. Finding and collapsing these dominant edges can be done very quickly. Once you've collapsed one dominant edge, you may create another, which can in turn be collapsed. Often this simple idea can halve the size of a graph.

-- B.B., J.R., J.V., and M.W.

Back to Article