Dr. Dobb's Journal, January 2006

Fractal Biology

By Dennis E. Shasha

Dennis, a professor of computer science at New York University, is the author of four puzzle books: The Puzzling Adventures of Dr. Ecco (Dover, 1998); Codes, Puzzles, and Conspiracy (Freeman 1992, reprinted by Dover in 2004 as Dr. Ecco: Mathematical Detective); and recently Dr. Ecco's Cyberpuzzles (W.W. Norton, 2002); and Puzzling Adventures (W.W. Norton, 2005). With Philippe Bonnet, he has written Database Tuning: Principles, Experiments, and Troubleshooting Techniques (2002, Morgan Kaufmann). With Cathy Lazere, he wrote Out of Their Minds: The Lives and Discoveries of 15 Great Computer Scientists (1995, Copernicus/Springer). He can be contacted at DrEcco@ddj.com.

Dr. Ecco Solution

Our visitor that Saturday morning, an attractive woman with dark hair and a mischievous gleam in her eyes, introduced herself as Gloria Dopsis.

"I'm a biologist," she said. "I used to study genes, but now I study genomes, proteomes, interactomes, basically anything omic; that is, anything having to do with entire species. It's a perspective that might hurt many delicate human egos.

"For example, we have roughly the same number of genes as mice and most are very similar in form and function. We think of ourselves as a higher form of life, but the genome itself doesn't bear that out. The difference must be in the interactions, the networks of binding and repulsion that give rise to the unique capabilities of each species.

"Lately, I've been trying to figure out why protein interaction networks have the topologies they do. You see, most proteins have very few connections and very few have many connections. We call those the hubs. This gives rise to "scale-free" networks reminiscent of Mandelbrot's fractals or, in fact, the Internet.

"One theory is that scale-free networks have better failure properties. The thinking goes like this: Because mutations strike randomly at the genome, most mutations will wound proteins that interact with very few other proteins, thus are presumably less important. Fatal mutations of hub proteins can occur, too, but they are rare because hubs are rare.

"I'm here to ask your help to determine how many interaction links are needed to achieve high fault tolerance while ensuring that no unwounded protein is more than two interaction links away from another one."

Liane had slipped in shortly after Professor Dopsis arrived and had followed the discussion closely.

"Professor, could we try a little warm up?" she asked. "If there were just four protein nodes, then a simple square of interactions in Figure 1 would achieve this goal. Removing any node allows any remaining pair of nodes to communicate over at most two links."

"Well done," Dopsis replied. "Here's a second warm up. Can you achieve this distance two condition for six nodes even if one node is wounded (call that the "wounded distance two condition"), using 10 links or fewer assuming no node can have more than three interaction links?"

Liane came up with a nine-link solution to this warm up; see Figure 2.

"You're good," Dopsis said with her playful smile. "Here are more questions for you:

"1. Can you achieve the wounded distance two condition for eight nodes using 16 links, assuming no node can have more than four interaction links?
"2. What is the fewest number of links needed to achieve the wounded distance two condition for 12 nodes and at most five interaction links for any node?
"3. What is the fewest number of links needed for 12 nodes, but without any limit on the number of interaction links any particular node can have?
"4. We have a particular pathway having 108 proteins. We don't yet know all the interactions. What is the fewest number of links needed to achieve the wounded distance two condition for any pair among these 108 nodes if there is a limit of 60 interactions that any single node can have?"

Ecco and Liane solved these problems that same morning. Professor Dopsis, in the meantime, told me of her work in the Amazon and how much she loved adventure. After hearing the answers, she left.

"She might come back," Ecco said. "The network problems here fit a general formulation that she will eventually face: Can you find the minimum number of links necessary for N nodes, maximum distance D between any pair of unwounded nodes, maximum degree (number of interactions per node) K, and up to X wounded nodes?"

Ecco made no attempt to solve the problem. Would you like to give it a try?

Further reading for the inspirations of this puzzle include "Regulation of Metabolic Networks: Understanding Metabolic Complexity in the Systems Biology Era," by Lee Sweetlove and Alisdair Fernie (New Phytologist 2005, 168: 9-24) and "Evolutionary Capacitance as a General Feature of Complex Gene Networks," by Aviv Bergman and Mark Siegal (Nature, July 31, 424(6948):549-52).

DDJ