Oh, Behave

Web Tools Tame Molecular Behavior, Make Advances Toward Drug Discovery

Social networking tools such as Twitter and Facebook have changed what it means to tweet and to friend. Likewise, an emerging wave of Web 2.0 tools aims to make fundamental changes in what it means to model.

One such tool, called Cellucidate, helps biologists turn static network diagrams of signaling pathways into living and breathing systems. The near-term goals of the toolkit, which include using modeling and simulation to speed drug discovery and cure cancer, are lofty in and of themselves. Yet the scientists who developed it also have the heady vision that this tool will help scientists uncover general principles about cellular signaling. If such laws exist, biologists may someday pencil them in beneath the laws of motion, gravity, and thermodynamics.

The technology behind Cellucidate is the outcome of years of work led by Walter Fontana, HMS professor of systems biology. Fontana and colleagues have published most of this research in computer science journals, but their first paper about it for a broader audience appears in the April 21 Proceedings of the National Academy of Sciences.

The Numbers Game

To understand how these new tools are changing modeling, it helps to walk in the shoes of a modeler. A biologist interested in modeling cell signaling will first pull together all she knows about a pathway and build a network map. To bring this static map to life, she must translate it into a set of differential equations that reveal the dynamics of the system.

This is where the trouble starts. The modeler will quickly be swamped by the complexity of even the simplest signaling network. After factoring in genetic differences, posttranslational modifications and complex folding, “the number of possible molecular states in a cell is staggering,” said Fontana. Even a relatively limited study of epidermal growth factor (EGF) signaling, a pathway implicated in some forms of cancer, produces 71 rules and a whopping 10^19—millions of trillions—potential unique molecular species involved. To build a traditional mathematical model, “you would end up writing billions of equations.”
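A back-of-the-envelope sketch shows how such numbers arise. It assumes a purely hypothetical complex of four proteins whose modification sites each have two states and vary independently; the site counts are invented for illustration and are not taken from the EGF model.

```python
from math import prod

def count_states(sites_per_protein):
    """Count the distinct states of a complex, assuming every site
    is binary and independent of the others (illustrative only)."""
    return prod(2 ** n for n in sites_per_protein)

# Hypothetical complex: four proteins with 10, 12, 16 and 26 binary sites.
example_sites = [10, 12, 16, 26]
total = count_states(example_sites)
print(total)  # 2**64 = 18446744073709551616
```

Sixty-four two-state sites are already enough to reach the 10^19 range, and a traditional model would need one equation for each of those states.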

Sufficiently stymied, our modeler might consider simplifying the system. “But how do you know you’ve simplified it correctly?” asked Fontana. “That complexity may tell us how cancer cells elude our treatments.”

Essential Behavior

Fontana proposes an alternate approach. His method employs a computer language called Kappa. Designed in 2004 by coauthor Vincent Danos, professor of informatics at the University of Edinburgh, Kappa is “tuned to express basic interactions between proteins,” said Fontana. Instead of writing equations, the modeler uses Kappa to craft rules that represent these interactions.

The rules specify patterns that highlight only the information needed to decide whether or not to apply a rule. Kappa rules do not throw out the complexity, said Fontana. Rather, they focus only on the pattern relevant to the interaction at hand, thereby distilling dynamic processes to their essential elements.
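The idea of a rule as a partial pattern can be sketched in a few lines of Python. This is not Kappa syntax; the receptor, its site names and the rule itself are hypothetical and serve only to show how a rule can ignore every site it does not mention.

```python
def matches(agent, pattern):
    """True if every site named in the pattern has the required state."""
    return all(agent.get(site) == state for site, state in pattern.items())

def apply_rule(agent, pattern, update):
    """Return an updated copy if the pattern matches, else the agent unchanged."""
    if not matches(agent, pattern):
        return agent
    return {**agent, **update}

# Hypothetical receptor with four sites; names and states are illustrative.
receptor = {"ligand": "bound", "site_a": "u", "site_b": "p", "partner": "free"}

# Hypothetical rule: a ligand-bound receptor is phosphorylated at site_a.
# site_b and partner are not mentioned, so their states do not matter.
pattern = {"ligand": "bound", "site_a": "u"}
update = {"site_a": "p"}

result = apply_rule(receptor, pattern, update)
print(result)
```

The single rule applies to the receptor whatever the states of site_b and partner, so the modeler never has to list those combinations one by one.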

By solving the numbers problem, Kappa also adds transparency to modeling. Rather than representing the model in the abstruse terms of advanced calculus, the modeler now has a graphical map that clearly illustrates the empirical knowledge she has drawn on to build it. The benefit? “Another scientist can look at the rules, and they can discuss whether they believe them or not,” said coauthor and former HMS research fellow Jean Krivine.

Rules ready, our modeler now wants to run a simulation. But if her model is large, she will hit another wall. Kappa’s algorithms send virtual molecules bumping into one another according to the rules, rates and concentrations the modeler specifies. For larger models, the number of computational steps required to run this randomly driven simulation quickly becomes intractable. “The computational cost is too high,” said Fontana. The only way to run a large simulation is to “go back to paradise.” That is, differential equations.
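A rough feel for why the cost climbs comes from a generic stochastic simulation of a single reversible binding reaction, A + B <-> AB, written here in the style of the well-known Gillespie algorithm. This is a sketch under stated assumptions, not the simulator behind Kappa; the function name, rate constants and molecule counts are all invented for illustration.

```python
import random

def simulate_binding(n_a, n_b, k_on, k_off, t_end, seed=0):
    """Simulate A + B <-> AB one random event at a time.
    Returns the final counts and the number of events simulated."""
    rng = random.Random(seed)
    t, n_ab, steps = 0.0, 0, 0
    while True:
        bind = k_on * n_a * n_b      # propensity of a binding event
        unbind = k_off * n_ab        # propensity of an unbinding event
        total = bind + unbind
        if total == 0:
            break
        dt = rng.expovariate(total)  # random waiting time to the next event
        if t + dt > t_end:
            break
        t += dt
        if rng.random() * total < bind:
            n_a, n_b, n_ab = n_a - 1, n_b - 1, n_ab + 1
        else:
            n_a, n_b, n_ab = n_a + 1, n_b + 1, n_ab - 1
        steps += 1
    return n_a, n_b, n_ab, steps

a, b, ab, steps = simulate_binding(100, 80, k_on=0.01, k_off=0.1, t_end=5.0)
print(a, b, ab, steps)
```

Every collision is simulated as its own event, so the number of steps grows with the number of molecules and the length of simulated time, which is the cost Fontana describes.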

Caught between a supercomputer and a math text, Fontana decided to build new technology to solve this problem. He coupled the Kappa language with a method called coarse-graining, the brainchild of first author and former HMS research fellow Jerome Feret, Danos, and Krivine.

Coarse-graining takes in Kappa’s rules, performs a dependency analysis on them, and spits out compressed differential equations. Fontana’s team “coarse-grained” a subset of the EGF signaling pathway. The Kappa model included 71 rules and a calculated 18,051,984,143,555,729,567 distinct molecular species, meaning that the resulting equations potentially involve that many distinct variables. Coarse-graining filtered out the permutations that have no bearing on the dynamics and concluded that only 175,988 variables mattered, a reduction of about 14 orders of magnitude. The automated compression took only 15 minutes to run.
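The intuition behind such a compression can be shown with a toy case. The sketch below is not the dependency analysis from the paper; it assumes a hypothetical protein whose sites are phosphorylated independently at invented rates, and compares a model that tracks every molecular species with one that tracks only each site's occupancy. Both use simple Euler steps.

```python
from itertools import product

def full_model(rates, dt, steps):
    """One variable per species: 2**n variables for n binary sites."""
    n = len(rates)
    states = list(product((0, 1), repeat=n))
    x = {s: 0.0 for s in states}
    x[(0,) * n] = 1.0                      # start fully unphosphorylated
    for _ in range(steps):
        dx = {s: 0.0 for s in states}
        for s in states:
            for i, k in enumerate(rates):
                if s[i] == 0:              # site i can still be phosphorylated
                    flux = k * x[s] * dt
                    dx[s] -= flux
                    dx[s[:i] + (1,) + s[i + 1:]] += flux
        for s in states:
            x[s] += dx[s]
    occupancy = [sum(v for s, v in x.items() if s[i] == 1) for i in range(n)]
    return len(states), occupancy

def reduced_model(rates, dt, steps):
    """One variable per site: the fraction of proteins phosphorylated there."""
    p = [0.0] * len(rates)
    for _ in range(steps):
        p = [pi + k * (1.0 - pi) * dt for pi, k in zip(p, rates)]
    return len(p), p

rates = [0.5, 1.0, 1.5, 2.0, 0.2, 0.8, 1.2, 0.3]   # hypothetical rates
n_full, occ_full = full_model(rates, dt=0.01, steps=200)
n_reduced, occ_reduced = reduced_model(rates, dt=0.01, steps=200)
print(n_full, n_reduced)  # 256 8
```

In this toy case eight variables reproduce the site occupancies of a 256-variable model because no rule makes one site depend on another. Real pathways contain such dependencies, which is why the compressed EGF model still needs many variables.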

The compression makes intuitive sense, said Krivine. “If all the species mattered, then you should be able to observe 10^19 different phenotypes of a cell. But that’s never the case. A cell has only a few behaviors.”

The ability to compress the model raises the question of whether general principles govern those few cellular behaviors. Just as physicists can model the air in a room with a handful of variables (pressure, volume, temperature) rather than by writing a differential equation for each air molecule, perhaps biologists can find analogous factors that model programmed cell death.

“If empirically we find out that we have to describe everything at the level of molecular species and fully specified molecular players, well, we’re out of luck,” said Fontana. “There is nothing to understand.”

Using these tools, our modeler can now create complex models and run simulations of them. But Fontana sees an even greater opportunity. In 2005 he helped found a company called Plectix BioSystems that provides a shared space online for scientists to use Cellucidate, a web-based graphical interface built on top of Kappa and, soon, the coarse-graining algorithms. “Plectix wants to be the Facebook of proteins,” said Fontana, who is on the Plectix board of directors.

Fontana imagines that Cellucidate will become the place where scientists register the protein–protein interactions they discover the same way they add genes and crystal structures to other public databases. Someday, he said, “scientists will make models collaboratively.” They will build “consensus views of a system,” and those models, in turn, will evolve apace with discovery. As with Facebook, access to the beta version of Cellucidate is free, and the current plan is to make its underlying algorithms open source.

Students may contact Walter Fontana at walter_fontana@hms.harvard.edu for more information.

Conflict Disclosure: Walter Fontana is a member of the board of directors of Plectix BioSystems, Inc. Coauthors Jerome Feret, Vincent Danos, Jean Krivine and Russ Harmer are consultants for Plectix BioSystems, Inc.

Funding Sources: Harvard Medical School funded the work described in the PNAS paper. Plectix BioSystems receives venture funding from Sevin Rosen Funds and OVP Venture Partners; the content of the work is the responsibility solely of the authors.