Seeing is believing. If you want to know how a molecule works, you need to discover its shape. This is the mantra of structural biologists.
The architectural details of individual molecules are much too small for the naked eye to discern and even too tiny for the strongest microscope. It takes the finesse of a dozen or more software programs and the brute force of many more high-powered computers to extract a single meaningful molecular “image” from measurements of vast numbers of identical molecules.
Despite their huge reliance on computational methods, most structural biologists are not that different from the rest of us. They do not want to spend time and money troubleshooting software and hardware problems. They want to get on with the science.
For nearly a decade, structural biologist Piotr Sliz and his team at HMS have been answering these cries for help in structural biology computing. They work under the name of SBGrid. Their efforts are transforming the practice of structural biology in unexpected ways, making the tools and techniques available to other biologists and enabling structural specialists to tackle more recalcitrant problems.
The bulk of the team’s time is taken up with integrating, updating, and debugging software programs used to crunch data from the three core experimental branches of structural biology: X-ray crystallography, electron microscopy (EM), and nuclear magnetic resonance.
Their main service is installing, maintaining, and frequently upgrading a wide range of complex applications for 90 research groups coast to coast and a few overseas, all remotely via the Internet.
“Taking care of our [HMS] labs is 80 percent of our work,” said SBGrid software engineer Ben Eisenbraun. “The biggest effort goes into the initial product. Once that is done, copies are cheap. Having a guy like me in a hundred different labs doesn’t make sense. The work I do can support a hundred labs.” Eisenbraun, together with part-timer Ben Silva, oversees licensing and maintenance of 175 software programs accumulated at SBGrid over the last nine years, about 30 of which are most commonly used.
“There’s no way I could do this myself,” said Ben Spiller, who recently convinced his Vanderbilt University colleagues in the Center for Structural Biology to switch to SBGrid. “The programs are well vetted, checked, tested, and used by 90 labs, many of which have created the software. You just plug in your Mac and go.”
“It makes my life easier,” said Axel Brunger, a Howard Hughes investigator at Stanford University, who has made important computational advances and incorporated them into the popular software called CNS. SBGrid users receive CNS updates for beta testing before they are more widely available. Brunger, in turn, uses other SBGrid software for data analysis and molecular graphics.
“But the bigger picture is that more and more labs can use structural biology—not just labs that are involved in developing methodology, but also labs with totally biological and biochemical backgrounds using structure as a tool,” said Brunger. “Pure structural biology labs can look at very difficult problems,” such as large proteins and complexes, low-resolution crystal structures, combining structural methods, and more.
Infectious Computing
SBGrid started modestly as an effort to integrate the computing of the Howard Hughes laboratories of Stephen Harrison and the late Don Wiley, which ran separate systems on Harvard’s Cambridge campus and at Children’s Hospital Boston. Sliz, who joined the Harrison and Wiley laboratories as a postdoctoral fellow in 1999, had coordinated the setup of structural biology applications between two labs in Canada when he was a graduate student. This time, the project expanded.
“People who were leaving the [Harrison-Wiley] labs [to start their own groups] wanted to have the same computing environment,” said Sliz, a lecturer on pediatrics at Children’s. He obliged. The SBGrid user community has since grown organically (see map).
For the first two years, Sliz single-handedly provided the service part time. In 2002, when the Harrison–Wiley group moved from Cambridge to the Quad, he was supporting 15 labs. But Sliz needed more time for his own structural biology. He could not keep doing the computer work all by himself.
“Hughes allowed me to recruit a high-level person like Piotr who was a senior postdoc and also a computer guru interested in novel computational schemes whose responsibility was to the community, not to me,” said Harrison, who was a beneficiary of donated time on a Paris supercomputer when he solved the first-ever virus structure in 1977. Now Sliz has four full-time staff members and one part-timer keeping SBGrid running smoothly behind the scenes of the science. “We wanted to turn structural biology into an infectious disease and make sure everybody caught it,” Harrison said.
Divide and Multiply
In and around the HMS Quad, SBGrid offers more. The scientists developed a high-quality computing infrastructure, now incorporating the system’s 15 research groups, including those of Tom Walz, James Chou, Jim Hogle, Suzanne Walker, Tom Rapoport, Anjana Rao, Tim Springer, and Bing Chen. On the SBGrid team, Ian Levesque oversees the Harvard infrastructure with the help of junior system administrator John Burns.
“Structural biologists are extreme computer users,” said Walz, HMS professor of cell biology, who specializes in EM. “SBGrid is a huge help.”
Connected by its own fiber-optic network, SBGrid pools processor power for each lab’s sporadic but intensive computing needs. Conceptually, the system works like the famous SETI@home program, which searches the cosmos for alien life by running calculations on people’s desktops: SBGrid breaks each problem down among a large number of scientifically dedicated but often idle computers. The physical grid greatly beefs up the computational muscle available to any one researcher without a major investment in more computers.
“At every stage in structural biology—and this is especially true in EM—one is limited in what one can achieve by what one can compute,” Harrison said. “There is a frontier component beyond the service that involves innovative ways of organizing calculations so they are adapted to the grid environment.”
Last year, thanks to a grant from the National Science Foundation, SBGrid joined the Open Science Grid, a nationwide initiative. This expands the potential computing power available to all hardwired local SBGrid members and, conversely, makes the structural biology resources more widely available.
“Simulation and data analysis are a central part of modern science. You don’t have a lot of people at blackboards thinking up theories,” said Ian Stokes-Rees, a postdoctoral fellow in the Sliz lab.
SBGrid member Rachelle Gaudet, a structural biologist at Harvard, is looking forward to the first-ever SBGrid users symposium, “Quo Vadis Structural Biology,” at the new research building May 5 to 6. She first joined for the software, welcomed the extra computing power, and anticipates benefiting from the structural biology and computational community built by SBGrid.