Folding Revolution

New deep-learning approach predicts protein structure from amino acid sequence

Proteins function by folding into myriad, precise 3D structures. Image: Mohammed AlQuraishi

Nearly every fundamental biological process necessary for life is carried out by proteins. They create and maintain the shapes of cells and tissues; constitute the enzymes that catalyze life-sustaining chemical reactions; act as molecular factories, transporters and motors; serve as both signal and receiver for cellular communications; and much more.

Composed of long chains of amino acids, proteins perform these myriad tasks by folding themselves into precise 3D structures that govern how they interact with other molecules. Because a protein’s shape determines its function and the extent of its dysfunction in disease, efforts to illuminate protein structures are central to all of molecular biology—and in particular, therapeutic science and the development of lifesaving and life-altering medicines.

In recent years, computational methods have made significant strides in predicting how proteins fold based on knowledge of their amino acid sequence. If fully realized, these methods have the potential to transform virtually all facets of biomedical research. Current approaches, however, are limited in the scale and scope of the proteins whose structures they can determine.

Now, a Harvard Medical School scientist has used a form of artificial intelligence known as deep learning to predict the 3D structure of a protein based on its amino acid sequence.

Reporting online in Cell Systems on April 17, systems biologist Mohammed AlQuraishi details a new approach for computationally determining protein structure—achieving accuracy comparable to current state-of-the-art methods but at speeds upward of a million times faster.

“Protein folding has been one of the most important problems for biochemists over the last half century, and this approach represents a fundamentally new way of tackling that challenge,” said AlQuraishi, instructor in systems biology in the Blavatnik Institute at HMS and a fellow in the Laboratory of Systems Pharmacology. “We now have a whole new vista from which to explore protein folding, and I think we’ve just begun to scratch the surface.”

“One remarkable feature of AlQuraishi’s work is that a single research fellow, embedded in the rich research ecosystem of Harvard Medical School and the Boston biomedical community, can compete with companies such as Google in one of the hottest areas of computer science.”

Peter Sorger, Otto Krayer Professor of Systems Pharmacology

“Accurately and efficiently predicting protein folding has been a holy grail for the field ... We might solve this soon, and I think no one would have said that five years ago. It’s very exciting and also kind of shocking at the same time.”

Mohammed AlQuraishi

Predicting protein structure from sequence is a central challenge of biochemistry. Co-evolution methods show promise, but an explicit sequence-to-structure map remains elusive. Advances in deep learning that replace complex, human-designed pipelines with differentiable models optimized end-to-end suggest the potential benefits of similarly reformulating structure prediction. Here we report the first end-to-end differentiable model of protein structure.

Mohammed AlQuraishi presented his work on protein folding at a seminar at the Broad Institute of MIT and Harvard in March.