A Galaxy of Drug Candidates

New software screens billions of compounds in search of new medicines, including for COVID-19


One scientist. A billion experiments. Video: Rick Groleau

This article is part of Harvard Medical School’s continuing coverage of medicine, biomedical research, medical education and policy related to the SARS-CoV-2 pandemic and the disease COVID-19.

UPDATE, MARCH 18: Amid the COVID-19 global pandemic, this week the team began using VirtualFlow to screen for drugs that could disrupt coronavirus proteins. Google has given the researchers funds so they can use its cloud computing platform.

The researchers emphasize that in silico screening is not likely to provide an immediate solution to the pandemic, compared to efforts to repurpose existing drugs, if those efforts are successful. But identifying a new class of drugs in the long term that specifically and potently target the virus that causes COVID-19 would be valuable, they say.

"Once the curve is flattened, new drugs—if we can successfully identify them, and especially if they can be made inexpensively—could prevent further spread of the disease," said Arthanari.

"In particular now as many people are being directed to stay at home during the pandemic, virtual drug discovery is a very attractive alternative to wet lab-based research since it can be done from home," added Gorgulla. "Once labs are open again, researchers will be armed with potential hits ready for experimental validation."


Developing new drugs to treat diseases isn’t easy.

As many as 90 percent of promising drug candidates fail before or during human clinical trials, falling into the so-called “valley of death.”

Those that do succeed still cost significant time and money. The Pharmaceutical Research and Manufacturers of America estimates that it takes an average of $2.6 billion and more than 10 years for a new medicine to hit the market.


One way researchers have been attacking the problem is to start with computers rather than laborious bench experiments, identifying the compounds that best match the desired treatment target in the body. The process is like sifting through millions of microscopic, three-dimensional puzzle pieces.

After running a rapid virtual screen, scientists can focus their time and budget on testing the top candidates in the lab.

Now, taking in silico screening a step forward, an international team led by Harvard Medical School researchers has developed software capable of preparing and screening billions of compounds. Such programs were previously limited to sifting through about 1 million to 10 million compounds each.

Out of the box, so to speak, the software comes ready to screen 1.4 billion compounds, the size of the largest existing database of prepared chemical compounds.

The new capacity vastly increases the chances of finding the compound that will hit the target head-on.

“The more compounds you can screen, the better your top candidates will be, and the lower your rate of false positives,” said the study’s first author, Christoph Gorgulla, a postdoctoral fellow in the labs of Gerhard Wagner and Haribabu Arthanari in the Blavatnik Institute at HMS and an associate of the Department of Physics at Harvard University.

The findings were published March 9 in Nature.

In silico screening isn’t yet able to test individual compounds as thoroughly as bench experimentation, but it compensates with the ability to test orders of magnitude more of those compounds, said Arthanari, assistant professor of biological chemistry and molecular pharmacology at HMS and co-senior author of the study.

“It’s like having a dart board,” said Arthanari, whose lab is also affiliated with Dana-Farber Cancer Institute. “Virtual screening may dim the lights as if you’re in the back room of a bar, but it gives you many more darts to throw, so you have a better chance of hitting the target.”

Sky’s the limit

The software, called VirtualFlow, is free and open source. To make it even more widely accessible, the team designed it to be easy for nonspecialists to use and able to run on a range of computing platforms, from typical university computer clusters to cloud services.

For instance, a modest computer cluster of 300 cores would allow VirtualFlow to screen 100 million compounds in six weeks, while a 1,000-core cluster could do it in two, the authors said. A cluster of 10,000 cores, as is available at HMS, could screen 1 billion compounds in the same two weeks.
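As a rough check, the three figures quoted above can be converted into an implied computing cost per compound. The short Python sketch below is only back-of-the-envelope arithmetic on those figures. It assumes the work divides evenly across cores, which is a simplification, and the function names are illustrative and not part of VirtualFlow.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600

def core_seconds_per_compound(cores: int, weeks: float, compounds: int) -> float:
    """Implied compute cost per compound, assuming perfectly even scaling."""
    return cores * weeks * SECONDS_PER_WEEK / compounds

def weeks_needed(compounds: int, cores: int, cost: float) -> float:
    """Weeks to screen `compounds` on `cores` at `cost` core-seconds each."""
    return compounds * cost / (cores * SECONDS_PER_WEEK)

# The three scenarios quoted in the article: (cores, weeks, compounds).
scenarios = [
    (300, 6, 100_000_000),
    (1_000, 2, 100_000_000),
    (10_000, 2, 1_000_000_000),
]

costs = [core_seconds_per_compound(*s) for s in scenarios]
for (cores, weeks, compounds), cost in zip(scenarios, costs):
    print(f"{cores:>6} cores, {weeks} weeks, {compounds:,} compounds: "
          f"{cost:.1f} core-seconds per compound")
```

All three scenarios work out to roughly 11 to 12 core-seconds per compound, so the quoted figures are consistent with screening time falling in proportion to the number of cores.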

“VirtualFlow truly democratizes ultra-large-scale screening,” said Arthanari. “You don’t have to be at Harvard or Stanford to run it.”

“We want people everywhere to be able to use the power of computing to get to a lead compound much faster than is possible now and with the least expense,” he added.

VirtualFlow can both prepare databases for virtual screening and conduct the screening. To get people started, Gorgulla and colleagues prepared a database of 1.4 billion chemical compounds previously compiled by the company Enamine and loaded it into VirtualFlow.

When a researcher enters information about a cell receptor or another structure in the body that a drug needs to hit—i.e., the target—VirtualFlow rapidly searches the database, testing how each of the compounds does or doesn’t bind to the target, and ranks the matches.

Researchers can run a more stringent second pass with the top candidates.
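The screen-then-rescreen workflow can be pictured with a few lines of Python. This is a toy sketch, not VirtualFlow's code or interface: the compound names, the two scoring tables and the function `two_pass_screen` are all hypothetical, and it assumes the common docking convention that a lower score means a better predicted fit.

```python
import heapq
from typing import Callable, Iterable, List

def two_pass_screen(
    compounds: Iterable[str],
    fast_score: Callable[[str], float],
    careful_score: Callable[[str], float],
    keep: int,
) -> List[str]:
    """Rank everything with a cheap score, then re-rank the best few carefully."""
    # First pass: keep only the `keep` best-scoring compounds (lowest score).
    shortlist = heapq.nsmallest(keep, compounds, key=fast_score)
    # Second pass: re-rank the shortlist with the slower, stricter score.
    return sorted(shortlist, key=careful_score)

# Hypothetical scores standing in for real docking results.
fast = {"a": -5.0, "b": -9.0, "c": -7.0, "d": -2.0}
careful = {"a": -4.0, "b": -8.0, "c": -8.5, "d": -1.0}

ranked = two_pass_screen(fast.keys(), fast.get, careful.get, keep=2)
print(ranked)  # ['c', 'b']
```

In this toy example the stricter second pass reverses the order of the top two candidates. That is the point of rerunning only the best matches: the expensive calculation is spent on a small shortlist rather than on the whole library.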

Users can then pay Enamine or other companies to produce the actual compounds so they can conduct further experiments. (Three of the study’s co-authors are employees of Enamine and one other is a scientific advisor to the company. None of the HMS authors are affiliated with Enamine.)

In addition to or instead of the Enamine database, users can prepare and screen any library of chemical compounds they desire, the authors said.

Array of benefits

Many of today’s drugs target active sites on enzymes. The team expects VirtualFlow will be particularly valuable for aiming at a different target: protein-protein interactions, “a barely touched target space” that has been hard to nail, said co-senior author Wagner, the Elkan Blout Professor of Biological Chemistry and Molecular Pharmacology at HMS.

Because successful protein-protein inhibitors are less likely than enzyme inhibitors to suffer from development of treatment resistance, said Wagner, the new software could “drive a new era of drug discovery.”

The authors hope that VirtualFlow will help push new drugs over the valley of death by reducing the time and cost of these early steps and raising the likelihood that drug candidates not only reach clinical trials but also prove safe and effective.

“We hope that uncovering better matches through our virtual screening platform leads to less money and time spent in the wet-lab stage, less toxicity in preclinical trials and ultimately fewer side effects in patients,” said Gorgulla.

His motivating force: “I want to help heal as many diseases as possible.”

And there’s still room to grow, the authors said.

“Even 1 billion compounds is a drop in the ocean,” said Gorgulla. “The total chemical space is vast. There are at least ten to the sixtieth power small molecules suitable for drug discovery, or as many as there are atoms in the Milky Way.”

This research was supported by the National Institutes of Health (grant CA200913), the Claudia Adams Barr Program for Innovative Cancer Research at Dana-Farber and several fellowships. The ICCB-Longwood Screening Facility, East Quad NMR Facility and Center for Macromolecular Interactions at HMS also contributed to the study.