Personal Genetics and the Law

pgEd briefs Congress on uses of DNA in the criminal justice system

The need for a better understanding of personal genetics has never been more urgent. That was the message an expert panel of speakers relayed in a Congressional briefing on the intersection of personal genetics and law enforcement.

“There is no time to lose,” said Lauren Tomaselli, director of curriculum and training for the Personal Genetics Education Project (pgEd) at Harvard Medical School, citing a recent appeal to the Supreme Court on a ruling that allows a person’s DNA to be collected and tested without their knowledge or permission. The case was declined by the court.

The pgEd leaders organized the March 19 briefing on Capitol Hill in cooperation with the offices of U.S. Rep. Louise Slaughter, D-N.Y., and Sen. Elizabeth Warren, D-Mass.

The mission of pgEd is to educate young people through school programs and to accelerate public awareness of genetics issues by advising the entertainment industry. It also seeks to engage lawmakers, the “eyes and ears of the nation,” in discussions.

pgEd takes no position on policy, preferring to educate from a neutral position so that its audience can make better-informed decisions.

At the briefing, Duana Fullwiley, associate professor of anthropology at Stanford University, said that some genetic technologies used by the U.S. criminal justice system are leapfrogging not just public understanding but also peer-reviewed scientific evaluation.

One case in point: DNA phenotyping, a tool that generates the image of a human face based on genetic samples that have been taken from a crime scene.

Police in Columbia, South Carolina, recently relied on such an image provided by Parabon NanoLabs as they searched for suspects in a double murder.

The science behind this service, called Snapshot, has not been analyzed by people outside the company, Fullwiley said.

She criticized the database on which Snapshot is based for two reasons, saying it skews toward an over-representation of African-Americans and its results offer a false sense of precision.

Focusing on a single type of suspect can implicate a whole group, she said, citing the generic image of a young man with dark hair, eyes and skin.

“When, as a society, we are already dealing with racial bias in policing and civil rights, we have to be very careful about rolling out technologies that can potentially have racial impacts that are disparate for different groups,” she said. 

David Kaye, associate dean for research at Penn State Law, said DNA screening in criminal investigations is often racially based because it relies on witness accounts.

He asked, “If you use the information at your disposal, is it truly discriminatory?” 

Courts have also allowed involuntary collection of genetic samples, even through subterfuge, he said. 

For example, he said detectives duped a suspect into replying to a letter that offered money via a class-action suit. DNA that was recovered from the paper form was returned to a fabricated law office created by the detectives. In another instance, he said a case was built against a serial killer based on DNA retrieved from his daughter’s Pap smear.

The Microbe Question

Claire Fraser, director of the Institute for Genome Sciences at the University of Maryland School of Medicine, explained how microbial DNA might one day be used for forensic purposes. Her past work identified genetic mutations in anthrax spores in the deadly 2001 anthrax mailing.  That laid the foundation for the new field of microbial forensics.

“Mother Nature is the best bioterrorist,” she said, using SARS, Ebola and West Nile virus as examples.

The microbes we carry with us, collectively known as our microbiomes, could potentially be used as identifiers, she said, but added that that day is far in the future.

Henry Greely, director of the Center for Law and the Biosciences at Stanford Law School, said he worries about the ethnic disproportion in the database of 11 million records now held by federal and state law enforcement.

“There is a much higher chance for a black American than a white American” to be implicated by a family member’s DNA sample, he said. “That’s troubling.”

While it would be politically difficult, Greely said, he would prefer to see a system in which all Americans would have their samples included in a federal database, making it more representative of the nation. He did concede that privacy could be a problem. If privacy were breached, he said, public trust in law enforcement and in genetics would suffer.

Genetic Privacy Rights

Slaughter is a longtime champion of genetic privacy, having sponsored a bill that in 2008 became the Genetic Information Nondiscrimination Act, also known as GINA. She was introduced at the briefing as “the only microbiologist in Congress.”

“GINA was all about privacy,” she said, recalling the battle for its passage. “We wanted to make sure that the social policy kept up with science, but science fiction intervened. Everybody thought we were talking about cloning.” 

Protecting genetic information in the workplace and for insurance purposes is still an urgent issue, Slaughter said.

“Your genetics belongs to you and the information is yours,” she said to applause from the audience, which included congressional staffers as well as people from the U.S. Department of Justice, the FBI, the National Institutes of Health, the American Society of Human Genetics and the American Association for the Advancement of Science.

In the discussion that followed the speakers’ presentations, Ting Wu, HMS professor of genetics and a founder of pgEd, asked if somehow racial discrimination could be minimized.

“Obviously it’s a problem,” she said. “We can think of Ferguson and see where that goes.”

Wu, who founded pgEd in 2006, said she feels a deep responsibility to educate people about genetics. She has said it’s not a choice but a necessity.

In an interview after the briefing, George Church, the Robert Winthrop Professor of Genetics at HMS, raised the issue of “DNA exceptionalism,” in which genetic tools are seen as different from other modalities, and not just in jurisprudence.

In medicine, for example, gene therapy is viewed as an extraordinary category of treatment.

Public understanding and scientific advancement are not moving in step, he said.

“We have a long way to go, but that’s because genetics is a moving target,” Church said.

Samantha Schilit, a pgEd affiliate and a graduate student in genetics, said she hopes to pursue personal genetics as a genetic counselor. She attended the briefing as a guest of pgEd after winning a contest in her department to add the most pins to Map-Ed, an online quiz on key concepts and topics in genetics.

“What shocked me is how truly new these topics are,” she said, citing the DNA phenotyping news from South Carolina in February.

Schilit said she is uneasy about the possibility that information gathered by a direct-to-consumer company, for example, could find its way into a forensic investigation, a possibility that was raised by Greely.

“These issues are ethically complicated,” she said. “This field is moving so fast.”

The briefing was the third of five planned by pgEd. The first briefing highlighted the science of genomics, personalized medicine and genetic engineering as well as ways to reach out to the public. The second briefing focused on two topics: the role of genetics research in the unfolding Ebola outbreak in West Africa and the issues addressed by GINA. The third briefing on law enforcement grew out of topics touched on in the first two.

pgEd is supported by the HMS Department of Genetics and private funding from Sigma-Aldrich, Autodesk, Genentech, IDT (targeted specifically for GETed conferences and Map-Ed), and an anonymous donor.


NSAIDs and Cancer Risk

Genetic makeup influences whether aspirin or other NSAIDs will reduce colorectal cancer risk

An analysis of genetic and lifestyle data from 10 large epidemiologic studies has confirmed that regular use of aspirin or other nonsteroidal anti-inflammatory drugs (NSAIDs) appears to reduce the risk of colorectal cancer in most individuals.

The study, published in JAMA, also found that a few individuals with rare genetic variants do not share this benefit. Additional questions need to be answered before preventive treatment with these medications can be recommended for anyone, the study authors cautioned.

“Previous studies, including randomized trials, demonstrated that NSAIDs, particularly aspirin, protect against the development of colorectal cancer, but it remains unclear whether an individual’s genetic makeup might influence that benefit,” said Andrew Chan, HMS associate professor of medicine at Massachusetts General Hospital and co-senior author of the JAMA report. “Since these drugs are known to have serious side effects—especially gastrointestinal bleeding—determining whether certain subsets of the population might not benefit is important for our ability to tailor recommendations for individual patients.”

The research team analyzed data from the Colon Cancer Family Registry and from nine studies included in the Genetics and Epidemiology of Colorectal Cancer Consortium, which includes the Nurses’ Health Study, the Health Professionals Follow-up Study and the Women’s Health Initiative. They compared genetic data for 8,624 individuals who developed colorectal cancer with genetic data for 8,553 individuals who did not, matched for factors such as age and gender.

The comprehensive information on lifestyle and general health data provided by participants in the studies again confirmed that regular use of aspirin or NSAIDs was associated with a 30 percent reduction in colorectal cancer risk for most individuals. However, that preventive benefit did not apply to everyone. The study found no risk reduction in participants with relatively uncommon variants in genes on chromosome 12 and chromosome 15.

“Determining whether an individual should adopt this preventive strategy is complicated, and currently the decision needs to balance one’s personal risk for cancer against concerns about internal bleeding and other side effects,” Chan said. “This study suggests that adding information about one’s genetic profile might help in making that decision. However, it is premature to recommend genetic screening to guide clinical care, since our findings need to be validated in other populations. An equally important question that also needs to be investigated is whether there are genetic influences on the likelihood that someone might be harmed by treatment with aspirin and NSAIDs.”

Support for this study includes several grants from the National Cancer Institute and the National Institute of Diabetes and Digestive and Kidney Diseases.

Adapted from a Mass General news release.


Rett Syndrome Revelation

New study describes how gene mutations in the brain spur this debilitating condition

Scientists from Harvard Medical School have connected the single gene mutated in Rett syndrome with a surprising function. Harrison Gabel and Benyam Kinde talk about their discovery. Video: HMS OCER

Scientists have known for 15 years that mutations in a single gene lead to Rett syndrome, a severe neurological disorder that affects girls around their first birthdays. In the years since the MECP2 gene was pinpointed, researchers have struggled to understand how it functions in the brain in Rett syndrome.

Now the enigma of Rett syndrome and perhaps other disorders on the autism spectrum could be one step closer to being solved.

A Harvard Medical School team has discovered that when MECP2 is mutated in Rett syndrome, the brain loses its ability to regulate genes that are unusually long. Their finding suggests new ways to consider reversing the intellectual and physical debilitation this disruption causes with a drug that could potentially target this error. The team, led by Michael Greenberg, reported its findings in Nature.

“The longer the gene, the more disrupted it becomes when you lose MECP2,” said Greenberg, the Nathan Marsh Pusey Professor of Neurobiology at HMS. “Rett syndrome may be a defect in this process of fine-tuning the expression of long genes.”

Scientists, including Greenberg, have figured out over the last 10 years that MECP2 plays a role in sculpting the connections between neurons in the developing brain. These synapses are refined by exposure to sensory experiences, just the sort of stimulation a one-year-old would encounter as she learns to walk and talk.

MECP2 is present in all cells in the body, but when the brain is forming and maturing its synapses in response to sensory input, MECP2 levels in the brain are almost 10 times as high as in other parts of the body. The new study connects MECP2 mutations to long genes, which may be more prone to errors simply because their length leaves more room for mistakes.

Speed Bump

“Normally, MECP2 may act like a speed bump, fine-tuning long genes by slowing down the machinery that transcribes long genes,” said Harrison Gabel, a postdoctoral fellow in the Greenberg lab and co-first author of the Nature paper. In transcription, the information in a strand of DNA is copied onto a new molecule of messenger RNA, which is then turned into a protein. “Without MECP2, the machinery may be moving too fast, making too much mRNA from these genes, resulting in problems for the neurons.”

Finding this effect of MECP2 on long genes was no small feat. In a typical search for the mechanism behind a genetic mutation, mice are engineered to lack the normal gene so that its absence reveals how it functions. However, work in many different labs has shown that knocking out MECP2 had only subtle effects when analyzed across the genome. The changes in gene expression were inconsistent, small and, using Gabel’s word, “fuzzy.”

Gabel took another approach, querying massive genomic databases such as ENCODE to ask a simple question: What do genes that are affected by mutated MECP2 have in common?

Answer: They are long. Most of them are at least five times longer than the average gene, with many of them more than 50 times longer than the average. It is important to note that the genes identified across dozens of data sets were very long, giving the researchers a common finding where previous conclusions from these data sets had lacked a common theme.

Gabel and co-first author Benyam Kinde, an MD-PhD student in the Greenberg lab, found the long-gene misregulation in multiple mouse models of Rett syndrome and confirmed it in the brain tissue of deceased Rett patients.

For MECP2 to function normally as a speed bump, it binds to a form of methylated DNA found in long genes in the brain. Methyl groups are chemical modifiers of gene activity, and in other parts of the body MECP2 binds methylated CG sites on genes. The methylation pattern that appears to be important for MECP2 in regulating long genes is known as methylated CA, and there appears to be a special mechanism operating as synapses are forming.

“It seems that evolution has used MECP2 and methylated CA to put in place this speed bump so that the expression of long genes is restrained in the brain,”  Greenberg said. “As far as Rett syndrome, the thought is now that this subtle but widespread overexpression of long genes might be contributing to the disorder.”

Corrective Strategy

The scientists can’t be sure of what these overexpressed long genes do, but many of them appear to be very important to the function of the brain. This suggests that if they could correct the defect in long-gene expression, they might be able to reverse at least some of the symptoms of Rett syndrome. As a first attempt at a corrective strategy, the researchers selected a cancer drug called topotecan because it blocks an enzyme known to be important for long-gene transcription.

In a lab dish, they added topotecan to neurons lacking MECP2. The drug reversed the long-gene misregulation, suggesting that restoring normal long-gene expression might be a way to correct neurological dysfunction in Rett syndrome and in other autism spectrum disorders with long genes, such as fragile X syndrome. Topotecan, a chemotherapeutic agent, is too toxic, Greenberg said, but derivatives of topotecan might be a worthwhile avenue to pursue.

“We think this issue of long-gene misregulation may be more generally occurring in other disorders of human cognition,” Greenberg said. “The potential is pretty significant because one now has a common regulatory mechanism to target with drugs.”

This work was supported by grants from the Rett Syndrome Research Trust and the National Institutes of Health (1R01NS048276 and T32GM007753), the Damon Runyon Cancer Research Foundation (DRG-2048-10), the William Randolph Hearst Fund and the Howard Hughes Medical Institute.


Variety Show

New techniques reveal “extreme” gene copy range 

Researchers have begun to appreciate the importance of copy number variation when considering the connections between DNA and disease.

Most people have two copies of most genes. But some have only one copy, or three, or none. There have been hints that copy number variation (CNV) might range much more widely than zero to three, but such extremes have been hard to analyze in gene sequencing data.

“For all the excitement about copy number variation in human genetics, most earlier research has been limited to the simplest form of CNV, in which you have either a missing segment or an extra copy of it,” said Steven McCarroll, assistant professor of genetics at Harvard Medical School and director of genetics for the Stanley Center for Psychiatric Research at the Broad Institute of MIT and Harvard.

“Here we came up with a way to analyze extreme forms of CNV,” he said. “Now we can start to use this exuberant form of genetic variation to help illuminate the genetic basis of disease.”

McCarroll and colleagues reported their insights about extreme CNV in Nature Genetics on Jan. 26. Their discoveries were made possible by new computational techniques that first author Bob Handsaker developed to analyze whole-genome sequence data from thousands of genomes at once.

“Before, we had no good way to study genes that have a really high copy number, above four,” said Handsaker, a research scientist in the McCarroll lab. “Now we can find places where people’s gene copy number ranges from zero to 15. It’s the first time we’ve been able to measure this kind of variation with such precision.”

“We’ve found that in hundreds of genes, there’s a wide variation in copy numbers. Now that we can measure these variations accurately, we can ask whether there are health repercussions,” said Handsaker.

The results also enrich the understanding of human genome evolution, said McCarroll.

Once they had developed a way to study extreme CNV, Handsaker, McCarroll and their team made four primary discoveries.

First: About 88 percent of gene copy number variation among humans arises from extreme copy number variants rather than simple copy number variants.

“These extreme copy number variants are a small fraction of all CNVs, but they have broader effects on genes than we anticipated,” said McCarroll.

Second: The more copies of a gene a person has, the more that gene is expressed.

“You might think this was obvious,” said Handsaker, “but in some organisms, such as plants, when you have more copies, most of them are turned off. It turns out that in humans, they’re all turned on in almost all cases.”

Third: With simple CNV, most people have two copies, while a few outliers have one or three or none. McCarroll’s team found that with extreme CNV, most people don’t have two copies but instead have CNVs scattered across a wide range.

“For a lot of these CNVs with these especially exuberant differences, two randomly chosen people are actually more likely to have different numbers of copies than the same number,” said Handsaker.
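Handsaker’s comparison comes down to simple arithmetic: the chance that two randomly chosen people carry the same copy number is the sum of the squared frequencies of each copy number. The sketch below is illustrative only. Both frequency tables are invented for the example and are not data from the study.

```python
from fractions import Fraction


def prob_same_copy_number(freqs):
    """Chance that two randomly chosen people share a copy number.

    freqs maps copy number -> population frequency (must sum to 1).
    """
    if sum(freqs.values()) != 1:
        raise ValueError("frequencies must sum to 1")
    return sum(p * p for p in freqs.values())


# Hypothetical "simple" CNV: most people have two copies, a few have one or three.
simple_cnv = {1: Fraction(5, 100), 2: Fraction(90, 100), 3: Fraction(5, 100)}

# Hypothetical "extreme" CNV: copy numbers spread evenly from 2 to 9.
extreme_cnv = {n: Fraction(1, 8) for n in range(2, 10)}

p_simple = prob_same_copy_number(simple_cnv)
p_extreme = prob_same_copy_number(extreme_cnv)

print(f"simple CNV:  same copy number with probability {float(p_simple):.3f}")
print(f"extreme CNV: same copy number with probability {float(p_extreme):.3f}")
```

Under these made-up frequencies, two people match most of the time for the simple variant but differ most of the time for the widely spread one, which is the pattern Handsaker describes.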

Fourth: Sequences with more copies are more likely to mutate further, expanding in copy number quickly and dramatically.

The team found what they call “runaway duplication haplotypes,” in which some versions of a chromosome have acquired as many as 10 copies of a gene over the past thousand or so generations, while other versions of the same chromosome continue to have just one copy.

“The fast, dramatic expansion in copy number of specific genes appears to have been evolutionarily recent and geographically localized,” said McCarroll.

One gene involved in resistance to trypanosomes—parasites that cause human illnesses including sleeping sickness and Chagas disease—evolved to have a high copy number on a subset of the chromosomes in West African populations. Another gene, related to a gene that contributes to asthma resistance, evolved to have a high copy number in Europe.

“These variations show really unusual patterns in some parts of the world,” said McCarroll. “But it’s too soon to know whether they’re doing something important.”

The team is now offering to the research community “the first data resource on extreme forms of CNV and how they actually vary across a large number of people” as well as a software toolkit to analyze extreme CNV in huge sequencing data sets, McCarroll said.

“Until recently, whole-genome sequencing was quite expensive. Today, that’s changing quickly,” McCarroll added. “This work gives us a sense of the kinds of things it’s going to be possible to see in whole-genome sequences that it wasn’t possible to see before.”

Coauthor Jennifer R. Berman is an employee of Bio-Rad Inc.

This research was supported by National Human Genome Research Institute grant R01 HG006855. Additional funding from NHGRI (U01 HG006510) is supporting follow-on work to develop production-ready software that can be used by any research laboratory.


No Escape

Biological safety lock for genetically modified organisms 

The creation of genetically modified and entirely synthetic organisms continues to generate excitement as well as worry.

Such organisms are already churning out insulin and other drug ingredients, helping produce biofuels, teaching scientists about human disease and improving fishing and agriculture. While the risks can be exaggerated to frightening effect, modified organisms could upset natural ecosystems if they were to escape.

Physical containment isn’t enough. Lab dishes and industrial vats can break; workers can go home with inadvertently contaminated clothes. And some organisms are meant for use in open environments, such as mosquitoes that can’t spread malaria.

So attention turns to biocontainment: building in biological safeguards to prevent modified organisms from surviving where they’re not meant to. To do so, geneticists and synthetic biologists find themselves taking a cue from safety engineers.

“If you make a chemical that’s potentially explosive, you put stabilizers in it. If you build a car, you put in seat belts and airbags,” said George Church, Robert Winthrop Professor of Genetics at Harvard Medical School and core faculty member at the Wyss Institute.

And if you’ve created the world’s first genomically recoded organism, a strain of Escherichia coli with a radically changed genome, as Church’s group announced in 2013, you make its life dependent on something only you can supply.

Church and colleagues report Jan. 21 in Nature that they further modified their 2013 E. coli to incorporate a synthetic amino acid in many places throughout its genome. Without this amino acid, the bacteria can’t perform the vital job of translating their RNA into properly folded proteins.

The E. coli can’t make this unnatural amino acid themselves or find it anywhere in the wild; they have to eat it in specially cooked-up lab cultures.

A separate team reports in Nature that it was able to engineer the same strain of E. coli to become dependent on a synthetic amino acid using different methods. That group was led by a longtime collaborator of Church’s, Farren Isaacs of Yale University.

The two studies are the first to use synthetic nutrient dependency as a biocontainment strategy, and suggest that it might be useful for making genetically modified organisms safer in an open environment.

In addition, “We now have the first example of genome-scale engineering rather than gene editing or genome copying,” said Church. “This is the most radically altered genome to date in terms of genome function. We have not only a new code, but also a new amino acid, and the organism is totally dependent on it.”

Church’s team, led by first authors Dan Mandell and Marc Lajoie, HMS research fellows in genetics, also made the E. coli resistant to two viruses, with plans to expand that list.

The modifications offer theoretically safer E. coli strains that could be used in biotechnology applications with less fear that they will be contaminated by viruses, which can be financially disastrous, or cause ecological trouble if they spill. (E. coli is one of the main organisms used in industry.)

Hooked on amino acids

Scientists have been exploring two main biocontainment methods, but each has weaknesses. Church was determined to fix them.

One method involves turning normally self-sufficient organisms like E. coli into auxotrophs, which can’t make certain nutrients they need for growth. Humans are auxotrophs, which is why we need to include vitamins and other “essential” nutrients in our diets.

Altering the genetics of E. coli so they can’t make a naturally occurring nutrient doesn’t always work, said Church, because some of them manage to scavenge the nutrient from their surroundings. He lowered that risk by making the E. coli dependent on a nutrient not found in nature.

Another pitfall of making auxotrophs is that some E. coli could evolve a way to synthesize the nutrient they need. Or they could acquire the ability while exchanging bits of DNA with other E. coli in a process called horizontal gene transfer.

Church believes his team protected against those possibilities because it had to make 49 genetic changes to the E. coli to make them dependent on the artificial nutrient. The chance one of the bacteria could randomly undo all of those changes without also acquiring a harmful mutation, he said, is incredibly slim.

Church’s solution also took care of concerns he had with another biocontainment technique, in which genetic “kill switches” make bacteria vulnerable to a toxin so spills can be quickly neutralized. “All you have to do to kill a kill switch is turn it off,” which can be done in any number of ways, Church said. Routing around the dependency on the artificial amino acid is much harder.

Church determined that another key to making a successful “synthetic auxotroph” was to ensure that the E. coli’s lives depended on the artificial amino acid. Otherwise, escaped E. coli could keep rolling along even if they couldn’t make or scavenge it. So his group targeted proteins that drive the essential functions of the cell.

“If you put it off on the periphery, like on the paint job of your car, the car will still run,” he explained. “You have to embed the dependency smack in the middle of the engine, like the crank shaft, so it now has a particular part you can only get from, say, one manufacturer in Europe.”

Building a safer bacterium

The need to choose a process essential to E. coli survival and a nutrient not found in nature “limited us to a small number of genes,” Church said. His team used computational tools to design proteins that might cause the desired “irreversible, inescapable dependency.” They took the best candidates, synthesized them and tested them in actual E. coli.

They ended up with three successful redesigned essential proteins and two dependent E. coli strains. “Using three proteins together is more powerful than using them separately,” Church said. He envisions future E. coli modified to require even more synthetic amino acids to make escape virtually impossible.

As it was, the escape rate—the number of E. coli able to survive without being fed the synthetic amino acid—was “so low we couldn’t detect it,” Church said.

The group grew a total of 1 trillion E. coli cells from various experiments, and after two weeks none had escaped. “That’s 10,000 times better than the National Institutes of Health’s recommendation for escape rate for genetically modified organisms,” said Church.
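The comparison in that quote can be worked through by hand. The sketch below uses only the figures reported here (1 trillion cells, no escapees, “10,000 times better”) and derives the threshold those figures imply. It is a back-of-the-envelope check, not the group’s own calculation, and the variable names are chosen for illustration.

```python
from fractions import Fraction

cells_tested = 10**12        # "a total of 1 trillion E. coli cells"
escapees_observed = 0        # "none had escaped"
improvement_factor = 10**4   # "10,000 times better"

# With zero escapees the true rate cannot be measured, only bounded:
# it lies below one escapee per cells_tested.
detection_limit = Fraction(1, cells_tested)

# Threshold implied by the quoted comparison (derived, not quoted).
implied_threshold = detection_limit * improvement_factor


def describe(rate):
    """Format a rate such as 1/1000 as '1 in 1,000'."""
    return f"1 in {rate.denominator // rate.numerator:,}"


print("escape rate below:", describe(detection_limit))
print("implied threshold:", describe(implied_threshold))
```

Because no cell escaped, the result is an upper bound on the escape rate rather than a measured value, which matches Church’s remark that the rate was too low to detect.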

The weaknesses in Church’s methods remain to be seen. For now, he is satisfied with the results his group has obtained by pushing the limits of available testing.

“As part of our dedication to safety engineering in biology, we’re trying to get better at creating physically contained test systems to develop something that eventually will be so biologically contained that we won’t need physical containment anymore,” said Church.

In the meantime, he said, “we can use the physical containment to debug it and make sure it actually works.”

This work was funded by the U.S. Department of Energy (grant DE-FG02-02ER63445).


Gene-Editing Guide

New method identifies genome-wide off-target effects of CRISPR-Cas

Harvard Medical School investigators at Massachusetts General Hospital have developed a method for detecting unwanted DNA breaks—across the entire genome of human cells—induced by the popular gene-editing tools called CRISPR-Cas RNA-guided nucleases (RGNs). 

Members of the same team that first described these off-target effects in human cells describe their new platform, called GUIDE-seq (Genome-wide Unbiased Identification of Double-stranded breaks Evaluated by Sequencing) in a report published in Nature Biotechnology.

“GUIDE-seq is the first genome-wide method of sensitively detecting off-target DNA breaks induced by CRISPR-Cas nucleases that does not start with the assumption that these off-target sites resemble the targeted sites,” said J. Keith Joung, HMS associate professor of pathology at Mass General and senior author of the paper. “This capability, which did not exist before, is critically important for the evaluation of any clinical use of CRISPR-Cas RNA-guided nucleases.”

Used to cut through a double strand of DNA in order to introduce genetic changes, CRISPR-Cas RNA-guided nucleases combine a bacterial gene-cutting enzyme called Cas9 with a short RNA segment that matches and binds to the target DNA sequence. In a 2013 Nature Biotechnology paper, Joung and his colleagues reported finding that CRISPR-Cas RNA-guided nucleases could also induce double-strand breaks at sites with significant differences from the target site, including mismatches of as many as five nucleotides. 
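A mismatch in this sense is a position where a candidate site differs from the intended target sequence. The short sketch below counts such positions for two equal-length sequences. Both sequences are made up for illustration, and the function is a plain position-by-position comparison, not part of GUIDE-seq or any published tool.

```python
def count_mismatches(target: str, site: str) -> int:
    """Count positions at which two equal-length DNA sequences differ."""
    if len(target) != len(site):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(target.upper(), site.upper()))


# Hypothetical 20-nucleotide target and a candidate off-target site.
target_seq = "GACGTTACCGGATCAAGCTA"
candidate_site = "GACGTAACCGGTTCAAGCTA"

mismatches = count_mismatches(target_seq, candidate_site)
print(f"{mismatches} mismatches out of {len(target_seq)} positions")
```

A count like this only measures how closely a site resembles the target. It says nothing about whether the site is actually cut, which has to be established experimentally.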

Because such off-target mutations could potentially lead to adverse effects, including cancer, the ability to identify and eventually minimize unwanted double-strand breaks would be essential to the safe clinical use of these RNA-guided nucleases, the authors noted.

The method they developed involves using short, double-stranded oligonucleotides that are taken up by double-strand breaks in a cell’s DNA, acting as markers of off-target breaks caused by the use of CRISPR-Cas. Those tags allow the identification and subsequent sequencing of those genomic regions, pinpointing the location of off-target mutations. 

Experiments with GUIDE-seq showed it was sensitive enough to detect off-target sites at which CRISPR RNA-guided nucleases induced unwanted mutations at frequencies as low as 0.1 percent in a population of cells. These experiments also revealed that no easy rules would predict the number or location of off-target double-strand breaks, since many such mutations took place at sites quite dissimilar from the targeted site.

Two existing tools, designed to predict off-target mutations by analysis of the target sequence, were much less effective than GUIDE-seq in predicting confirmed off-target sites and also misidentified sites that did not prove to have been cut by the enzyme. Comparing GUIDE-seq with a tool called ChIP-seq, which identifies sites where proteins bind to a DNA strand, confirmed that ChIP-seq does not provide a robust method for identifying CRISPR-Cas-induced double-strand breaks.

GUIDE-seq was also able to identify breakpoint hotspots in control cell lines that were not induced to express the CRISPR RNA-guided nucleases. 

“Various papers have described fragile genomic sites in human cells before,” Joung noted, “but this method may be the first to identify these sites without the addition of drugs that enhance the occurrence of such breaks. We also were surprised to find those breaks occurred largely at different sites in the two cell lines used in this study. The ability to capture these RNA-guided nuclease-independent breaks suggests that GUIDE-seq could be a useful tool for studying and monitoring DNA repair in living cells.”

In addition, GUIDE-seq verified that the team’s approach for improving the accuracy of CRISPR-Cas, which involves shortening the guiding RNA segment, reduced the number of double-strand breaks throughout the genome. Joung also expects that GUIDE-seq will be useful in identifying off-target breaks induced by other gene-editing tools.

Along with pursuing that possibility, Joung noted the importance of investigating the incidence and detection of off-target mutations in human cells not altered to create cell lines—a process that transforms them into immortalized cancer cells. Understanding the range and number of off-target mutations in untransformed cells will give a better picture of how CRISPR-Cas RNA-guided nucleases and other tools would function in clinical applications.

“The GUIDE-seq method is very straightforward to perform, and we intend to make the software for analyzing sequencing data available online to noncommercial researchers in the near future,” adds Joung.

A patent application covering the GUIDE-seq technology has been filed.

Support for the study includes National Institutes of Health (NIH) Director’s Pioneer Award DP1 GM105378; NIH grants R01 GM088040, R01 AR063070 and F32 GM105189; the Jim and Ann Orr Massachusetts General Hospital Research Scholar Award; and Defense Advanced Research Projects Agency grant W911NF-11-2-0056.

Adapted from a Mass General news release.


Warning Signs

Two studies identify pre-cancerous state in the blood 

Image: bubaone/iStock

Researchers from the Broad Institute of MIT and Harvard, Harvard Medical School and Harvard-affiliated hospitals have uncovered an easily detectable, “pre-malignant” state in the blood that significantly increases the likelihood that a person will go on to develop blood cancers such as leukemia, lymphoma or myelodysplastic syndrome.

The discovery, which was made independently by two research teams affiliated with the Broad and partner institutions, opens new avenues for research into early detection and prevention of blood cancer. Findings from both teams appear this week in the New England Journal of Medicine.

Most genetic research on cancer to date has focused on studying the genomes of advanced cancers, to identify the genes that are mutated in various cancer types. These two new studies instead looked at somatic mutations—mutations that cells acquire over time as they replicate and regenerate within the body—in DNA samples collected from the blood of people not known to have cancer or blood disorders. 

Taking two very different approaches, the teams found that a surprising percentage of those sampled had acquired a subset—some but not all—of the somatic mutations that are present in blood cancers. These people were more than ten times more likely to go on to develop blood cancer in subsequent years than those in whom such mutations were not detected.

The “pre-malignant” state identified by the studies becomes more common with age; it is rare in those under the age of 40, but appears with increasing frequency with each decade of life that passes, ultimately appearing in more than 10 percent of those over the age of 70.

Carriers of the mutations are at an overall 5 percent risk of developing some form of blood cancer within five years.

This “pre-malignant” stage can be detected simply by sequencing DNA from blood.

“People often think about disease in black and white—that there’s ‘healthy’ and there’s ‘disease’—but in reality most disease develops gradually over months or years. These findings give us a window on these early stages in the development of blood cancer,” said Steven McCarroll, senior author of one of the papers.

McCarroll is assistant professor of genetics at HMS and director of genetics at the Broad’s Stanley Center for Psychiatric Research.

Benjamin Ebert, HMS associate professor of medicine at Brigham and Women’s Hospital and an associate member of the Broad, is the senior author of the other paper.

The mutations identified by both studies are thought to originate in blood stem cells, and confer a growth-promoting advantage to the mutated cell and all of its “clones”—cells that derive from that original stem cell during the normal course of cell division. These cells then reproduce at an accelerated rate until they account for a large fraction of the cells in a person’s blood.

The researchers believe these early mutations lie in wait for follow-on, “cooperating” mutations that, when they occur in the same cells as the earlier mutations, drive the cells toward cancer. The majority of mutations occurred in just three genes: DNMT3A, TET2 and ASXL1.

“Cancer is the end stage of the process,” said Siddhartha Jaiswal, a Broad associated scientist and HMS clinical fellow at Massachusetts General Hospital who was first author of Ebert’s paper. “By the time a cancer has become clinically detectable it has accumulated several mutations that have evolved over many years. What we are primarily detecting here is an early, pre-malignant stage in which the cells have acquired just one initiating mutation.”

The teams converged on these findings through very different approaches.

Ebert’s team had hypothesized that, since blood cancers increase with age, it might be possible to detect early somatic mutations that could be initiating the disease process, and that these mutations also might increase with age. They looked specifically at 160 genes known to be recurrently mutated in blood malignancies, using genetic data derived from approximately 17,000 blood samples originally obtained for studies on the genetics of type 2 diabetes.

They found that somatic mutations in these genes did indeed increase the likelihood of developing cancer, and they saw a clear association between age and the frequency of these mutations. They also found that men were slightly more likely to have mutations than women, and Hispanics were slightly less likely to have mutations than other groups.

Ebert’s team also found an association between the presence of this “pre-malignant” state and the risk of overall mortality independent of cancer. People with these mutations had a higher risk of type 2 diabetes, coronary heart disease and ischemic stroke as well. Additional research will be needed to determine the nature of these associations.

McCarroll’s team discovered the phenomenon while studying a different disease. They, too, were looking at somatic mutations, but they were initially interested in determining whether such mutations contributed to risk for schizophrenia. The team studied roughly 12,000 DNA samples drawn from the blood of patients with schizophrenia and bipolar disorder, as well as healthy controls, searching across the whole genome at all of the protein-coding genes for patterns in somatic mutations.

They found that the somatic mutations were concentrated in a handful of genes. The scientists quickly realized they were cancer genes. The team then used electronic medical records to follow the patients’ subsequent medical histories, finding that the subjects with these acquired mutations had a 13-times elevated risk of blood cancer.

McCarroll’s team conducted follow-up analyses on tumor samples from two patients who had progressed from this pre-malignant state to cancer. These genomic analyses revealed that the cancer had indeed developed from the same cells that had harbored the “initiating” mutations years earlier.

“The fact that both teams converged on strikingly similar findings, using very different approaches and looking at DNA from very different sets of patients, has given us great confidence in the results,” said Giulio Genovese, a computational biologist at the Broad and first author of McCarroll’s paper. “It has been gratifying to have this corroboration of each other’s findings.”

Jaiswal will present the findings on Dec. 9 at the American Society of Hematology Annual Meeting in San Francisco.

All of the researchers involved emphasized that there is no clinical benefit today for testing for this pre-malignant state; there are no treatments currently available that would address this condition in otherwise healthy people. However, they say the results open the door to entirely new directions for blood cancer research, toward early detection and even prevention.

“The results demonstrate a way to identify high-risk cohorts—people who are at much higher than average risk of progressing to cancer—which could be a population for clinical trials of future prevention strategies,” McCarroll said. “The abundance of these mutated cells could also serve as a biomarker—like LDL cholesterol is for cardiovascular disease—to test the effects of potential prevention therapies in clinical trials.” 

Ebert agrees:

“A new focus of investigation will now be to develop interventions that might decrease the likelihood that individuals with these mutations will go on to develop overt malignancies, or therapeutic strategies to decrease mortality from other conditions that may be instigated by these mutations,” he said.

The researchers also say that the findings show just how important it is to collect and share large datasets of genetic information: Both studies relied on DNA samples collected for studies completely unrelated to cancer.

“These two papers are a great example of how unexpected and important discoveries can be made when creative scientists work together and with access to genomic and clinical data,” said Broad deputy director David Altshuler, HMS professor of genetics at Massachusetts General Hospital and one of Ebert’s co-authors.

“For example,” Altshuler said, “Steve’s team found stronger genetic relationships to cancer than they have yet found for the schizophrenia endpoint that motivated their original study. The pace of discovery can only accelerate if researchers have the ability to apply innovative methods to large datasets.”

McCarroll’s team was supported by the Stanley Center for Psychiatric Research, the National Human Genome Research Institute (NHGRI) and the National Institute of Mental Health (NIMH). Ebert’s team was funded by the National Institutes of Health (NIH), the Gabrielle’s Angel Foundation and the Leukemia and Lymphoma Society.

Genetic data for Ebert’s paper was collected with support from the NIH (T2D-GENES, Longevity Genes Project); the Medical Research Council and Wellcome Trust (Go-T2D); the Slim Initiative for Genomic Medicine in the Americas; and NHGRI and the National Heart, Lung, and Blood Institute and the National Institute on Minority Health and Health Disparities (Jackson Heart Study).

Adapted from a Broad Institute of MIT and Harvard news release.


Marching to Our Own Sequences

Study finds DNA replication timing varies among people

Steven McCarroll and Amnon Koren describe their surprising findings and how they made their discovery by tapping into an existing online database of genome sequencing data. Video: Rick Groleau and Stephanie Dutchen

Imagine being asked to copy a library of books. Doing it yourself would take forever. You’d probably call some friends and come up with a plan to divide and conquer.

That’s what a human cell does when faced with the task of replicating six billion letters of DNA each time it divides. Instead of reading each chromosome in one slow pass, DNA replication machinery dives in at many origin points. Some segments get copied earlier or later than others.

A new study from geneticists at Harvard Medical School and the Broad Institute of MIT and Harvard has found that this replication plan—including where the origin points are and in what order DNA segments get copied—varies from person to person.

The study, published online Nov. 13 in Cell, also identifies the first genetic variants that orchestrate replication timing.

“Everyone’s cells have a plan for copying the genome. The idea that we don’t all have the same plan is surprising and interesting,” said Steven McCarroll, assistant professor of genetics at HMS, director of genetics for the Stanley Center for Psychiatric Research at the Broad and senior author of the paper.

“It’s a new form of variation in people no one had expected,” said first author Amnon Koren, postdoctoral fellow at HMS and the Broad. “That’s very exciting.”

Hidden orchestrator

DNA replication is one of the most fundamental cellular processes, and any variation among people is likely to affect genetic inheritance, including individual disease risk as well as human evolution, the authors said.

It’s been known that replication timing affects mutation rates; DNA segments that are copied late or too early tend to have more errors. The new study indicates that people with different timing programs therefore have different patterns of mutation risk across their genomes.

For example, McCarroll’s team found that differences in replication timing could explain why some people are more prone than others to certain blood cancers.

Researchers had previously known that acquired mutations in the gene Janus kinase 2, or JAK2, lead to these cancers. They had also noticed that people with such JAK2 mutations tend to have a distinctive set of inherited genetic variants nearby, but they weren’t sure how the inherited variants and the new mutations were connected. McCarroll’s team found that the inherited variants are associated with an “unusually early” replication origin point and proposed that JAK2 is more likely to develop mutations in people with that very early origin point.

“Replication timing may be a way that inherited variation contributes to the risk of later mutations and diseases that we usually think of as arising by chance,” said McCarroll.

Untapped riches

McCarroll, Koren and colleagues were able to make these discoveries in large part because they invented a new way to obtain DNA replication timing data. As it turned out, the data were hiding in plain sight.

Until now, to study replication timing, scientists needed to painstakingly “grow cells for a couple of weeks and sort them with a special machine and do a big, complicated, expensive, time-consuming experiment”—all to obtain material from just a few people at a time, said Koren.

The team suspected there was an easier way. They turned to the 1000 Genomes Project, which maintains an online database of sequencing data collected from hundreds of people around the world.

Because much of the DNA in the 1000 Genomes Project had been extracted from actively dividing cells, the team hypothesized that information about replication timing lurked within.

They were right. They counted the number of copies of individual genes in each genome. Because early replication origins had created more segment copies at the time the sample was taken than late replication origins had, they were able to create a personalized replication timing map for each person.
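The copy-counting idea can be sketched in a few lines. This is a simplified illustration, not the team's actual pipeline: the segment names and read counts are hypothetical, and it assumes sequencing read depth per segment is a fair stand-in for copy number. Segments with more copies than the genome-wide median are treated as earlier replicating.

```python
from statistics import median


def relative_copy_number(read_counts):
    """Scale each segment's read count by the median count across segments."""
    mid = median(read_counts.values())
    if mid == 0:
        raise ValueError("median read count is zero; cannot normalize")
    return {segment: count / mid for segment, count in read_counts.items()}


def rank_by_timing(read_counts):
    """Order segments from most copies (assumed earliest) to fewest (assumed latest)."""
    relative = relative_copy_number(read_counts)
    return sorted(relative, key=relative.get, reverse=True)


# Hypothetical read counts for three DNA segments in one person's sample.
counts = {"segment_A": 150, "segment_B": 100, "segment_C": 80}

print(relative_copy_number(counts))
print(rank_by_timing(counts))
```

Repeating this for each person's sample would give one such profile per individual, which is the sense in which the map is "personalized."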

“People had seen these patterns before, but just dismissed them as artifacts of sequencing technology,” said McCarroll. After conducting numerous tests to rule out that possibility, “we found that they reflect real biology.”

The researchers then compared each person’s copy number information with his or her genetic sequence data to see if they could match specific genetic variants to replication timing differences. From 161 samples, they identified 16 variants. The variants were short, and most were common.

“I think this is the first time we can pinpoint genetic influences on replication timing in any organism,” said Koren.

The variants were located near replication origin points, leading the team to wonder if they affect replication timing by altering where a person’s origin points are. They also suspect that the variants work by altering chromatin structure, exposing local sequences to replication machinery. The team intends to find out. They also want to search for additional variants that control replication timing.

“These 16 variants are almost certainly just the tip of the iceberg,” said Koren.

The door is open

As more variants come to light in future studies, researchers will be better able to manipulate replication timing in the lab and learn more about how it works and what its biological significance is.

Such studies should flourish now that the team has shown that “all you need to do to study replication timing is grow cells and sequence their DNA, which everyone is doing these days,” said Koren. The new method “is much easier, faster and cheaper, and I think it will transform the field because we can now do experiments in large scale.”

“We found that there is biological information in genome sequence data,” added McCarroll. “But this was still an accidental biological experiment. Now imagine the results when we and others actually design experiments to study this phenomenon.”

This research was funded by the National Human Genome Research Institute (R01 HG 006855), the Integra-Life Seventh Framework Programme (grant #315997), the Stanley Center for Psychiatric Research, the Howard Hughes Medical Institute and the Harvard Stem Cell Institute.


New Branch Added to European Family Tree

Genetic analysis reveals Europeans descended from at least three ancient groups

This skull of a 7,000-year-old German farmer was among the ancient human bones that revealed more about the genetic heritage of present-day Europeans. Image: Joanna Drath/University of Tübingen

The setting: Europe, about 7,500 years ago.

Agriculture was sweeping in from the Near East, bringing early farmers into contact with hunter-gatherers who had already been living in Europe for tens of thousands of years.

Genetic and archaeological research in the last 10 years has revealed that almost all present-day Europeans descend from the mixing of these two ancient populations. But it turns out that’s not the full story.

Researchers at Harvard Medical School and the University of Tübingen in Germany have now documented a genetic contribution from a third ancestor: Ancient North Eurasians. This group appears to have contributed DNA to present-day Europeans as well as to the people who traveled across the Bering Strait into the Americas more than 15,000 years ago.

“Prior to this paper, the models we had for European ancestry were two-way mixtures. We show that there are three groups,” said David Reich, professor of genetics at HMS and co-senior author of the study.

“This also explains the recently discovered genetic connection between Europeans and Native Americans,” Reich added. “The same Ancient North Eurasian group contributed to both of them.”

The research team also discovered that ancient Near Eastern farmers and their European descendants can trace much of their ancestry to a previously unknown, even older lineage called the Basal Eurasians.

The study was published online Sept. 17 in Nature.

Peering into the past

To probe the ongoing mystery of Europeans’ heritage and their relationships to the rest of the world, the international research team—including co-senior author Johannes Krause, professor of archaeo- and paleogenetics at the University of Tübingen and co-director of the new Max Planck Institute for History and the Sciences in Jena, Germany—collected and sequenced the DNA of more than 2,300 present-day people from around the world and of nine ancient humans from Sweden, Luxembourg and Germany.

The ancient bones came from eight hunter-gatherers who lived about 8,000 years ago, before the arrival of farming, and one farmer from about 7,000 years ago.

The researchers also incorporated into their study genetic sequences previously gathered from ancient humans of the same time period, including early farmers such as Ötzi “the Iceman.”

“There was a sharp genetic transition between the hunter-gatherers and the farmers, reflecting a major movement of new people into Europe from the Near East,” said Reich.

Ancient North Eurasian DNA wasn’t found in either the hunter-gatherers or the early farmers, suggesting the Ancient North Eurasians arrived in the area later, he said.

“Nearly all Europeans have ancestry from all three ancestral groups,” said Iosif Lazaridis, a research fellow in genetics in Reich’s lab and first author of the paper. “Differences between them are due to the relative proportions of ancestry. Northern Europeans have more hunter-gatherer ancestry—up to about 50 percent in Lithuanians—and Southern Europeans have more farmer ancestry.”

Lazaridis added, “The Ancient North Eurasian ancestry is proportionally the smallest component everywhere in Europe, never more than 20 percent, but we find it in nearly every European group we’ve studied and also in populations from the Caucasus and Near East. A profound transformation must have taken place in West Eurasia” after farming arrived.
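One way to picture a three-way mixture is as a weighted average: the expected frequency of a genetic variant in a present-day group is the sum of each ancestral group's frequency, weighted by that group's share of ancestry. The sketch below uses made-up proportions and frequencies purely for illustration; it is not the statistical model used in the study.

```python
def mixture_frequency(proportions, source_freqs):
    """Expected variant frequency as an ancestry-weighted average of source frequencies."""
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("ancestry proportions must sum to 1")
    return sum(proportions[group] * source_freqs[group] for group in proportions)


# Hypothetical ancestry shares for one present-day group:
# WHG = West European hunter-gatherers, EEF = early European farmers,
# ANE = Ancient North Eurasians (kept below 20 percent, as in the text).
proportions = {"WHG": 0.45, "EEF": 0.40, "ANE": 0.15}

# Hypothetical frequencies of a single variant in each ancestral group.
source_freqs = {"WHG": 0.8, "EEF": 0.2, "ANE": 0.5}

print(round(mixture_frequency(proportions, source_freqs), 3))
```

Working backward from observed frequencies at many variants to estimate the proportions is the harder statistical problem, and it is not attempted here.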

When this research was conducted, Ancient North Eurasians were a “ghost population”—an ancient group known only through the traces it left in the DNA of present-day people. Then, in January, a separate group of archaeologists found the physical remains of two Ancient North Eurasians in Siberia. Now, said Reich, “We can study how they’re related to other populations.”

Room for more

The team was able to go only so far in its analysis because of the limited number of ancient DNA samples. Reich thinks there could easily be more than three ancient groups who contributed to today’s European genetic profile.

He and his colleagues found that the three-way model doesn’t tell the whole story for certain regions of Europe. Mediterranean groups such as the Maltese, as well as Ashkenazi Jews, had more Near East ancestry than anticipated, while far northeastern Europeans such as Finns and the Saami, as well as some northern Russians, had more East Asian ancestry in the mix.

The most surprising part of the project for Reich, however, was the discovery of the Basal Eurasians.

“This deep lineage of non-African ancestry branched off before all the other non-Africans branched off from one another,” he said. “Before Australian Aborigines and New Guineans and South Indians and Native Americans and other indigenous hunter-gatherers split, they split from Basal Eurasians. This reconciled some contradictory pieces of information for us.”

Revised flow chart of European ancestry incorporating the new data about Ancient North Eurasians (ANE), West European hunter-gatherers (WHG), early European farmers (EEF) and Basal Eurasians. Image courtesy David Reich

Next, the team wants to figure out when the Ancient North Eurasians arrived in Europe and to find ancient DNA from the Basal Eurasians.

“We are only starting to understand the complex genetic relationship of our ancestors,” said co-author Krause. “Only more genetic data from ancient human remains will allow us to disentangle our prehistoric past.”

“There are important open questions about how the present-day people of the world got to where they are,” said Reich, who is a Howard Hughes Medical Institute investigator. “The traditional way geneticists study this is by analyzing present-day people, but this is very hard because present-day people reflect many layers of mixture and migration.

“Ancient DNA sequencing is a powerful technology that allows you to go back to the places and periods where important demographic events occurred,” he said. “It’s a great new opportunity to learn about human history.”

This project was supported in part by the National Cancer Institute (HHSN26120080001E and NIH/NCI Intramural Research Program), National Institute of General Medical Sciences (GM100233 and GM40282), National Human Genome Research Institute (HG004120 and HG002385), an NIH Pioneer Award (8DP1ES022577-04), National Science Foundation (HOMINID awards BCS-1032255 and BCS-0827436 and grant OCI-1053575), Howard Hughes Medical Institute, German Research Foundation (DFG) (KR 4015/1-1), Carl-Zeiss Foundation, Baden Württemberg Foundation and the Max Planck Society.


Genes and Immunity

T lymphocyte activation integrates immunologic and genetic history

Image: National Human Genome Research Institute

Researchers from Harvard Medical School and the Broad Institute of MIT and Harvard have uncovered unexpectedly complex patterns in the T lymphocyte responses that individual people mount, reflecting environmental influences as well as a genetic component. The study lays the groundwork for further explorations into the relative contributions of genes and their environment on immunological processes, the scientists said, which could illuminate autoimmune disease and its genetic underpinnings.

The findings are reported in Science and stem from the ImmVar Project, a wide-ranging analysis of variation in gene expression in the immune system. Christophe Benoist, Morton Grove-Rasmussen Professor of Immunohematology at HMS, and Aviv Regev, a Broad Institute core member, an associate professor at MIT and a Howard Hughes Medical Institute investigator, led the third and final phase, which focused on CD4+ T cells, immune cells that are major players in autoimmune disease.

In this study, after the scientists accounted to the best possible extent for environmental influences and immunological history, they still found that the ancestry of the donor significantly affected T cell responses. “There is a signature of variation in adaptive immune response,” Benoist said. “In general, there is stronger activation of some genes in people of African ancestry, in particular for a type of response in T helper 17 (Th17) cells that tend to protect us from microbes that enter airways or the intestinal tract. Those responses are also highly involved in autoimmune disease.”

“The combination of careful immunological work, high-throughput assays, and sophisticated analytics essential to dissect such a complex system could only have happened within the partnership of the ImmVar consortium, bringing together the expertise of immunologists and clinicians in the Harvard-affiliated hospitals with genomics and computational experts at the Broad and MIT,” Regev said.

In autoimmune diseases such as rheumatoid arthritis, inflammatory bowel disease and multiple sclerosis, immune cells mistakenly attack the body’s own tissues as if they were invaders. In healthy people, the immune system achieves a state of tolerance, quelling defensive measures after a threat has abated.

Scientists have previously identified genes that are important in controlling the autoimmune response, but this is the first time that differences in T cell activation between population groups have been revealed.

In the current study, the scientists analyzed blood samples collected from 348 healthy volunteers representing African, Asian or European ancestry. After the researchers genotyped the samples and isolated CD4+ T cells, the T cells were activated in cell culture to model their response to antigens. A computational analysis measured which genes were turned on or off in the cells from each person.

Activation of autoimmune-associated genes can vary between individuals in a complicated interplay of genes and environment. Each person’s immunological history is written in a constellation of events, from being vaccinated against the measles in childhood to having the flu last winter. Benoist compares it to learning and personality: All the memories you accumulate make you who you are.

In one’s immunological history, “environment” also encompasses the microbial world people inhabit. The hygiene hypothesis holds that people who have encountered more challenges to their immune system—harmful microbes—are less likely to have the runaway response that is the hallmark of autoimmune disease. People who grow up exposed to fewer microbes may have difficulty stopping the immune response when it is no longer needed.

There is a strong inherited component to autoimmune disease, but changing one’s environment is also important, Benoist noted. People who relocate to a new region tend to acquire that region’s frequency of autoimmune disease, observational research has reported. For example, he said, there is little autoimmune disease in India, but people of Indian origin who have lived in the U.S. from an early age have about the same frequency of autoimmune disease as people of European origin who also live in the U.S.

One possibility is that at least some of this variation may reflect evolutionary adaptations to the pathogens people encountered during human migrations out of Africa 50,000 years ago. A more robust immune response would have been advantageous in sub-Saharan Africa but deleterious at higher latitudes, with fewer microbial pathogens.

“It’s a tantalizing idea, but it’s highly speculative,” Benoist said.

This work was supported by National Institute of General Medical Sciences grant RC2 GM093080, NIH F32 Fellowship (F32 AG043267), HHMI, and a Harry Weaver Neuroscience Scholar Award from the National Multiple Sclerosis Society (JF2138A1).