Native Americans living in the Amazon bear an unexpected genetic connection to indigenous people in Australasia, suggesting a previously unknown wave of migration to the Americas thousands of years ago, a new study has found.
“It’s incredibly surprising,” said David Reich, Harvard Medical School professor of genetics and senior author of the study. “There’s a strong working model in archaeology and genetics, of which I have been a proponent, that most Native Americans today extend from a single pulse of expansion south of the ice sheets—and that’s wrong. We missed something very important in the original data.”
Previous research had shown that Native Americans from the Arctic to the southern tip of South America can trace their ancestry to a single “founding population” called the First Americans, who came across the Bering land bridge about 15,000 years ago. In 2012, Reich and colleagues enriched this history by showing that certain indigenous groups in northern Canada inherited DNA from at least two subsequent waves of migration.
The new study, published July 21 in Nature, indicates that there’s more to the story.
Pontus Skoglund, first author of the paper and a postdoctoral researcher in the Reich lab, was studying genetic data gathered as part of the 2012 study when he noticed a strange similarity between one or two Native American groups in Brazil and indigenous groups in Australia, New Guinea and the Andaman Islands.
“That was an unexpected and somewhat confusing result,” said Reich, who is also an associate member of the Broad Institute of Harvard and MIT and a Howard Hughes Medical Institute investigator. “We spent a really long time trying to make this result go away and it just got stronger.”
Skoglund and colleagues from HMS, the Broad and several universities in Brazil analyzed publicly available genetic information from 21 Native American populations from Central and South America. They also collected and analyzed DNA from nine additional populations in Brazil to make sure the link they saw hadn’t been an artifact of how the first set of genomes had been collected. The team then compared those genomes to the genomes of people from about 200 non-American populations.
The link persisted. The Tupí-speaking Suruí and Karitiana and the Ge-speaking Xavante of the Amazon had a genetic ancestor more closely related to indigenous Australasians than to any other present-day population. This ancestor doesn’t appear to have left measurable traces in other Native American groups in South, Central or North America.
The genetic markers from this ancestor don’t match any population known to have contributed ancestry to Native Americans, and the geographic pattern can’t be explained by post-Columbian European, African or Polynesian mixture with Native Americans, the authors said. They believe the ancestry is much older—perhaps as old as the First Americans.
In the ensuing millennia, the ancestral group has disappeared.
“We’ve done a lot of sampling in East Asia and nobody looks like this,” said Skoglund. “It’s an unknown group that doesn’t exist anymore.”
The team named the mysterious ancestor Population Y, after the Tupí word for ancestor, “Ypykuéra.”
Reich, Skoglund and colleagues propose that Population Y and First Americans came down from the ice sheets to become the two founding populations of the Americas.
“We don’t know the order, the time separation or the geographical patterns,” said Skoglund.
Researchers do know that the DNA of First Americans looked similar to that of Native Americans today. Population Y is more of a mystery.
“About 2 percent of the ancestry of Amazonians today comes from this Australasian lineage that’s not present in the same way elsewhere in the Americas,” said Reich.
However, that doesn’t establish how much of their ancestry comes from Population Y. If Population Y were 100 percent Australasian, that would indeed mean they contributed 2 percent of the DNA of today’s Amazonians. But if Population Y mixed with other groups such as the First Americans before they reached the Americas, the amount of DNA they contributed to today’s Amazonians could be much higher—up to 85 percent.
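The arithmetic behind that range can be sketched in a few lines. This is a minimal illustration under a simple proportional assumption, not the study’s statistical method: if Population Y itself carried some fraction of Australasian-related ancestry, and Amazonians today carry about 2 percent of that ancestry, then Population Y’s share of Amazonian ancestry is 2 percent divided by that fraction. The function name and the model are hypothetical.

```python
def population_y_share(australasian_in_amazonians: float,
                       australasian_in_pop_y: float) -> float:
    """Share of present-day Amazonian ancestry attributable to Population Y.

    Illustrative assumption: all Australasian-related ancestry in Amazonians
    arrived via Population Y, which carried a fixed fraction of it.
    """
    if not 0 < australasian_in_pop_y <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return australasian_in_amazonians / australasian_in_pop_y


# Population Y fully Australasian: its share equals the observed 2 percent.
print(population_y_share(0.02, 1.0))
# Population Y only lightly Australasian: its share is far larger.
print(population_y_share(0.02, 0.02 / 0.85))
```

Under this toy model, the 85 percent upper bound corresponds to a Population Y that was itself only about 2.4 percent Australasian-related by the time it contributed to Amazonians.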
To answer that question, researchers would need to sample DNA from the remains of a person who belonged to Population Y. Such DNA hasn’t been obtained yet. One place to look might be in the skeletons of early Native Americans whose skulls some researchers say have Australasian features. The majority of these skeletons were found in Brazil.
Reich and Skoglund think that some of the most interesting open questions about Native American population history are about the relationships among groups after the initial migrations.
“We have a broad view of the deep origins of Native American ancestry, but within that diversity we know very little about the history of how those populations relate to each other,” said Reich.
This work was supported by the National Science Foundation (HOMINID grant BCS-1032255), the National Institutes of Health (GM100233), the Simons Foundation (grant 280376), the Howard Hughes Medical Institute, the Conselho Nacional de Desenvolvimento Científico e Tecnológico and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Brazil), the Wenner-Gren Foundation and the Swedish Research Council (VR grant 2014-453).
A team of Harvard Medical School researchers at Massachusetts General Hospital has found a way to expand the use and precision of powerful gene-editing tools called CRISPR-Cas9 RNA-guided nucleases.
In their report in Nature, the investigators describe evolved versions of the DNA-cutting Cas9 enzyme that can recognize a different range of nucleic acid sequences than is now possible with the naturally occurring form of Cas9 scientists have been using.
“In our paper we show that sites in human and zebrafish genes that could not previously be modified by wild-type Cas9 can now be targeted with the new variants we have engineered,” said Benjamin Kleinstiver, HMS research fellow in pathology at Mass General and lead author of the Nature paper. “This will allow researchers to target an expanded range of sites in a variety of genomes, which will be useful for applications requiring highly precise targeting of DNA sequences.”
CRISPR-Cas9 nucleases consist of the Cas9 bacterial enzyme and a short, 20-nucleotide RNA molecule that matches the target DNA sequence. In addition to the RNA/DNA match, the Cas9 enzyme needs to recognize a specific nucleotide sequence called a protospacer adjacent motif (PAM) adjacent to the target DNA.
The most commonly used form of Cas9, derived from the bacterium Streptococcus pyogenes and known as SpCas9, recognizes PAM sequences in which any nucleotide is followed by two guanine DNA bases. This limits the DNA sequences that can be targeted using SpCas9 to only those that include two sequential guanines.
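To make the PAM constraint concrete, here is a minimal sketch that scans one strand of a DNA string for 20-nucleotide targets followed immediately by a PAM of the form "any nucleotide, then two guanines." The function name is hypothetical, the sketch ignores the opposite strand, and it is not the tooling used in the study.

```python
import re


def find_spcas9_sites(dna: str, guide_len: int = 20):
    """Yield (start, protospacer, pam) for each NGG PAM on the given strand."""
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    for match in pattern.finditer(dna):
        yield match.start(), match.group(1), match.group(2)


# Toy sequence: a 20-nucleotide stretch followed by the PAM "TGG".
for start, protospacer, pam in find_spcas9_sites("A" * 20 + "TGG" + "CCCC"):
    print(start, protospacer, pam)
```

A sequence with no pair of adjacent guanines in the right position yields no sites at all, which is the limitation the evolved variants are meant to relax.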
To get around this limitation, the team set up an engineering system that allowed them to rapidly evolve the ability of SpCas9 to recognize different PAM sequences. From a collection of randomly mutated SpCas9 variants, they identified combinations of mutations that enabled SpCas9 to recognize new PAM sequences.
These evolved variants essentially double the range of sites that can now be targeted for gene editing using SpCas9. They also identified an SpCas9 variant that was less likely to induce the off-target gene mutations sometimes produced by CRISPR-Cas9 nucleases, a problem originally described in a 2013 study led by J. Keith Joung, HMS professor of pathology at Mass General and senior author of the current study.
“This additional evolved variant with increased specificity should be immediately useful to all researchers who currently use wild-type SpCas9 and should reduce the frequencies of unwanted off-target mutations,” Joung said. “Perhaps more important, our findings provide the first demonstration that the activities of SpCas9 can be altered by directed protein evolution.”
The scientists showed in their paper that the forms of Cas9 found in two other bacteria, Staphylococcus aureus and Streptococcus thermophilus, can also function in their bacterial evolution system, suggesting that their functions can be modified as well, Joung said.
“This work just scratches the surface of the range of PAMs that can be targeted by Cas9,” Joung said. “We believe that other useful properties of the enzyme may be modified by a similar approach, allowing potential customization of many important features.”
The study was supported by National Institutes of Health Director’s Pioneer Award DP1 GM105378, NIH grants R01 GM107427 and R01 GM088040, a Jim and Ann Orr Research Scholar Award, and a Natural Sciences and Engineering Research Council of Canada Postdoctoral Fellowship. The MGH has filed a patent application on the use of the SpCas9 variants described in this paper.
Adapted from a Mass General news release.
Geneticists just got a new pair of glasses.
By improving an imaging technology called FISH, they’ve made it possible to view genetic material in more detail than ever before.
One modification achieves “super resolution” imaging. The other can distinguish maternal from paternal chromosomes.
What the researchers are starting to see promises to help them better understand how DNA gets packaged into chromosomes, how that structure relates to health and disease, what the biological significance may be when genetic material is inherited from one parent versus the other—and more.
“Geneticists have looked at chromosomes for over a century, but they’ve longed for higher resolution and an easier and more affordable strategy for looking at any part of any genome, not just the most accessible regions,” said Ting Wu, professor of genetics at Harvard Medical School and senior author of the study, published in Nature Communications. “With this new technology, we’ve enabled significant steps forward in all those regards.”
“Scientists have a lot of models where we draw in cartoon form what we think is happening,” said the study’s first author, Brian Beliveau, who conducted the work as a graduate student in the Wu lab. “When a region of the genome is silenced, we draw it as compacted. When a gene is expressed, we draw it as more open or active. But we don’t actually know yet what any of these look like.”
“Now we have a tool to start looking at this systematically, which is very exciting,” said Beliveau, currently a research fellow in the lab of HMS associate professor of systems biology Peng Yin at the Wyss Institute for Biologically Inspired Engineering.
Teach a person to FISH…
FISH, short for fluorescence in situ hybridization, is a decades-old visualization technique that locates and illuminates specific genes in the nuclei of cells.
Scientists use FISH to find out things like how many chromosomes a person has or where a particular gene is located on a chromosome.
But conventional FISH has its limits, and scientists want to look closer.
For instance, it hasn’t been able to show the details of how chromosomes are folded, whether there are different ways of folding or how two genome segments interact—all of which would help elucidate how our bodies work.
Researchers have had some success getting high-resolution FISH images of small segments of the genome. But the cost to look at such questions genome-wide has been “beyond anybody’s purse,” said Wu. “The flexibility to target a specific region on the genome has been very difficult also.”
In 2012, Beliveau devised an enhancement to FISH called Oligopaint that made this possible at low cost.
Traditional FISH works by attaching a fluorescent tag to a short, single strand of DNA called a probe, which has a sequence complementary to the segment of DNA a researcher wants to study. When released into a cell, the probe binds to the desired sequence in the nucleus and lights it up so it can be seen under a microscope.
A limiting factor to scaling up FISH has been that “it’s a ton of work to try to find those sequences in nature to make the necessary DNA probes and isolate them in a way that would be compatible with the tool,” said Beliveau.
He solved the problem by developing computer software that lets scientists design the probes they need and then build them with synthetic components—hundreds or thousands at a time.
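The core idea of computational probe design can be illustrated with a toy sketch: walk along a target sequence, cut it into fixed-length windows and emit the reverse complement of each window as a probe. This is not the Oligopaint software, which the article does not describe in detail; the function names, the 32-nucleotide default length and the spacing parameter are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target: str, probe_len: int = 32, spacing: int = 0):
    """Return (start, probe) pairs tiling the target with complementary oligos.

    probe_len and spacing are hypothetical defaults, not published settings.
    """
    probes = []
    step = probe_len + spacing
    for start in range(0, len(target) - probe_len + 1, step):
        window = target[start:start + probe_len]
        probes.append((start, reverse_complement(window)))
    return probes


print(tile_probes("AAAACCCC", probe_len=4))
```

Because the probes are generated from sequence alone, any region with a known sequence can in principle be targeted, and a whole library can be listed in one pass.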
In the new paper, Beliveau taught Oligopaint two new tricks.
First, he paired it up with two other technologies—STORM from the lab of Xiaowei Zhuang at Harvard University, and DNA-PAINT from Peng Yin’s lab—to zoom in on chromosomes at “super resolution.”
The DNA double helix is about 2 nanometers wide. When it gets wound around a bundle called a nucleosome, one of the simplest building blocks of chromosomes, it grows to about 10 nanometers. Since conventional microscopy can resolve images to only about 200 to 300 nanometers, it’s been impossible to see such infinitesimal structures.
The new Oligopaint combinations can bring chromosomes into focus down to about 20 nanometers. That’s possible because each DNA probe is able to bind to a second probe; the resulting fluorescent output blinks, enabling researchers to distinguish the fluorescent tags one at a time and achieve a finer resolution.
“This is an unprecedented level of detail,” said Wu.
“Right now when you do FISH on structures below a certain size, you get a spot,” said Beliveau. “With these technologies, we’re seeing interesting shapes and structures, loops and protrusions, start to drop out of these things.”
Wu, Beliveau and their colleagues have already begun to find “intriguing organizational themes” in the chromatin of mammalian and fruit fly cells. But they consider the current paper mostly a proof of concept.
“The goal is to communicate these techniques to the research community as quickly as possible so everyone can start making discoveries,” said Beliveau.
They hope researchers generate “a huge bolus of images” that will reveal more about what our genetic structure looks like, Wu added.
“Of course, naturally, if we get down to 20 nanometers then people will want 10 nanometers, and then 5,” she said wryly.
Distinguishing mom from dad
We normally inherit our chromosomes in pairs: one from our mother and one from our father. The second trick Beliveau taught Oligopaint was to tell them apart.
“This was not thought possible but, to our surprise, our technology managed to break this barrier too,” said Wu.
Beliveau modified his Oligopaint probes so they could detect the presence or absence of the many single nucleotide variants, or SNPs, that distinguish maternal from paternal chromosomes.
As long as researchers know which SNPs to look for and have hundreds or thousands of them to target—just a few isn’t enough—Oligopaint can now light up a chromosome to identify “mom” or “dad.”
Wu, for one, is excited to use this new ability to study things like how one of two X chromosomes gets inactivated in female mammals. It’s also become clear to scientists that maternal and paternal genes don’t behave the same way, and that this might have effects on human health and disease. Wu is particularly interested in Down syndrome, in which children inherit three instead of two copies of chromosome 21.
“I’d love to look at a cell and ask, are two copies from the mother? From the father? How do they behave relative to each other?” she said.
This study was supported by the National Institutes of Health (R01GM61936, R01GM090278, 1R01EB018659, 5DP1GM106412, F32CA157188, 1DP2OD007292, 5R21HD072481, 1DP2OD004641), National Science Foundation (CCF1054898, CCF1162459), Office of Naval Research (N000141110914, N000141010827, N000141310593), Harvard Medical School, Wyss Institute for Biologically Inspired Engineering, Centre National de la Recherche Scientifique, Howard Hughes Medical Institute, Damon Runyon Cancer Research Foundation, Fulbright Visiting Scholar Program and Alexander von Humboldt-Foundation.
Imagine someone hands you a smoothie and asks you to identify everything that went into it.
You might be able to discern a hint of strawberry or the tang of yogurt. But overall it tastes like a blend of indiscernible ingredients.
Now imagine that the smoothie is made of 20,000 ground-up cells from, say, the brain.
You could run tests to determine what molecules are in the sample, which is what scientists do now. That would certainly give you useful information, but it wouldn’t tell you which cells those molecules originally came from. It would provide only an average cell profile for the whole smoothie.
And when it comes to the tissues in our bodies, averages are almost always misleading. Just as you know there isn’t an “average” food called strawbanaspinach-orangegurt, scientists know there isn’t just one cell type in the brain.
“If you take a hunk of tissue and grind it up and analyze the RNA, you have no idea if it represents what every cell in that population is doing or what no cell in the population is doing,” said Marc Kirschner, the John Franklin Enders University Professor of Systems Biology and chair of the Department of Systems Biology at Harvard Medical School. “Imagine if you had a population of men and women. If you assume everyone is an average of men and women, you [probably] wouldn’t represent a single person in that population.”
The trouble is, it’s expensive, time-consuming and tricky to characterize tissues one cell, or cell type, at a time.
Kirschner and Steven McCarroll, assistant professor of genetics at HMS, reported this week in separate papers that their labs have developed high-throughput techniques to quickly, easily and inexpensively give every cell in a sample a unique genetic barcode before it goes into the blender.
As a result, scientists can analyze complex tissues by profiling each individual cell—no averaging required.
“Different cells in a tissue use the same genome in amazingly diverse ways: to engineer specialized cell shapes, accomplish diverse feats of physiology, and mount distinct functional responses to the same stimulus. These techniques will finally let science understand how biological systems operate at that single-cell level,” said McCarroll, who is also director of genetics for the Stanley Center for Psychiatric Research at the Broad Institute of Harvard and MIT. “We are so excited about the work ahead.”
To make their tools, both teams collaborated with David Weitz, the Mallinckrodt Professor of Physics and Applied Physics at Harvard’s School of Engineering and Applied Sciences and a pioneer in the field of microfluidics.
The teams expect that their techniques, published concurrently in the journal Cell, will equip biologists to discover and classify cell types in the body in much greater depth, map cell diversity in complex tissues such as the brain, better understand stem cell differentiation and gain more insights into the genetics of disease.
Harvard’s Office of Technology Development has been working closely with the researchers to develop patent applications for various aspects of the technology, all with an eye toward commercialization.
‘Two roads diverged in a yellow wood’
Unbeknownst to each other, Evan Macosko and Allon Klein decided to develop methods to answer the same question: How could they obtain gene expression profiles for thousands of individual cells to better understand the complexity of gene expression within a tissue?
Gene expression—the pattern of gene activity in a particular cell—underlies every process in biology, from cognition in the brain to development in the egg. Scientists have known for 50 years that gene expression varies from cell to cell like a fingerprint, making skin cells different from liver cells and making some liver cells different from others. But they haven’t been able to measure it efficiently at the single-cell level in samples with many cell types.
Macosko, HMS instructor in psychiatry at Massachusetts General Hospital and a Stanley Neuroscience Fellow in the McCarroll lab, came up with a technique he called Drop-seq. Klein, assistant professor of systems biology at HMS, devised a method he called indexing droplets for sequencing, or inDrops.
Last fall, they learned about each other’s work through the scientific conference circuit.
“It was kind of like meeting your doppelgänger,” said Macosko. “He had been thinking about the same things I had for two years. Human beings have different ways of solving problems, and it was really cool to see how he did it.”
How they work
The teams each developed ways of using tiny beads to deliver vast numbers of different DNA barcodes into hundreds of thousands of nanoliter-sized water droplets simultaneously.
Thanks to Weitz’s expertise, both methods were able to use microfluidic devices to co-encapsulate cells in these droplets along with the beads. The droplets get created in a tiny assembly line, streaming along a channel the width of a human hair.
The bead barcodes get attached to the genes in each cell, so that scientists can sequence the genes all in one batch and still trace each gene back to the cell it came from.
Macosko and Klein make their beads in different ways. The droplets get broken up at different steps in the process. Other aspects of the chemistry diverge. But the result is the same.
After running a single batch of cells through Drop-seq or inDrops, scientists “can see which genes are expressed in the entire sample—and can sort by each individual cell,” said Klein.
They can then use computer software to uncover patterns in the mix, including which cells have similar gene expression profiles. That provides a way to classify what cell types were in the original tissue—and to possibly discover new ones.
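The barcode logic described above can be sketched in a few lines. This is a toy illustration, not the Drop-seq or inDrops analysis software: it assumes each sequenced read has already been reduced to a (cell barcode, gene) pair, groups reads by barcode to rebuild a per-cell profile, and compares two profiles with cosine similarity as one simple stand-in for "similar gene expression." The function names, barcodes and gene labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def count_by_cell(reads):
    """Build a per-cell gene count table from (barcode, gene) pairs."""
    table = defaultdict(Counter)
    for barcode, gene in reads:
        table[barcode][gene] += 1
    return table


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two gene count profiles (0.0 if either is empty)."""
    dot = sum(a[g] * b[g] for g in set(a) | set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Toy data: reads pooled from three cells and sequenced in one batch.
reads = [("AAAC", "geneA"), ("AAAC", "geneA"), ("GGTT", "geneB"),
         ("CCAT", "geneA"), ("CCAT", "geneA")]
cells = count_by_cell(reads)
print(cosine_similarity(cells["AAAC"], cells["CCAT"]))
print(cosine_similarity(cells["AAAC"], cells["GGTT"]))
```

Cells whose profiles score close to each other would be candidates for the same cell type; real analyses use normalization and clustering methods that this sketch leaves out.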
Current methods allow researchers to generate 96 single-cell expression profiles in a day for several thousand dollars. Drop-seq, by comparison, enables 10,000 profiles a day for 6.5 cents each.
“If you’re a biologist with an interesting question in mind, this approach could shine a light on the problem without bankrupting you,” said Macosko. “It finally makes gene expression profiling on a cell-by-cell level tractable and accessible. I think it’s something biologists in a lot of fields will want to use.”
Rather than competing with each other, the teams believe that having two options available in Drop-seq and inDrops will benefit the scientific community.
“Each method has unique elements that make it better for different applications. Biologists will be able to choose which one is most appropriate for them,” said Macosko.
McCarroll, Macosko and their colleagues are excited to explore the brain with Drop-seq.
With luck, that will include discovering new cell types, constructing a global architecture of those cell types in the brain and understanding brain development and function as they relate to disease.
Among the questions they want to pursue are: What are all the cell types that make the brain work? How do these cell types vary in their functions and responses to stimuli? What cell populations are missing or malfunctioning in schizophrenia, autism and other disorders of the brain?
Classifying cell types may not sound exciting, said Joshua Sanes, the Jeff C. Tarr Professor of Molecular and Cellular Biology and the Paul J. Finnegan Family Director of the Center for Brain Science at Harvard University and a co-author of the Drop-seq paper, but it lays the foundation for mapping neuronal circuits and one day being able to probe the mystery of how the “wetware” of the brain gives rise to thoughts, emotions and behaviors.
In the shorter term, Sanes looks forward to completing a catalog of cell types in the mouse retina. Drop-seq has already revealed several new ones.
Kirschner, Klein and their colleagues, meanwhile, are keenly interested in other areas, including stem cell development.
“Does a population of cells that we initially think is uniform actually have some substructure?” Klein wants to know; he’s trying to find out by studying immune cells and different kinds of adult stem cells. “What is the nature of an early developing stem cell? What endows those cells with a pluripotent state? Is gene expression more plastic or does it have a well-defined state that’s different from a more mature cell? How is its fate determined?”
Using inDrops, Klein and team have confirmed prior findings that suggest even embryonic stem cells are not uniform. They found previously undiscovered cell types in the population they studied, as well as cells in intermediate stages that they suspect are converting from one type to another.
Although both teams are excited by the massive amounts of data they and other researchers will obtain from Drop-seq and inDrops, they realize the sheer volume of information poses a problem as well.
“We have thousands of cells expressing tens of thousands of genes. We can’t look in 20,000 directions to pick out interesting features,” said Klein.
Machine learning is able to do some of that, and the teams have already employed new statistical techniques. Still, Kirschner has called on mathematicians and computer scientists to develop new ideas about how to analyze and extract useful information about our biology from the mountains of data that are on the horizon.
Financial disclosures and funding information
Allon Klein, Linas Mazutis, Ilke Akartuna, David Weitz and Marc Kirschner have submitted patent applications (US62/065,348, US62/066,188, US62/072,944) for the work described.
A patent application has also been filed for the work described by Macosko et al.
The Kirschner lab’s study was supported by the National Institutes of Health (SCAP Grant R21DK098818), a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, and a Marie Curie International Outgoing Fellowship (300121).
The McCarroll lab’s work was supported by the Stanley Center for Psychiatric Research, the Simons Foundation, the National Institutes of Health (P50HG006193, U01MH105960, R25MH094612, F32HD075541), the Klarman Cell Observatory, a Stewart Trust Fellows Award and the Howard Hughes Medical Institute.
Microfluidic device fabrication was performed at the Harvard Center for Nanoscale Systems, a member of the National Nanotechnology Infrastructure Network, with support from the National Science Foundation and the Harvard Materials Research Science and Engineering Center.
Tiny hair cells in the inner ear play an outsized role.
For balance, five separate patches of hair cells sense movement and tell the brain where the head is in space while translating the pull of gravity.
For hearing, a five cell-wide ribbon of 16,000 hair cells spirals inside the cochlea, the snail-shaped structure where hair cells vibrate in response to sound waves. Every cycle of sound waves sends microscopic cilia on the tips of these cells back and forth, riding a trampoline of cells suspended between two fluid-filled spaces.
The movement opens pores in the cells, allowing electrical current to flow inside. This conversion of mechanical to electrical signals sends nerve impulses to the brain, which then “hears” the sound.
David Corey, the Bertarelli Professor of Translational Medical Science at Harvard Medical School and a Howard Hughes Medical Institute investigator, has spent his scientific life studying this mechanosensory apparatus, asking which proteins are involved in converting sounds into nerve impulses. So far, only about one-third of those proteins are known. Déborah Scheffer, HMS research associate in the Corey lab, is interested in what makes hair cells different from the cells that surround them in the inner ear.
Their most recent work, published in The Journal of Neuroscience, has revealed that many genes implicated in hereditary deafness are much more active in hair cells than in surrounding cells. This suggests that other genes that produce proteins only in hair cells might also cause inherited deafness. About 1 in 1,000 children are born deaf, and mutations in as many as 300 different genes might cause deafness.
Their findings may also have implications for age-related hearing loss, which affects about half of adults aged 75 or older. Sometimes impairment occurs much earlier, after exposure to harmful amounts of noise.
“This work gives us a parts list that the hair cell uses to assemble different components, helping us figure out the molecular mechanism of sensing sound,” Corey said. “It also tells us which genes drive the unique development of a hair cell, raising the hope that this information can be used to create new hair cells to restore hearing in cases of age- or noise-related hearing loss.”
To understand the hair cells of the inner ear, Corey and Scheffer collaborated with colleagues Jun Shen, HMS instructor in pathology at Brigham and Women’s Hospital, and Zheng-Yi Chen, associate professor of otolaryngology at Massachusetts Eye and Ear, to find out which genes hair cells use that neighboring cells do not. What makes one cell different from another is the choice and the timing of the genes expressed by a cell.
Working in mice engineered to make green fluorescent hair cells, Scheffer devised a way to separately purify hair cells and surrounding cells at different points over about two weeks of mouse inner-ear development. At each point and for each sample, she sequenced the RNA used to make proteins for all 20,000 genes in the mouse genome.
“Now we have a panel of all the genes that are involved in hair cell development,” Scheffer said.
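One simple way to ask which genes are "much more active" in hair cells than in their neighbors is a log2 ratio of expression counts. The sketch below is an illustration only, not the analysis used in the paper: it assumes two dictionaries of already comparable counts, adds a pseudocount to avoid dividing by zero, and flags genes above a threshold. The function names, the pseudocount and the threshold are assumptions.

```python
import math


def log2_enrichment(hair_cell_counts, surrounding_counts, pseudocount=1.0):
    """Return {gene: log2(hair cell / surrounding cell)} with a pseudocount."""
    genes = set(hair_cell_counts) | set(surrounding_counts)
    return {
        g: math.log2((hair_cell_counts.get(g, 0) + pseudocount)
                     / (surrounding_counts.get(g, 0) + pseudocount))
        for g in genes
    }


def enriched_genes(hair_cell_counts, surrounding_counts, min_log2=2.0):
    """Return genes whose hair cell enrichment meets the (hypothetical) threshold."""
    scores = log2_enrichment(hair_cell_counts, surrounding_counts)
    return sorted(g for g, score in scores.items() if score >= min_log2)


# Toy counts with placeholder gene labels.
hair = {"geneA": 7, "geneB": 3}
surrounding = {"geneA": 1, "geneB": 3, "geneC": 3}
print(enriched_genes(hair, surrounding))
```

Repeating the comparison at each developmental time point would give the kind of gene-by-time panel the researchers describe.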
Scheffer and Corey deposited their data in a publicly available database that Shen established three years ago. The Shared Harvard Inner Ear Laboratory Database, or SHIELD, holds gene expression data integrated with comprehensive annotation, including potential locations for deafness genes. Scientists from around the world access the data more than 400 times a day.
Some scientists interested in the molecular biology of hearing and deafness can use SHIELD to identify new deafness genes, which may lead to specific gene therapies. Others want to know what makes a hair cell a hair cell, so they can find a way to make surrounding cells in the inner ear turn into hair cells. Hair cells do not normally divide, so once they are lost, the only hope is to somehow induce the remaining ones to divide or to turn neighboring cells into hair cells.
Next, Scheffer will explore microRNA expression in hair cells to see which genes they regulate. That will yield a genetic network of gene expression, messenger RNA and protein production.
“I want to know all the genes that interact with each other and what transcription factors are involved at each step,” she said.
Corey said their work is only the foundation.
“Someday this work will go to the clinic, but first you have to know the parts list.”
This research was supported by National Institutes of Health grants R01-DC000304, R01-DC002281, R03-DC013866 and R01-DC006908; the Frederick and Ines Yeatts Hair Cell Regeneration Grant; P30 DC05209 to the Eaton-Peabody Laboratory of Massachusetts Eye and Ear; and a Hearing Health Foundation Emerging Research Grant.
Pseudogenes, a subclass of long noncoding RNA (lncRNA) that developed from the human genome’s 20,000 protein-coding genes but has lost the ability to produce proteins, have long been considered nothing more than genomic “junk.”
Yet the retention of these 20,000 mysterious remnants during evolution suggests that they may in fact possess biological functions and contribute to the development of disease.
Now, a team led by investigators at Harvard Medical School and the Cancer Center at Beth Israel Deaconess Medical Center has provided some of the first evidence that one of these noncoding “evolutionary relics” actually has a role in causing cancer.
In a new study published in the journal Cell on April 2, the scientists report that, independent of any other mutations, abnormal amounts of the BRAF pseudogene led to the development of an aggressive lymphoma-like disease in a mouse model, a discovery suggesting that pseudogenes may play a primary role in a variety of diseases.
The new discovery also suggests that with the addition of this vast “dark matter” the functional genome could be tremendously larger than previously thought—three or four times its current known size.
“Our mouse model of the BRAF pseudogene developed cancer as rapidly and aggressively as it would if you were to express the protein-coding BRAF oncogene,” explained senior author Pier Paolo Pandolfi, the HMS George C. Reisman Professor of Medicine and co-founder of the Institute for RNA Medicine in the Cancer Center at Beth Israel Deaconess.
“It’s remarkable that this very aggressive phenotype, resembling human diffuse large B-cell lymphoma, was driven by a piece of so-called ‘junk RNA,’” he said.
“In the past, we have found noncoding RNA to be overexpressed, or misexpressed, but because no one knew what to do with this information, it was swept under the carpet. Now we can see that it plays a vital role. We have to study this material, we have to sequence it and we have to take advantage of the tremendous opportunity that it offers for cancer therapy,” Pandolfi said.
Competing endogenous RNAs
The new discovery hinges on the concept of competing endogenous RNAs (ceRNA), a functional capability for pseudogenes first described by Pandolfi almost five years ago when his laboratory discovered that pseudogenes and other noncoding RNAs could act as decoys to divert and sequester tiny pieces of RNA, known as microRNAs, away from their protein-coding counterparts to regulate gene expression.
In this new paper, the authors wanted to determine whether this same ceRNA “cross talk” took place in a living organism and whether it would result in similar consequences.
“We conducted a proof-of-principle experiment using the BRAF pseudogene,” explained first author Florian Karreth, who conducted this work as a postdoctoral fellow in the Pandolfi laboratory.
“We investigated whether this pseudogene exerts critical functions in the context of a whole organism and whether its disruption contributes to the development of disease,” Karreth said.
The investigators focused on the BRAF pseudogene because of its potential ability to regulate the levels of the BRAF protein, a well-known proto-oncogene linked to numerous types of cancer.
In addition, said Karreth, the BRAF pseudogene is known to exist in both humans and mice.
The investigators began by testing the BRAF pseudogene in tissue culture. Their findings demonstrated that when overexpressed, the pseudogene did indeed operate as a microRNA decoy that increased the amounts of the BRAF protein.
This, in turn, stimulated the MAP-kinase signaling cascade, a pathway through which the BRAF protein controls cell proliferation, differentiation and survival and which is commonly found to be hyperactive in cancer.
Aggressive lymphoma-like cancer
When the team went on to create a mouse model in which the BRAF pseudogene was overexpressed, they found that the mice developed an aggressive lymphoma-like cancer.
Similar to their findings in their cell culture experiments, the investigators found that the mice overexpressing the BRAF pseudogene displayed higher levels of the BRAF protein and hyperactivation of the MAP kinase pathway, which suggests that this axis is indeed critical to cancer development.
They confirmed this by inhibiting the MAP kinase pathway with a drug that dramatically reduced the ability of cancer cells to infiltrate the liver in transplantation experiments.
The Pandolfi team further validated the microRNA decoy function of the BRAF pseudogene by creating two additional transgenic mice, one overexpressing the front half of the BRAF pseudogene and the other overexpressing the back half.
Both of these mouse models developed the same lymphoma phenotype as the mice overexpressing the full-length pseudogene, a result that the authors described as “absolutely astonishing.”
The investigators also found that the BRAF pseudogene is overexpressed in human B-cell lymphomas and that the genomic region containing the BRAF pseudogene is commonly amplified in a variety of human cancers. Moreover, the authors said, silencing the BRAF pseudogene in human cancer cell lines that expressed higher levels led to reduced cell proliferation, a finding that highlights the importance of the pseudogene in these cancers and suggests that a therapy that reduces BRAF pseudogene levels may be beneficial in cancer patients.
This work was supported, in part, by the National Institutes of Health (CA170158-01), the Department of Defense Prostate Cancer Research Program, the American Cancer Society, the German National Academy of Sciences Leopoldina, the Italian Association for Cancer Research, the International Association for Cancer Research, Cancer Research UK and the Wellcome Trust.
Adapted from a Beth Israel Deaconess news release.
The need for a better understanding of personal genetics has never been more urgent. That was the message an expert panel of speakers relayed in a Congressional briefing on the intersection of personal genetics and law enforcement.
“There is no time to lose,” said Lauren Tomaselli, director of curriculum and training for the Personal Genetics Education Project (pgEd) at Harvard Medical School, citing a recent appeal to the Supreme Court on a ruling that allows a person’s DNA to be collected and tested without their knowledge or permission. The case was declined by the court.
The mission of pgEd is to educate young people through school programs and to accelerate public awareness of genetics issues by advising the entertainment industry. It also seeks to engage lawmakers, the “eyes and ears of the nation,” in discussions.
pgEd takes no position on policy, preferring to educate from a neutral position so that its audience can make better-informed decisions.
At the briefing, Duana Fullwiley, associate professor of anthropology at Stanford University, said that in some cases genetic technologies used by the U.S. criminal justice system are leapfrogging not just public understanding but also peer-reviewed scientific evaluation.
One case in point: DNA phenotyping, a tool that generates the image of a human face based on genetic samples that have been taken from a crime scene.
Police in Columbia, South Carolina, recently relied on such an image provided by Parabon NanoLabs as they searched for suspects in a double murder.
The science behind this service, called Snapshot, has not been analyzed by people outside the company, Fullwiley said.
She criticized the database on which Snapshot is based for two reasons, saying it skews toward an over-representation of African-Americans and its results offer a false sense of precision.
Focusing on a single type of suspect can implicate a whole group, she said, citing the generic image of a young man with dark hair, eyes and skin.
“When, as a society, we are already dealing with racial bias in policing and civil rights, we have to be very careful about rolling out technologies that can potentially have racial impacts that are disparate for different groups,” she said.
David Kaye, associate dean for research at Penn State Law, said DNA screening in criminal investigations is often racially based because it relies on witness accounts.
He asked, “If you use the information at your disposal, is it truly discriminatory?”
Courts have also allowed involuntary collection of genetic samples, even through subterfuge, he said.
For example, he said, detectives duped a suspect into replying to a letter that offered money from a class-action suit. The suspect’s reply went to a fictitious law office created by the detectives, and DNA was recovered from the paper form. In another instance, he said, a case was built against a serial killer based on DNA retrieved from his daughter’s Pap smear.
The Microbe Question
Claire Fraser, director of the Institute for Genome Sciences at the University of Maryland School of Medicine, explained how microbial DNA might one day be used for forensic purposes. Her past work identified genetic mutations in anthrax spores in the deadly 2001 anthrax mailing. That laid the foundation for the new field of microbial forensics.
“Mother Nature is the best bioterrorist,” she said, using SARS, Ebola and West Nile virus as examples.
The microbes we carry with us, collectively known as our microbiomes, could potentially be used as identifiers, she said, but added that that day is far in the future.
Henry Greely, director of the Center for Law and the Biosciences at Stanford Law School, said he worries about the ethnic disproportion in the database of 11 million records now held by federal and state law enforcement.
“There is a much higher chance for a black American than a white American” to be implicated by a family member’s DNA sample, he said. “That’s troubling.”
While it would be politically difficult, Greely said, he would prefer to see a system in which all Americans would have their samples included in a federal database, making it more representative of the nation. He did concede that privacy could be a problem. If privacy were breached, he said, public trust in law enforcement and in genetics would suffer.
Genetic Privacy Rights
U.S. Rep. Louise Slaughter of New York is a longtime champion of genetic privacy, having sponsored a bill that in 2008 became the Genetic Information Nondiscrimination Act, also known as GINA. She was introduced at the briefing as “the only microbiologist in Congress.”
“GINA was all about privacy,” she said, recalling the battle for its passage. “We wanted to make sure that the social policy kept up with science, but science fiction intervened. Everybody thought we were talking about cloning.”
Protecting genetic information in the workplace and for insurance purposes is still an urgent issue, Slaughter said.
“Your genetics belongs to you and the information is yours,” she said to applause from the audience, which included congressional staffers as well as people from the U.S. Department of Justice, the FBI, the National Institutes of Health, the American Society of Human Genetics and the American Association for the Advancement of Science.
In the discussion that followed the speakers’ presentations, Ting Wu, HMS professor of genetics and a founder of pgEd, asked if somehow racial discrimination could be minimized.
“Obviously it’s a problem,” she said. “We can think of Ferguson and see where that goes.”
Wu, who founded pgEd in 2006, said she feels a deep responsibility to educate people about genetics. She has said it’s not a choice but a necessity.
In an interview after the briefing, George Church, the Robert Winthrop Professor of Genetics at HMS, raised the issue of “DNA exceptionalism,” in which genetic tools are seen as different from other modalities, and not just in jurisprudence.
In medicine, for example, gene therapy is viewed as an extraordinary category of treatment.
Public understanding and scientific advancement are not moving in step, he said.
“We have a long way to go, but that’s because genetics is a moving target,” Church said.
Samantha Schilit, a pgEd affiliate and a graduate student in genetics, said she hopes to pursue personal genetics as a genetic counselor. She attended the briefing as a guest of pgEd after winning a contest in her department to add the most pins to Map-Ed, an online quiz on key concepts and topics in genetics.
“What shocked me is how truly new these topics are,” she said, citing the DNA phenotyping news from South Carolina in February.
Schilit said she is uneasy about the possibility that information gathered by a direct-to-consumer company, for example, could find its way into a forensic investigation, a possibility that was raised by Greely.
“These issues are ethically complicated,” she said. “This field is moving so fast.”
The briefing was the third of five planned by pgEd. The first briefing highlighted the science of genomics, personalized medicine and genetic engineering as well as ways to reach out to the public. The second briefing focused on two topics: the role of genetics research in the unfolding Ebola outbreak in West Africa and the issues addressed by GINA. The third briefing on law enforcement grew out of topics touched on in the first two.
pgEd is supported by the HMS Department of Genetics and private funding from Sigma-Aldrich, Autodesk, Genentech, IDT (targeted specifically for GETed conferences and Map-Ed), and an anonymous donor.
An analysis of genetic and lifestyle data from 10 large epidemiologic studies has confirmed that regular use of aspirin or other nonsteroidal anti-inflammatory drugs (NSAIDs) appears to reduce the risk of colorectal cancer in most individuals.
The study, published in JAMA, also found that a few individuals with rare genetic variants do not share this benefit. Additional questions need to be answered before preventive treatment with these medications can be recommended for anyone, the study authors cautioned.
“Previous studies, including randomized trials, demonstrated that NSAIDs, particularly aspirin, protect against the development of colorectal cancer, but it remains unclear whether an individual’s genetic makeup might influence that benefit,” said Andrew Chan, HMS associate professor of medicine at Massachusetts General Hospital and co-senior author of the JAMA report. “Since these drugs are known to have serious side effects, especially gastrointestinal bleeding, determining whether certain subsets of the population might not benefit is important for our ability to tailor recommendations for individual patients.”
The research team analyzed data from the Colon Cancer Family Registry and from nine studies included in the Genetics and Epidemiology of Colorectal Cancer Consortium, which includes the Nurses’ Health Study, the Health Professionals Follow-up Study and the Women’s Health Initiative. They compared genetic data for 8,624 individuals who developed colorectal cancer with genetic data for 8,553 individuals who did not, matched for factors such as age and gender.
The comprehensive information on lifestyle and general health data provided by participants in the studies again confirmed that regular use of aspirin or NSAIDs was associated with a 30 percent reduction in colorectal cancer risk for most individuals. However, that preventive benefit did not apply to everyone. The study found no risk reduction in participants with relatively uncommon variants in genes on chromosome 12 and chromosome 15.
“Determining whether an individual should adopt this preventive strategy is complicated, and currently the decision needs to balance one’s personal risk for cancer against concerns about internal bleeding and other side effects,” Chan said. “This study suggests that adding information about one’s genetic profile might help in making that decision. However, it is premature to recommend genetic screening to guide clinical care, since our findings need to be validated in other populations. An equally important question that also needs to be investigated is whether there are genetic influences on the likelihood that someone might be harmed by treatment with aspirin and NSAIDs.”
Support for this study includes several grants from the National Cancer Institute and the National Institute of Diabetes and Digestive and Kidney Diseases.
Adapted from a Mass General news release.
Scientists have known for 15 years that mutations in a single gene lead to Rett syndrome, a severe neurological disorder that affects girls around their first birthdays. In the years since the MECP2 gene was pinpointed, researchers have struggled to understand how it functions in the brain in Rett syndrome.
Now the enigma of Rett syndrome and perhaps other disorders on the autism spectrum could be one step closer to being solved.
A Harvard Medical School team has discovered that when MECP2 is mutated in Rett syndrome, the brain loses its ability to regulate genes that are unusually long. The finding suggests a new way to think about reversing the intellectual and physical debilitation this disruption causes: a drug that could potentially target the error. The team, led by Michael Greenberg, reported its findings in Nature.
“The longer the gene, the more disrupted it becomes when you lose MECP2,” said Greenberg, the Nathan Marsh Pusey Professor of Neurobiology at HMS. “Rett syndrome may be a defect in this process of fine-tuning the expression of long genes.”
Scientists, including Greenberg, have figured out over the last 10 years that MECP2 plays a role in sculpting the connections between neurons in the developing brain. These synapses are refined by exposure to sensory experiences, just the sort of stimulation a one-year-old would encounter as she learns to walk and talk.
MECP2 is present in all cells in the body, but when the brain is forming and maturing its synapses in response to sensory input, MECP2 levels in the brain are almost 10 times as high as in other parts of the body. The new study connects MECP2 mutations to long genes, which may be more prone to errors simply because their length leaves more room for mistakes.
“Normally, MECP2 may act like a speed bump, fine-tuning long genes by slowing down the machinery that transcribes long genes,” said Harrison Gabel, a postdoctoral fellow in the Greenberg lab and co-first author of the Nature paper. In transcription, the information in a strand of DNA is copied onto a new molecule of messenger RNA, which is then turned into a protein. “Without MECP2, the machinery may be moving too fast, making too much mRNA from these genes, resulting in problems for the neurons.”
Finding this effect of MECP2 on long genes was no small feat. In a typical search for the mechanism behind a genetic mutation, mice are engineered to lack the normal gene so that its absence reveals how it functions. However, work in many different labs has shown that knocking out MECP2 had only subtle effects when analyzed across the genome. The changes in gene expression were inconsistent, small and, using Gabel’s word, “fuzzy.”
Gabel took another approach, querying massive genomic databases such as ENCODE to ask a simple question: What do genes that are affected by mutated MECP2 have in common?
Answer: They are long. Most of them are at least five times longer than the average gene, with many of them more than 50 times longer than the average. The affected genes were very long across dozens of data sets, giving the researchers a common finding where previous conclusions from these data sets had lacked a common theme.
Gabel and co-first author Benyam Kinde, an MD-PhD student in the Greenberg lab, found the long-gene misregulation in multiple mouse models of Rett syndrome and confirmed it in the brain tissue of deceased Rett patients.
To function normally as a speed bump, MECP2 binds to a form of methylated DNA found in long genes in the brain. Methyl groups are chemical modifiers of gene activity, and in other parts of the body MECP2 binds methylated CG sites on genes. The methylation pattern that appears to be important for MECP2 in regulating long genes is known as methylated CA, and there appears to be a special mechanism operating as synapses are forming.
“It seems that evolution has used MECP2 and methylated CA to put in place this speed bump so that the expression of long genes is restrained in the brain,” Greenberg said. “As far as Rett syndrome, the thought is now that this subtle but widespread overexpression of long genes might be contributing to the disorder.”
The scientists can’t be sure of what these overexpressed long genes do, but many of them appear to be very important to the function of the brain. This suggests that if they could correct the defect in long-gene expression, they might be able to reverse at least some of the symptoms of Rett syndrome. As a first attempt at a corrective strategy, the researchers selected a cancer drug called topotecan because it blocks an enzyme known to be important for long-gene transcription.
In a lab dish, they added topotecan to neurons lacking MECP2. The drug reversed the long-gene misregulation, suggesting that restoring normal long-gene expression might be a way to correct neurological dysfunction in Rett syndrome and in other autism spectrum disorders that involve long genes, such as fragile X syndrome. Topotecan, a chemotherapeutic agent, is too toxic, Greenberg said, but derivatives of topotecan might be a worthwhile avenue to pursue.
“We think this issue of long-gene misregulation may be more generally occurring in other disorders of human cognition,” Greenberg said. “The potential is pretty significant because one now has a common regulatory mechanism to target with drugs.”
This work was supported by grants from the Rett Syndrome Research Trust and the National Institutes of Health (1R01NS048276 and T32GM007753), the Damon Runyon Cancer Research Foundation (DRG-2048-10), the William Randolph Hearst Fund and the Howard Hughes Medical Institute.
Researchers have begun to appreciate the importance of copy number variation when considering the connections between DNA and disease.
Most people have two copies of most genes. But some have only one copy, or three, or none. There have been hints that copy number variation (CNV) might range much more widely than zero to three, but such extremes have been hard to analyze in gene sequencing data.
“For all the excitement about copy number variation in human genetics, most earlier research has been limited to the simplest form of CNV, in which you have either a missing segment or an extra copy of it,” said Steven McCarroll, assistant professor of genetics at Harvard Medical School and director of genetics for the Stanley Center for Psychiatric Research at the Broad Institute of MIT and Harvard.
“Here we came up with a way to analyze extreme forms of CNV,” he said. “Now we can start to use this exuberant form of genetic variation to help illuminate the genetic basis of disease.”
McCarroll and colleagues reported their insights about extreme CNV in Nature Genetics on Jan. 26. Their discoveries were made possible by new computational techniques that first author Bob Handsaker developed to analyze whole-genome sequence data from thousands of genomes at once.
“Before, we had no good way to study genes that have a really high copy number, above four,” said Handsaker, a research scientist in the McCarroll lab. “Now we can find places where people’s gene copy number ranges from zero to 15. It’s the first time we’ve been able to measure this kind of variation with such precision.”
“We’ve found that in hundreds of genes, there’s a wide variation in copy numbers. Now that we can measure these variations accurately, we can ask whether there are health repercussions,” said Handsaker.
The results also enrich the understanding of human genome evolution, said McCarroll.
Once they had developed a way to study extreme CNV, Handsaker, McCarroll and their team made four primary discoveries.
First: About 88 percent of gene copy number variation among humans arises from extreme copy number variants rather than simple copy number variants.
“These extreme copy number variants are a small fraction of all CNVs, but they have broader effects on genes than we anticipated,” said McCarroll.
Second: The more copies of a gene a person has, the more that gene is expressed.
“You might think this was obvious,” said Handsaker, “but in some organisms, such as plants, when you have more copies, most of them are turned off. It turns out that in humans, they’re all turned on in almost all cases.”
Third: With simple CNV, most people have two copies, while a few outliers have one or three or none. McCarroll’s team found that with extreme CNV, most people don’t have two copies but instead have CNVs scattered across a wide range.
“For a lot of these CNVs with these especially exuberant differences, two randomly chosen people are actually more likely to have different numbers of copies than the same number,” said Handsaker.
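Handsaker’s point about two randomly chosen people can be illustrated with a little arithmetic. The Python sketch below is not the team’s software, and the copy number counts in it are made up for illustration; they are not data from the study. It assumes two people are drawn independently from the same population and computes the chance that their copy numbers differ, which is one minus the sum of the squared frequencies of each copy number.

```python
from fractions import Fraction


def prob_different(counts):
    """Chance that two independently drawn people have different copy numbers.

    `counts` maps a copy number to how many people carry it.
    """
    total = sum(counts.values())
    # Chance that both people share the same copy number, summed over all values.
    same = sum(Fraction(n, total) ** 2 for n in counts.values())
    return 1 - same


# Hypothetical populations of 100 people each (illustrative only, not study data).
simple_cnv = {1: 5, 2: 90, 3: 5}  # most people have two copies
extreme_cnv = {2: 10, 4: 15, 5: 20, 6: 20, 7: 15, 9: 10, 12: 10}  # wide spread

print(float(prob_different(simple_cnv)))   # 0.185
print(float(prob_different(extreme_cnv)))  # 0.845
```

In the made-up “simple” population, where nearly everyone has two copies, two people differ less than a fifth of the time. In the made-up “extreme” population, where copy numbers are spread widely, they differ most of the time, which is the pattern Handsaker describes.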
Fourth: Sequences with more copies are more likely to mutate further, expanding in copy number quickly and dramatically.
The team found what they call “runaway duplication haplotypes,” in which some versions of a chromosome have acquired as many as 10 copies of a gene over the past thousand or so generations, while other versions of the same chromosome continue to have just one copy.
“The fast, dramatic expansion in copy number of specific genes appears to have been evolutionarily recent and geographically localized,” said McCarroll.
One gene involved in resistance to trypanosomes—parasites that cause human illnesses including sleeping sickness and Chagas disease—evolved to have a high copy number on a subset of the chromosomes in West African populations. Another gene, related to a gene that contributes to asthma resistance, evolved to have a high copy number in Europe.
“These variations show really unusual patterns in some parts of the world,” said McCarroll. “But it’s too soon to know whether they’re doing something important.”
The team is now offering to the research community “the first data resource on extreme forms of CNV and how they actually vary across a large number of people” as well as a software toolkit to analyze extreme CNV in huge sequencing data sets, McCarroll said.
“Until recently, whole-genome sequencing was quite expensive. Today, that’s changing quickly,” McCarroll added. “This work gives us a sense of the kinds of things it’s going to be possible to see in whole-genome sequences that it wasn’t possible to see before.”
Coauthor Jennifer R. Berman is an employee of Bio-Rad Inc.
This research was supported by National Human Genome Research Institute grant R01 HG006855. Additional funding from NHGRI (U01 HG006510) is supporting follow-on work to develop production-ready software that can be used by any research laboratory.