A couple of years ago, physician-scientist Robert Plenge took a close look at the 10 rheumatoid arthritis risk genes proposed by studies in the last 30 years. The effect of only two of them withstood the test of reproducibility.
As of this month, a new generation of genetic studies by Plenge and his colleagues has more than doubled the short list of confirmed susceptibility genes for rheumatoid arthritis. The list now includes stretches of DNA on chromosomes 2, 6, and 9 with previously unknown roles in the disease.
The sudden wealth of genetic clues in the disease echoes similar windfalls from other recent genomewide association studies. The rate of gene discovery has steeply climbed for more than a dozen complex diseases and common traits, including type 2 diabetes, multiple sclerosis, blood lipids, age-related macular degeneration, Crohn’s disease, height, and more, in a rush of papers by researchers in the Harvard Medical community and elsewhere.
“This is the year of the genomewide association study,” said Plenge, an HMS instructor in medicine at Brigham and Women’s Hospital and research fellow in the lab of David Altshuler at the Broad Institute of MIT and Harvard. Plenge is a first author on a paper in the Nov. 4 online Nature Genetics with the Broad team and on two related studies in the Sept. 6 New England Journal of Medicine with collaborators from Sweden and New York. A validation study from a British group appears in the same issue of Nature Genetics.
For years, efforts to identify genes whose variants contribute to complex diseases had met with stunning failure, said Elizabeth Phimister, genetics editor at NEJM. “For the longest time, geneticists expended an enormous amount of effort with little yield,” she said. “The surge in identifying credible variants is gratifying.”
The new genomewide association studies are more likely to hold up over time, thanks to a convergence of better tools, analytical methods, and independent replication. The rheumatoid arthritis research by Plenge and his colleagues exemplifies many of these features.
In the Nature Genetics paper, for example, the association between disease and a protective allele parked on chromosome 6 between two genes, TNFAIP3 and OLIG3, has a one-in-a-trillion probability of being due to chance, an extreme P value of the kind that has become the norm in these types of studies.
To get there, a team of clinicians and scientists tested 100,000 genome markers in blood samples of 397 people enrolled in the BWH Rheumatoid Arthritis Sequential Study (BRASS), a longitudinal cohort of patients in the hospital-based practice. With fewer than 1,000 people, the study was underpowered, Plenge said. Many of the initial hits were likely to be false positives. He tested the 90 strongest associations in a Swedish population-based rheumatoid arthritis patient dataset and in a North American family-based rheumatoid arthritis dataset. Only one made the final cut.
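A rough calculation shows why scans of this size throw up so many false leads and why the bar for significance sits so high. The sketch below assumes the markers behave as independent tests and uses a Bonferroni-style correction purely for illustration; it is not the method described in the papers, and the function names are hypothetical.

```python
def expected_false_positives(n_markers: int, alpha: float) -> float:
    """Expected number of markers that pass the threshold by chance alone,
    assuming no marker has a real effect and tests are independent."""
    return n_markers * alpha


def bonferroni_threshold(n_markers: int, family_alpha: float = 0.05) -> float:
    """Per-marker P value cutoff that keeps the chance of any false hit
    across the whole scan near family_alpha (illustrative assumption)."""
    return family_alpha / n_markers


n_markers = 100_000  # markers tested in the BRASS scan, per the article

# At the conventional 0.05 cutoff, chance alone yields thousands of hits.
print(expected_false_positives(n_markers, 0.05))

# A corrected per-marker cutoff is several orders of magnitude stricter.
print(bonferroni_threshold(n_markers))

# At one in a trillion, chance hits across the scan are effectively nil.
print(expected_false_positives(n_markers, 1e-12))
```

Real markers are correlated with their neighbors, so the effective number of independent tests is smaller than the raw marker count; the point of the sketch is the order of magnitude, not the exact cutoff.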
Further assurance comes from another Nov. 4 Nature Genetics paper. The study followed up on the rheumatoid arthritis component of the largest gene association study so far, the Wellcome Trust Case Control Consortium of 200 U.K. researchers analyzing 500,000 genetic markers from 17,000 people; the consortium looked for gene associations in seven common diseases, including type 1 and 2 diabetes, hypertension, Crohn’s disease, bipolar disorder, and cardiovascular disease.
“We know some of these results were false-positive associations,” said senior author Jane Worthington, a consortium principal investigator at the University of Manchester. “The vital second step is a validation study, even if the original studies are well powered.” The original study sorted the top gene association results into three tiers by their apparent significance. For the follow-up study, Worthington’s team chose the nine second-tier candidates and tested them against a different U.K. database of rheumatoid arthritis patients. The one unequivocal finding was another marker in the same section of chromosome 6, one that independently increased the risk of disease, in contrast to the protective marker found in Plenge’s study. Altogether, the papers show that at least three different versions of the gene exist: one risky, one protective, and one with no apparent effect.
One NEJM paper shows how the DNA variants contribute to more than one autoimmune disease. In the study, Plenge and his colleagues started with candidate genes identified in a North American family study of rheumatoid arthritis, then applied the genetic association methods to 1,529 cases from a Swedish rheumatoid arthritis dataset and to another group of 1,039 people with systemic lupus erythematosus. A single version of STAT4 raised the risk of rheumatoid arthritis by 60 percent and more than doubled the risk for lupus, compared with no copies of the risky version, they report.
The other NEJM paper describes a new genetic locus associated with the TRAF1 and C5 genes. The results come from a genomewide analysis of more than 300,000 genetic markers in 1,522 cases of more severe disease from the Swedish and North American databases and verification in different cases from the same databases. Both NEJM studies were international collaborations with a Swedish group headed by Lars Klareskog and a New York group led by Peter Gregersen.
Researchers use the word gene to describe their findings, because it is easier to tell the story. But to be more precise, the gene association studies have narrowed the suspects to chromosome blocks that contain a couple of protein-coding genes and other elements likely to influence the timing and amount of gene activity. The specific molecular culprits probably lurk nearby and are not the markers themselves.
Further studies will determine the pivotal sequence variations and their effects in more detail. “These genes are identified by statistics,” said Cynthia Morton, the William Lambert Richardson professor of obstetrics, gynecology and reproductive biology and professor of pathology at HMS and BWH; she is also editor of the American Journal of Human Genetics. “Now we want to understand functionally how they relate to the disease or trait phenotype. Some of the variants are in genes. We’re lucky if they’re in regulatory regions, but many are in places where we don’t know what difference it should make there, such as in introns, the spacings within genes.”
In another twist, the new risk variants have little predictive value for disease in individuals. Rheumatoid arthritis, for example, affects about 1 percent of the population. None of the genes identified by Plenge and others raise the risk above 2 percent. That is well below the 5 percent risk of disease faced by a person whose sibling or parent suffers from rheumatoid arthritis.
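The arithmetic behind that comparison can be sketched in a few lines. The example assumes, as a simplification, that a reported percentage increase acts as a relative risk multiplying the population baseline; the 60 percent figure is the STAT4 increase reported in the NEJM study, and the function name is hypothetical.

```python
def absolute_risk(baseline: float, relative_risk: float) -> float:
    """Absolute risk for a carrier, treating the reported increase as a
    simple multiplier on the population baseline (a simplification)."""
    return baseline * relative_risk


baseline = 0.01       # about 1 percent of the population, per the article
family_risk = 0.05    # risk with an affected sibling or parent, per the article

# A 60 percent increase in risk corresponds to a multiplier of 1.6.
carrier_risk = absolute_risk(baseline, 1.6)

print(f"carrier risk: {carrier_risk:.1%}")
print(f"still below 2 percent: {carrier_risk < 0.02}")
print(f"still below family-history risk: {carrier_risk < family_risk}")
```

Even a variant that raises risk by more than half leaves a carrier far more likely than not to stay healthy, which is why family history remains the stronger predictor for an individual.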
The earliest clinical benefits will likely emerge from additional studies evaluating whether these genes predict a person’s response to medications—either benefits or side effects—and possibly from identifying new drug targets from better understanding of the biology of the disease, Plenge said.
More genes will emerge from higher-powered studies using ever-improving tools, Plenge continued, and as researchers work their way down the list of possibilities in validation studies. He is following up on his Nature Genetics paper with a scan of 650 cases in the growing BRASS registry using nearly 1 million genetic markers and adding a meta-analysis of all available genetic data from other studies.
The foundation of the studies is the International HapMap project and the 3.1 million distinct single-letter variations, or SNPs, it has catalogued so far to track variations among human genomes. The project and its tools allow researchers to test for common gene variations that contribute to common diseases. The end of the second phase was marked by papers in the Oct. 18 Nature by Mark Daly, HMS assistant professor of medicine at Massachusetts General Hospital, and his colleagues; Daly is credited with a critical early observation about the genome’s haplotype structure.
The HapMap is well into producing a denser genome map for phase 3, a level that might reveal the guilty SNPs in validation studies without further sequencing, said HapMap leader David Altshuler, HMS associate professor of genetics and of medicine at MGH, and Broad program director. It will remain a touchstone as scientists move toward full sequencing of the genome. The first efforts toward gathering the full sequences of many individual human genomes will likely start with the original 270 HapMap samples.
The genomewide association studies that feel like such a major achievement this year are merely the first step in using the genetics approach to unravel the complexities of common diseases, Altshuler said. “No one would have looked at these genes without taking an unbiased genomewide approach to find out where the action is,” he said. But now, researchers need to fully sequence those areas, find the functional variations signaled by the markers, figure out how they protect or promote diseases or traits, learn how they might be useful in diagnosis, and explore how to reverse their disease-causing effects.