Writing the Book in DNA
Although George Church’s next book doesn’t hit the shelves until Oct. 2, it has already passed an enviable benchmark: 70 billion copies—roughly triple the sum of the top 100 books of all time.
And they fit on your thumbnail.
That’s because Church and his team encoded the book, Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves, in DNA, which they then read and copied. Church is the Robert Winthrop Professor of Genetics at Harvard Medical School and a founding core faculty member of the Wyss Institute for Biologically Inspired Engineering at Harvard University.
Biology’s databank, DNA has long tantalized researchers with its potential as a storage medium: fantastically dense, stable, energy efficient and proven to work over a timespan of some 3.5 billion years. While not the first project to demonstrate the potential of DNA storage, Church’s team married next-generation sequencing technology with a novel strategy to encode 1,000 times the largest amount of data previously stored in DNA.
The researchers used binary code to preserve the text, images and formatting of the book. While the scale is roughly what a 5¼-inch floppy disk once held, the density of the bits is nearly off the charts: 5.5 petabits, or 5.5 million gigabits, per cubic millimeter. “The information density and scale compare favorably with other experimental storage methods from biology and physics,” said Sri Kosuri, a senior scientist at the Wyss Institute and senior author on the paper. The team also included Yuan Gao, a former Wyss postdoc who is now an associate professor of biomedical engineering at Johns Hopkins University.
And where some experimental media—like quantum holography—require incredibly cold temperatures and tremendous energy, DNA is stable at room temperature. “You can drop it wherever you want, in the desert or your backyard, and it will be there 400,000 years later,” Church said.
Reading and writing in DNA is slower than in other media, however, which makes it better suited for archival storage of massive amounts of data, rather than for quick retrieval or data processing. “Imagine that you had really cheap video recorders everywhere,” Church said. “Just paint walls with video recorders. And for the most part they just record and no one ever goes to them. But if something really good or really bad happens you want to go and scrape the wall and see what you got. So something that’s molecular is so much more energy efficient and compact that you can consider applications that were impossible before.”
About four grams of DNA theoretically could store the digital data humankind creates in one year.
Although other projects have encoded data in the DNA of living bacteria, the Church team used commercial DNA microchips to create standalone DNA. “We purposefully avoided living cells,” Church said. “In an organism, your message is a tiny fraction of the whole cell, so there’s a lot of wasted space. But more importantly, almost as soon as a DNA goes into a cell, if that DNA doesn’t earn its keep, if it isn’t evolutionarily advantageous, the cell will start mutating it, and eventually the cell will completely delete it.”
In another departure, the team rejected so-called “shotgun sequencing,” which reassembles long DNA sequences by identifying overlaps in short strands. Instead, they took their cue from information technology, and encoded the book in 96-bit data blocks, each with a 19-bit address to guide reassembly. Including jpeg images and HTML formatting, the code for the book required 54,898 of these data blocks, each a unique DNA sequence. “We wanted to illustrate how the modern world is really full of zeroes and ones, not As through Zs alone,” Kosuri said.
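The addressing idea described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline. The 96-bit block size and 19-bit address come from the article. Everything else is an assumption made for the example: the function names, the zero-padding of the last block, and the one-bit-per-base mapping (0 to A, 1 to G). Because every strand carries its own address, the strands can be read in any order and still be put back together without looking for overlaps.

```python
import random
from typing import Dict, List

DATA_BITS = 96      # payload bits per block (figure from the article)
ADDRESS_BITS = 19   # address bits per block (figure from the article)

# Hypothetical one-bit-per-base mapping, chosen only for illustration.
BIT_TO_BASE = {"0": "A", "1": "G"}
BASE_TO_BIT = {base: bit for bit, base in BIT_TO_BASE.items()}


def text_to_bits(text: str) -> str:
    """Turn text into a string of '0' and '1' characters (UTF-8, 8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits; expects a whole number of bytes."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode(text: str) -> List[str]:
    """Split text into addressed blocks and write each block as a base sequence."""
    bits = text_to_bits(text)
    strands = []
    for address, start in enumerate(range(0, len(bits), DATA_BITS)):
        if address >= 2 ** ADDRESS_BITS:
            raise ValueError("message too long for a 19-bit address space")
        # Zero-pad the final block so every payload is exactly 96 bits.
        payload = bits[start:start + DATA_BITS].ljust(DATA_BITS, "0")
        word = f"{address:0{ADDRESS_BITS}b}" + payload
        strands.append("".join(BIT_TO_BASE[bit] for bit in word))
    return strands


def decode(strands: List[str], n_bytes: int) -> str:
    """Reassemble strands in address order; n_bytes trims the padding."""
    by_address: Dict[int, str] = {}
    for strand in strands:
        word = "".join(BASE_TO_BIT[base] for base in strand)
        by_address[int(word[:ADDRESS_BITS], 2)] = word[ADDRESS_BITS:]
    bits = "".join(by_address[address] for address in sorted(by_address))
    return bits_to_text(bits[:n_bytes * 8])


if __name__ == "__main__":
    message = "Writing the Book in DNA"
    strands = encode(message)
    random.shuffle(strands)  # strand order is lost, as in a mixed pool of DNA
    print(decode(strands, len(message.encode("utf-8"))))
```

The figures in the article are consistent with this layout. A 19-bit address allows 2^19 = 524,288 distinct blocks, comfortably more than the 54,898 used, and 54,898 blocks of 96 bits come to 5,270,208 bits, or about 659 kilobytes of payload.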
The team discussed including a DNA copy with each print edition of Regenesis. But in the book, Church and his co-author, the science writer Ed Regis, argue for careful supervision of synthetic biology and the policing of its products and tools. Practicing what they preach, the authors decided against a DNA insert—at least until there has been far more discussion of the safety, security and ethics of using DNA this way. “Maybe the next book,” Church said.
This work was supported by the U.S. Office of Naval Research (N000141010144), Agilent Technologies and the Wyss Institute.