Scientists write book with DNA code
Scientists have used DNA to encode the contents of a book. At 53,000 words, and including 11 images and a computer program, it is the largest amount of data yet stored artificially using the genetic material.
The researchers claim that the cost of DNA coding is dropping so quickly that within five to 10 years it could be cheaper to store information using this method than in conventional digital devices.
Deoxyribonucleic acid or DNA – the chemical that stores genetic instructions in almost all known organisms – has an impressive data capacity. One gram can store up to 455bn gigabytes: the contents of more than 100bn DVDs, making it the ultimate in compact storage media.
A team led by Professor George Church of Harvard Medical School has demonstrated that the technology to store data in DNA, while still slow, is becoming more practical. They report in the journal Science that the 5.27 megabit collection of data they stored is more than 600 times bigger than the largest dataset previously encoded this way.
Writing the data to DNA took several days. “This is currently something for archival storage,” explained co-author Dr Sriram Kosuri of Harvard’s Wyss Institute, “but the timing is continually improving.”
DNA has many advantages over traditional digital storage media. It can be easily copied, and is often still readable after thousands of years in non-ideal conditions. And whereas electronic storage formats such as magnetic tape and DVDs are ever-changing, the fundamental techniques required to read and write DNA information are as old as life on Earth.
The researchers, who have filed a provisional patent application covering the idea, used off-the-shelf components to demonstrate their technique.
Digital data is traditionally stored as binary code: ones and zeros. DNA offers the ability to use four “numbers” (A, C, G and T), but to minimise errors Church’s team decided to stick with binary encoding, with A and C both indicating zero, and G and T both representing one.
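To make the mapping concrete, here is a minimal Python sketch of a binary-to-DNA code of this kind. It is an illustration, not the team's software: the function names are hypothetical, the text is assumed to be UTF-8, and the rule for choosing between the two letters available for each bit (avoid repeating the previous letter) is an assumption made here for the example.

```python
ZERO_BASES = "AC"  # either letter stands for a 0 bit
ONE_BASES = "GT"   # either letter stands for a 1 bit


def text_to_bits(text: str) -> str:
    """Turn text into a string of 0s and 1s (UTF-8, 8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_dna(bits: str) -> str:
    """Map each bit to a base: A or C for 0, G or T for 1.

    Assumption for this sketch: take the first candidate letter unless it
    would repeat the previous base, in which case take the second.
    """
    bases = []
    for bit in bits:
        options = ZERO_BASES if bit == "0" else ONE_BASES
        base = options[0]
        if bases and bases[-1] == base:
            base = options[1]
        bases.append(base)
    return "".join(bases)


def dna_to_bits(dna: str) -> str:
    """Read the bits back: A and C decode to 0, G and T decode to 1."""
    return "".join("0" if base in ZERO_BASES else "1" for base in dna)


def bits_to_text(bits: str) -> str:
    """Regroup the bits into bytes and decode them as UTF-8."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    dna = bits_to_dna(text_to_bits("DNA"))
    print(dna)
    print(bits_to_text(dna_to_bits(dna)))
```

Because two letters are available for every bit, the writer has some freedom over the exact sequence, while the reader needs only to know which pair a letter belongs to.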
The sequence of the artificial DNA was built up letter by letter with the string of As, Cs, Ts and Gs coding for the letters of the book.
The team developed a system in which an inkjet printer embeds short fragments of that artificially synthesised DNA onto a glass chip. Each DNA fragment also contains a digital address code that denotes its location within the original file.
The fragments on the chip can later be “read” using standard techniques of the sort used to decipher the sequence of ancient DNA found in archaeological material. A computer can then reassemble the original file in the right order using the address codes.
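The addressing idea can be sketched in a few lines of Python. This is a simplified illustration that works on strings of bits rather than real DNA; the address width, the block size and the function names are hypothetical values chosen for the example, not the figures used by the researchers.

```python
import random
from typing import List

ADDRESS_BITS = 8   # hypothetical width of the address code
BLOCK_BITS = 16    # hypothetical amount of data carried per fragment


def split_into_fragments(bits: str) -> List[str]:
    """Cut the data into blocks and prefix each with its binary address."""
    fragments = []
    for index, start in enumerate(range(0, len(bits), BLOCK_BITS)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("too much data for the address width")
        address = format(index, f"0{ADDRESS_BITS}b")
        fragments.append(address + bits[start:start + BLOCK_BITS])
    return fragments


def reassemble(fragments: List[str]) -> str:
    """Sort fragments by address, strip the addresses and join the data."""
    ordered = sorted(fragments, key=lambda frag: int(frag[:ADDRESS_BITS], 2))
    return "".join(frag[ADDRESS_BITS:] for frag in ordered)


if __name__ == "__main__":
    original = "1100101011110000" * 2 + "10101010"
    fragments = split_into_fragments(original)
    random.shuffle(fragments)  # fragments are read back in no particular order
    print(reassemble(fragments) == original)
```

The point of the address is that the fragments can be read in any order: as long as each one carries its own position, the file can still be put back together.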
The book – an HTML draft of a volume co-authored by the team leader – was written to the DNA with images embedded to demonstrate the storage medium’s versatility.