Scientists write book with DNA code

The Guardian Weekly - Science - Geraint Jones

Scientists have used DNA to encode the contents of a book. At 53,000 words, and including 11 images and a computer program, it is the largest amount of data yet stored artificially using the genetic material.

The researchers claim that the cost of DNA coding is dropping so quickly that within five to 10 years it could be cheaper to store information using this method than in conventional digital devices.

Deoxyribonucleic acid or DNA – the chemical that stores genetic instructions in almost all known organisms – has an impressive data capacity. One gram can store up to 455bn gigabytes: the contents of more than 100bn DVDs, making it the ultimate in compact storage media.

A team led by Professor George Church of Harvard Medical School has demonstrated that the technology to store data in DNA, while still slow, is becoming more practical. They report in the journal Science that the 5.27 megabit collection of data they stored is more than 600 times bigger than the largest dataset previously encoded this way.

Writing the data to DNA took several days. “This is currently something for archival storage,” explained co-author Dr Sriram Kosuri of Harvard’s Wyss Institute, “but the timing is continually improving.”

DNA has many advantages over traditional digital storage media. It can be easily copied, and is often still readable after thousands of years in non-ideal conditions. Unlike ever-changing electronic storage formats such as magnetic tape and DVDs, the fundamental techniques required to read and write DNA information are as old as life on Earth.

The researchers, who have filed a provisional patent application covering the idea, used off-the-shelf components to demonstrate their technique.

Digital data is traditionally stored as binary code: ones and zeros. Although DNA offers the ability to use four “numbers” (A, C, G and T), Church’s team decided to stick with binary encoding to minimise errors, with A and C both indicating zero, and G and T representing one.
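To make the mapping concrete, here is a minimal Python sketch of a one-bit-per-base code in which A or C stands for zero and G or T stands for one. The function names are hypothetical, and the rule used to pick between the two candidate bases (avoid repeating the previous base) is an assumption for illustration; the article does not say how the team made that choice.

```python
ZERO_BASES = "AC"  # either base decodes to bit 0
ONE_BASES = "GT"   # either base decodes to bit 1


def text_to_bits(text: str) -> str:
    """Turn text into a string of '0'/'1' characters (UTF-8, 8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode_bits(bits: str) -> str:
    """Write one base per bit.

    Assumption: of the two bases allowed for a bit, take the first unless it
    would repeat the previous base, in which case take the second.
    """
    bases = []
    for bit in bits:
        options = ZERO_BASES if bit == "0" else ONE_BASES
        if bases and bases[-1] == options[0]:
            bases.append(options[1])
        else:
            bases.append(options[0])
    return "".join(bases)


def decode_bases(dna: str) -> str:
    """Read the bits back: A/C -> 0, G/T -> 1."""
    bits = []
    for base in dna:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(bits)


if __name__ == "__main__":
    dna = encode_bits(text_to_bits("DNA"))
    print(dna)
    print(bits_to_text(decode_bases(dna)))
```

Because two bases are available for every bit, the writer has some freedom over the exact sequence while the reader still recovers the same ones and zeros. The cost is density: each base carries one bit rather than the two it could in principle hold.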

The sequence of the artificial DNA was built up letter by letter with the string of As, Cs, Ts and Gs coding for the letters of the book.

The team developed a system in which an inkjet printer embeds short fragments of that artificially synthesised DNA onto a glass chip. Each DNA fragment also contains a digital address code that denotes its location within the original file.

The fragments on the chip can later be “read” using standard techniques of the sort used to decipher the sequence of ancient DNA found in archaeological material. A computer can then reassemble the original file in the right order using the address codes.
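The addressing idea can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the payload and address sizes (16 and 8 bits) are placeholder assumptions rather than the values the team used. Each fragment carries a fixed-width address in front of its slice of the data, so the fragments can be read in any order and still be put back together.

```python
import random


def split_into_fragments(bits: str, payload_bits: int = 16, address_bits: int = 8) -> list[str]:
    """Cut a bit string into fragments, each prefixed with its binary address."""
    count = (len(bits) + payload_bits - 1) // payload_bits  # ceiling division
    if count > 2 ** address_bits:
        raise ValueError("address field is too narrow for this much data")
    fragments = []
    for index in range(count):
        payload = bits[index * payload_bits:(index + 1) * payload_bits]
        address = format(index, f"0{address_bits}b")
        fragments.append(address + payload)
    return fragments


def reassemble(fragments: list[str], address_bits: int = 8) -> str:
    """Sort fragments by their address prefix and join the payloads."""
    ordered = sorted(fragments, key=lambda frag: int(frag[:address_bits], 2))
    return "".join(frag[address_bits:] for frag in ordered)


if __name__ == "__main__":
    original = "0100010001001110010000010"  # arbitrary example bits
    fragments = split_into_fragments(original)
    random.shuffle(fragments)  # sequencing returns fragments in no particular order
    print(reassemble(fragments) == original)
```

The address costs some capacity in every fragment, but it means no fragment depends on its physical position on the chip.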

The book – an HTML draft of a volume co-authored by the team leader – was written to the DNA with images embedded to demonstrate the storage medium’s versatility.
