Arab Times

Experts work toward DNA data storage

Bid to help companies archive huge amounts of data for decades

NEW YORK, July 24, (AP): Her computer, Karin Strauss says, contains her “digital attic” — a place where she stores that published math paper she wrote in high school, and computer science schoolwork from college.

She’d like to preserve the stuff “as long as I live, at least,” says Strauss, 37. But computers must be replaced every few years, and each time she must copy the information over, “which is a little bit of a headache.”

It would be much better, she says, if she could store it in DNA — the stuff our genes are made of.

Strauss, who works at Microsoft Research in Redmond, Washington, is working to make that sci-fi fantasy a reality.

She and other scientists are not focused on finding ways to stow high school projects or snapshots or other things an average person might accumulate, at least for now. Rather, they aim to help companies and institutions archive huge amounts of data for decades or centuries, at a time when the world is generating digital data faster than it can store it.

To understand her quest, it helps to know how companies, governments and other institutions store data now: For long-term storage it’s typically disks or a specialized kind of tape, wound up in cartridges about three inches on a side and less than an inch thick. A single cartridge containing about half a mile of tape can hold the equivalent of about 46 million books of 200 pages apiece, and three times that much if the data lends itself to being compressed.

A tape cartridge can store data for about 30 years under ideal conditions, says Matt Starr, chief technology officer of Spectra Logic, which sells data-storage devices. But a more practical limit is 10 to 15 years, he says.

It’s not that the data will disappear from the tape. A bigger problem is familiar to anybody who has come across an old eight-track tape or floppy disk and realized he no longer has a machine to play it. Technology moves on, and data can’t be retrieved if the means to read it is no longer available, Starr says.

So for that and other reasons, long-term archiving requires repeatedly copying the data to new technologies.

Sequences

Into this world comes the notion of DNA storage. DNA is by its essence an information-storing molecule; the genes we pass from generation to generation transmit the blueprints for creating the human body. That information is stored in strings of what’s often called the four-letter DNA code. That really refers to sequences of four building blocks (abbreviated as A, C, T and G) found in the DNA molecule. Specific sequences give the body directions for creating particular proteins.

Digital devices, on the other hand, store information in a two-letter code that produces strings of ones and zeroes. A capital “A,” for example, is 01000001.

Converting digital information to DNA involves translating between the two codes. In one lab, for example, a capital A can become ATATG. The idea is once that transformation is made, strings of DNA can be custom-made to carry the new code, and hence the information that code contains.
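To make the translation concrete, here is a minimal sketch that assumes the simplest possible scheme: every pair of bits becomes one DNA letter. The mapping table and the function names are illustrative assumptions, not the method of any lab mentioned here, which is why a capital A comes out differently from the ATATG example above.

```python
# Hypothetical mapping: two bits per DNA letter.
# This is an illustrative assumption, not the scheme used by any lab in the article.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_to_dna(data: bytes) -> str:
    """Translate bytes into DNA letters, two bits at a time."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_from_dna(strand: str) -> bytes:
    """Translate DNA letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# A capital "A" is 01000001, which splits into 01 00 00 01.
print(encode_to_dna(b"A"))  # CAAC
```

Schemes used in practice are more elaborate than this one, but the round trip from bytes to letters and back is the same basic idea.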

One selling point is durability. Scientists can recover and read DNA sequences from fossils of Neanderthals and even older life forms. So as a storage medium, “it could last thousands and thousands of years,” says Luis Ceze of the University of Washington, who works with Microsoft on DNA data storage.

Advocates also stress that DNA crams information into very little space. Almost every cell of your body carries about six feet of it; that adds up to billions of miles in a single person. In terms of information storage, that compactness could mean storing all the publicly accessible data on the internet in a space the size of a shoebox, Ceze says.

In fact, all the digital information in the world might be stored in a load of whitish, powdery DNA that fits in a space the size of a large van, says Nick Goldman of the European Bioinformatics Institute in Hinxton, England.

What’s more, advocates say, DNA storage would avoid the problem of having to repeatedly copy stored information into new formats as the technology for reading it becomes outmoded.

“There’s always going to be someone in the business of making a DNA reader because of the health care applications,” Goldman says. “It’s always something we’re going to want to do quickly and inexpensively.”

Code

Getting the information into DNA takes some doing. Once scientists have converted the digital code into the four-letter DNA code, they have to custom-make DNA. For some recent research Strauss and Ceze worked on, that involved creating about 10 million short strings of DNA.

Twist Bioscience of San Francisco used a machine to create the strings letter by letter, like snapping together Lego pieces to build a tower. The machine can build up to 1.6 million strings at a time.

Each string carried just a fragment of information from a digital file, plus a chemical tag to indicate what file the information came from.

To read a file, scientists use the tags to assemble the relevant strings. A standard lab machine can then reveal the sequence of DNA letters in each string.
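The tag-and-assemble idea can be sketched in a few lines of code. This is only an illustration under stated assumptions: the `Strand` record, the function names and the sample files are hypothetical, and the `index` field that restores the order of fragments is an added assumption, since only the file tag is described here.

```python
import random
from typing import NamedTuple


class Strand(NamedTuple):
    file_tag: str  # stands in for the chemical tag naming the source file
    index: int     # assumed position marker; not described in the article
    payload: str   # one fragment of the file's DNA letters


def split_into_strands(file_tag: str, content: str, size: int = 8) -> list[Strand]:
    """Cut one file's letters into short tagged fragments."""
    return [
        Strand(file_tag, n, content[i:i + size])
        for n, i in enumerate(range(0, len(content), size))
    ]


def read_file(pool: list[Strand], file_tag: str) -> str:
    """Pick out the strands carrying the wanted tag and rejoin them in order."""
    wanted = [s for s in pool if s.file_tag == file_tag]
    return "".join(s.payload for s in sorted(wanted, key=lambda s: s.index))


# Two made-up files share one pool; shuffling mimics strands with no physical order.
pool = split_into_strands("book", "ACGTACGTTTGACCA") + split_into_strands("video", "GGGTTTAAACCC")
random.shuffle(pool)
print(read_file(pool, "book"))  # ACGTACGTTTGACCA
```

Because every fragment carries its own tag, one file can be read back without touching the strands that belong to any other.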

Nobody is talking about replacing hard drives in consumer computers with DNA. For one thing, it takes too long to read the stored information. That’s never going to be accomplished in seconds, says Ewan Birney, who works on DNA storage with Goldman at the bioinformatics institute.

But for valuable material like corporate records in long-term storage, “if it’s worth it, you’ll wait,” says Goldman, who with Birney is talking to investors about setting up a company to offer DNA storage.

Sri Kosuri of the University of California Los Angeles, who has worked on DNA information storage but has now largely moved on to other pursuits, says one challenge for making the technology practical is making it much cheaper.

Scientists custom-build fairly short strings of DNA now for research, but scaling up enough to handle information storage in bulk would require a “mind-boggling” leap in output, Kosuri says. With current technology, that would be hugely expensive, he says.

George Church, a prominent Harvard genetics expert, agrees that cost is a big issue. But “I’m pretty optimistic it can be brought down” dramatically in a decade or less, says Church, who is in the process of starting a company to offer DNA storage methods.

For all the interest in the topic, it’s worth noting that so far the amount of information that researchers have stored in DNA is relatively tiny.

Earlier this month, Microsoft announced that a team including Strauss and Ceze had stored a record 200 megabytes. The information included 100 books (one, fittingly, was “Great Expectations”) along with a brief video and many documents. But it was still less than 5 percent of the capacity of an ordinary DVD.

Yet it’s about nine times the mark reported just last month by Church, who says the announcement shows “how fast the field is moving.”
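Those two comparisons can be checked with quick arithmetic. The sketch below assumes an ordinary DVD is a single-layer disc holding 4.7 gigabytes, a figure that is not given in this story, and the earlier mark it prints is only what “about nine times” implies, not a reported number.

```python
# Figures from the article, plus one assumed constant.
MICROSOFT_RECORD_MB = 200
DVD_CAPACITY_MB = 4700   # assumption: 4.7 GB single-layer DVD
RATIO_TO_PREVIOUS = 9    # "about nine times" the earlier mark

share_of_dvd = MICROSOFT_RECORD_MB / DVD_CAPACITY_MB * 100
previous_mark_mb = MICROSOFT_RECORD_MB / RATIO_TO_PREVIOUS

print(f"Share of a DVD: {share_of_dvd:.1f} percent")              # 4.3 percent
print(f"Implied earlier mark: about {previous_mark_mb:.0f} MB")   # about 22 MB
```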

Meanwhile, people involved with archiving digital data say their field views DNA as a possibility for the future, but not a cure-all.

“It’s a very interesting and promising approach to the storage problem, but the storage problem is really only a very small part of digital preservation,” says Cal Lee, a professor at the University of North Carolina’s School of Information and Library Science.
