Ancestry sites could soon expose nearly anyone’s identity, researchers say
Genetic testing has helped plenty of people gain insight into their ancestry, and some services even help users find their long-lost relatives. But a new study published this week in Science suggests that the information uploaded to these services can be used to figure out your identity, regardless of whether you volunteered your DNA in the first place.
Earlier this year, Sacramento police arrested 72-year-old Joseph James DeAngelo for a wave of rapes and murders allegedly committed by DeAngelo in the 1970s and 1980s. And they claimed to have identified DeAngelo with the help of genealogy databases.
Traditional forensic investigation relies on matching certain snippets of DNA, called short tandem repeats, to a potential suspect. But these snippets only allow police to identify a person or their close relatives in a heavily regulated database. Thanks to new technology, the investigators in the Golden State Killer case isolated the genetic material that’s now collected by consumer genetic testing companies from the suspected killer’s DNA left behind at a crime scene.
This information, coupled with other historical records, such as newspaper obituaries, helped investigators create a family tree of the suspect’s ancestors and other relatives. After zeroing in on potential suspects, including DeAngelo, the investigators collected a fresh DNA sample from DeAngelo—one that matched the crime scene DNA perfectly.
But while the detective work used to uncover DeAngelo’s alleged crimes was certainly clever, some experts in genetic privacy have been worried about the grander implications of this method.
Erlich and his team wanted to see how easy it would be in general to use the method to find someone’s identity by relying on the DNA of distant and possibly unknown family members. So they looked at more than 1.2 million anonymous people who had gotten testing from MyHerit- age, and specifically excluded anyone who had immediate family members also in the database.
They found that more than half of these people had distant relatives — meaning third cousins or further — who could be spotted in their searches. For people of European descent, who made up 75 percent of the sample, the hit rate was closer to 60 percent. And for about 15 percent of the total sample, the authors were also able to find a second cousin.
Much like the Golden State investigators, the team found they could trace back someone’s identity in the database with relative ease by using these distant relatives and other demographic but not overly specific information, such as the target’s age or possible state residence.
The researchers say it will take only about 2 percent of an adult population having their DNA profiled in a database before it becomes theoretically possible to trace any person’s distant relatives — and therefore, to uncover their identity.