Genetic mugshots
A private-vs-public debate may define the future of health care
“ORDER. SPIT. DISCOVER.” That’s a winning tagline for the DNA testing kit 23andMe, a product so popular it has made the top five of Amazon’s best-selling products.
It’s also an approach ushering in a new era of personalised medicine. In theory, having your DNA read from a simple saliva sample could be like gazing into a medical crystal ball. It will list the diseases you are predisposed to, help you prevent them or guide you to the best treatment if you already have them.
For instance, let’s say you’re a woman and your test shows you carry the deadly version of the breast cancer gene BRCA1. You might have a mastectomy, as Angelina Jolie did. Or not. Not every BRCA1 mutation is equally deadly, and it also depends on what other genes you carry.
Big data is required to truly realise the vision of personalised medicine. We need to contribute our genetic information and medical histories to databases, whose daunting complexity researchers try to decode – increasingly with the aid of machine-learning algorithms.
The deal breaker is genetic privacy. Those databases must be hack-proof if we want to prevent unscrupulous insurance companies or employers from lifting a person’s genetic secrets. The current standard is to de-identify genetic and medical information so there are no linked names or other clues that might be cross-referenced to trace identity.
In September 2017 a startling paper in the journal PNAS suggested genetic privacy could no longer be guaranteed. The authors were from Human Longevity Inc. (HLI), led by its founder Craig Venter, who was one of the scientists famous for reading the human genome in 2001.
HLI claimed it deployed a very smart machine-learning algorithm to reconstruct a person’s face, with 80% accuracy, from a tract of their genetic code. Yes, really, a genetic mugshot. Throw in facial recognition software and the ubiquitous Facebook profile and HLI might have dealt a coup de grâce to genetic privacy.
The result was mind-bending, the scientific blowback swift.
Within days, Yaniv Erlich, a Columbia University computational biologist, challenged the predictive power of the HLI algorithm. Former HLI employee Jason Piper, a co-author on the paper, went further by distancing himself from its conclusions on Twitter and accusing Venter of crafting a result aimed at keeping genetic data in private hands.
Piper’s logic: highlighting the potential to put a face to the genes in a public database may lead to the proposition that guaranteed anonymity requires keeping DNA in the shrouded servers of private companies, such as HLI, which stand to gain plenty.
How believable is the science? While perhaps not ready for centre stage, it would appear to be hovering in the wings.
“Can DNA from a scene-of-crime semen sample give a picture of what a rapist looks like? As of now, probably not; as of three years from now, probably yes,” says Bob Williamson, honorary professor at Melbourne’s Murdoch Children’s Research Institute.
The obvious concern is that the rise and rise of machine learning could sound the death knell for genetic data sharing. However, researchers are racing to find a solution, in what amounts to a technological arms race.
US researchers recently demonstrated a technique called “genome cloaking” that can conceal most of the genetic code, revealing only a small subset of interest to researchers. The true game changer, though, could be blockchain encryption. Sydney’s Garvan Institute, for one, has signed tech startup E-nome to secure more than 15,000 patient DNA data sets with the technology.
The stakes are high. In recent months, Australian authorities have published no fewer than three reports about the benefits of precision medicine. The US National Institutes of Health aims to sequence a million genomes by 2020.
But the real action may well lie to our north; China has funded its own precision medicine juggernaut to the tune of US$9 billion. It will team public and private sectors and, on one estimate, sequence 100 million genomes by 2030.
With that vast database, China could nail many of precision medicine’s problems – such as the spectrum of cancer risk from the BRCA1 gene – at least for its own population.
Will China share its intelligence with the rest of us? How will the Chinese program navigate the public-private debate? As it all plays out, the world’s scientists and ethicists will be paying very close attention.