The Post

New AI tool searches genetic haystacks to find disease-causing variants


Scientists have developed a way to sift through millions of differences in a person’s genetic blueprint to detect those that threaten our health, and have tested the new tool on a biomedical database of more than 450,000 people in the United Kingdom, according to a series of papers published yesterday in the journal Science.

The research marks a critical step towards harnessing the full power of the genome for medicine, and it demonstrates a new way artificial intelligence can be applied to problems in human health, experts said.

One problem that has frustrated doctors for years stems from the fact that although we are 99.6% similar at the level of DNA, each of us has an average of 4 million variants – sections of the genetic code where we differ from one another.

‘‘It has been extremely difficult to determine which ones cause disease and which ones don’t,’’ said Kyle Farh, vice-president of artificial intelligence at the San Diego-based biotechnology company Illumina.

Farh and an international team of almost 100 researchers created an algorithm designed to help medicine clear up some of the uncertainty. ‘‘We are aiming to eliminate variants of unknown significance, which is the main barrier to unlocking the value of genomic medicine,’’ Farh said.

Just as ChatGPT can learn how to predict human speech by having engineers feed it a wealth of text, the new algorithm has been trained to make medical predictions based on reading genomes.

The scientists built the algorithm, called PrimateAI-3D, using the genetic blueprints of 233 different primate species. This base brings into sharp relief the variants that can be tolerated by primates, including humans, and those that prove deadly. Scientists look for places where the sequence is the same from one primate to another, a strong sign that a change at that position is likely to be harmful.

‘‘It’s a brilliant idea. As soon as I read the paper, I sent it to my team and said: We’ve got to get on this,’’ said Stephen Kingsmore, president and chief executive of Rady Children’s Institute for Genomic Medicine, a facility based in San Diego that decodes the genomes of 1000 families a year for 90 hospitals across the United States.

Kingsmore said that in about one-quarter of cases, doctors sequence a patient’s genome only to find a variant with an unknown impact on health. ‘‘We’re doing them a great disservice,’’ he said. ‘‘Parents kind of throw up their hands and say: Does the child have a disease or not? and we can only say maybe.’’

Until now, hospitals examining genetic variants in their patients have often consulted a large archive called ClinVar. The new PrimateAI-3D algorithm scans about 70 million genetic variants, a selection that is more than 1000 times as large as ClinVar, Farh said.

The 3D in the name refers to the three-dimensional structure of proteins, a key factor in distinguishing which mutations will wreak havoc. Many diseases are caused by mutations that harm a protein or cause the body to make too much or too little of it.

It remains unclear how much of a difference the algorithm will make in the course of day-to-day medicine, ‘‘but they do show it outperforms anything we have currently’’, said Bruce Gelb, director of the Mindich Child Health and Development Institute at Icahn School of Medicine at Mount Sinai.

Gelb, who was not part of the study team, said he had seen a previous version of the algorithm described in Nature Genetics in 2018. The earlier version was based on six species of non-human primates, as opposed to the 233 primate species in the new version. ‘‘That’s a very large increase, and gives it much more statistical power to find things,’’ Gelb said.

Matthew Lebo, who directs the Laboratory for Molecular Medicine at Mass General Brigham, said that PrimateAI-3D would not eliminate the problem of finding variants of unknown significance, but it would help doctors to prioritise the variants they were investigating for a specific disease.

The new tool should also help pharmaceutical companies in their search for new drugs. Clinical trials often fail because the gene scientists are targeting is ‘‘incorrect, and not relevant to disease’’, Farh said. ‘‘Using AI and genomics to select the right targets should significantly reduce the rate of late-stage clinical trial failures.’’

Illumina said it would make the new tool broadly available in future releases of its software products.

By testing the new algorithm on hundreds of thousands of patient genomes in UK Biobank, ‘‘we found that 97% of the general population carries a rare variant’’ that has some kind of significant effect on health, Farh said. Although the algorithm could not account for the influence of diet and environmental factors, he explained, ‘‘we can basically predict people’s levels of cholesterol and glucose, and hence their risks for cardiovascular disease or diabetes, from the genome by predicting the effects of these variants’’.

EUREKALERT! The PrimateAI-3D algorithm is trained on genomes from 233 primate species, including the Humboldt’s squirrel monkey, like the ones seen here in Mamiraua, Brazil.
Stephen Kingsmore
