Seven technologies to watch in 2024
From protein engineering and 3D printing to detection of deepfake media, here are seven areas of technology that Nature will be watching in the year ahead.

Deep learning for protein design
Two decades ago, David Baker at the University of Washington in Seattle, United States, and his colleagues achieved a landmark feat: they used computational tools to design an entirely new protein from scratch. ‘Top7’ folded as predicted, but it was inert: it performed no meaningful biological functions. Today, de novo protein design has matured into a practical tool for generating made-to-order enzymes and other proteins. “It’s hugely empowering,” says Neil King, a biochemist at the University of Washington who collaborates with Baker’s team to design protein-based vaccines and vehicles for drug delivery. “Things that were impossible a year and a half ago — now you just do it.”
Much of that progress comes down to increasingly massive data sets that link protein sequence to structure. But sophisticated methods of deep learning, a form of artificial intelligence (AI), have also been essential.
‘Sequence-based’ strategies use the large language models (LLMs) that power tools such as the chatbot ChatGPT. By treating protein sequences like documents comprising polypeptide ‘words’, these algorithms can discern the patterns that underlie the architectural playbook of real-world proteins. “They really learn the hidden grammar,” says Noelia Ferruz, a protein biochemist at the Molecular Biology Institute of Barcelona, Spain. In 2022, her team developed an algorithm called ProtGPT2 that consistently comes up with synthetic proteins that fold stably when produced in the laboratory¹. Another tool co-developed by Ferruz, called ZymCTRL, draws on sequence and functional data to design members of naturally occurring enzyme families.
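The idea of treating a protein sequence as a sentence of amino-acid ‘words’ can be conveyed with a deliberately tiny sketch. The snippet below is purely illustrative: a character-level bigram model stands in for vastly larger protein language models such as ProtGPT2, and the training sequences are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Toy training set: invented amino-acid sequences (single-letter codes).
# A real protein language model would train on millions of sequences.
train = [
    "MKTAYIAKQR",
    "MKTAYLAKQK",
    "MKSAYIAKQR",
]

# Count which residue tends to follow which: the crudest possible
# version of learning a sequence's "hidden grammar".
counts = defaultdict(Counter)
for seq in train:
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1

def log_likelihood(seq, smoothing=1.0, alphabet=20):
    """Score how 'grammatical' a sequence looks under the bigram model."""
    ll = 0.0
    for a, b in zip(seq, seq[1:]):
        total = sum(counts[a].values())
        p = (counts[a][b] + smoothing) / (total + smoothing * alphabet)
        ll += math.log(p)
    return ll

# A sequence resembling the training data scores higher than its reversal.
print(log_likelihood("MKTAYIAKQR") > log_likelihood("RQKAIYATKM"))  # prints True
```

Real models capture far longer-range dependencies than adjacent residue pairs, but the principle is the same: learn which continuations are plausible, then score or generate sequences accordingly.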
ChatGPT? Maybe next year
Readers might detect a theme in this year’s technologies to watch: the outsized impact of deep-learning methods. But one such tool did not make the final cut: the much-hyped chatbots powered by artificial intelligence (AI). ChatGPT and its ilk seem poised to become part of many researchers’ daily routines and were feted as part of the 2023 Nature’s 10 round-up (see go.nature.com/3trp7rg). Respondents to a Nature survey in September (see go.nature.com/45232vd) cited ChatGPT as the most useful AI-based tool and were enthusiastic about its potential for coding, literature reviews and administrative tasks.
Such tools are also proving valuable from an equity perspective, helping those for whom English isn’t their first language to refine their prose and thereby ease their paths to publication and career growth. However, many of these applications represent labour-saving gains rather than transformations of the research process. Furthermore, ChatGPT’s persistent issuing of misleading or fabricated responses was the leading concern, cited by more than two-thirds of survey respondents. Although worth monitoring, these tools need time to mature and to establish their broader role in the scientific world.
Sequence-based approaches can build on and adapt existing protein features to form new frameworks, but they’re less effective for the bespoke design of structural elements or features, such as the ability to bind specific targets in a predictable fashion. ‘Structure-based’ approaches are better for this, and 2023 saw notable progress in this type of protein-design algorithm, too. Some of the most sophisticated of these use ‘diffusion’ models, which also underlie image-generating tools such as DALL-E. These algorithms are initially trained to remove computer-generated noise from large numbers of real structures; by learning to discriminate realistic structural elements from noise, they gain the ability to form biologically plausible, user-defined structures.
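The denoise-to-generate loop at the heart of diffusion models can be illustrated with a minimal sketch. Everything below is an assumption-laden toy: the ‘data’ are scalars clustered at one value rather than protein structures, and the learned denoiser is replaced by the closed-form optimal noise predictor for that cluster. The forward noising schedule and the iterative reverse-sampling loop, however, follow the standard denoising-diffusion recipe.

```python
import math
import random

random.seed(0)

# Linear noise schedule: small noise early, more noise later.
T = 50
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alphas = [1.0 - b for b in betas]
alpha_bar, prod = [], 1.0
for a in alphas:
    prod *= a
    alpha_bar.append(prod)

data_mean = 2.0  # the 'training data': everything sits at 2.0

def predict_noise(x, t):
    """Stand-in for a trained denoiser: the optimal noise estimate
    when the clean data is concentrated at data_mean."""
    return (x - math.sqrt(alpha_bar[t]) * data_mean) / math.sqrt(1.0 - alpha_bar[t])

def sample():
    """Reverse process: start from pure noise, denoise step by step."""
    x = random.gauss(0.0, 1.0)
    for t in reversed(range(T)):
        eps = predict_noise(x, t)
        x = (x - betas[t] / math.sqrt(1.0 - alpha_bar[t]) * eps) / math.sqrt(alphas[t])
        if t > 0:
            x += math.sqrt(betas[t]) * random.gauss(0.0, 1.0)
    return x

samples = [sample() for _ in range(200)]
print(sum(samples) / len(samples))  # ≈ 2.0: samples land on the data cluster
```

Structure-based tools replace the scalar with atomic coordinates and the closed-form denoiser with a deep network, but the generative principle, repeatedly removing noise until a plausible sample emerges, is the same.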
The RFdiffusion software developed by Baker’s lab and the Chroma tool by Generate Biomedicines in Somerville, Massachusetts⁴, exploit this strategy to remarkable effect. For example, Baker’s team is using RFdiffusion to engineer novel proteins that can form snug interfaces with targets of interest, yielding designs that “just conform perfectly to the surface,” Baker says. A newer ‘all-atom’ iteration of RFdiffusion⁵ allows designers to computationally shape proteins around non-protein targets such as DNA, small molecules and even metal ions. The resulting versatility opens new horizons for engineered enzymes, transcriptional regulators, functional biomaterials and more.

Deepfake detection
The explosion of publicly available generative-AI algorithms has made it simple to synthesize convincing but entirely artificial images, audio and video. The results can offer amusing distractions, but with multiple ongoing geopolitical conflicts and a US presidential election on the horizon, opportunities for weaponized media manipulation are rife.
Siwei Lyu, a computer scientist at the University at Buffalo in New York, says he’s seen numerous AI-generated ‘deepfake’ images and audio related to the Israel–Hamas conflict, for instance. This is just the latest round in a high-stakes game of cat-and-mouse in which AI users produce deceptive content and Lyu and other media-forensics specialists work to detect and intercept it.
One solution is for generative-AI developers to embed hidden signals in the models’ output, producing watermarks of AI-generated content. Other strategies focus on the content itself. Some manipulated videos, for instance, replace the facial features of one public figure with those of another, and new algorithms can recognize artefacts at the boundaries of the substituted features, says Lyu. The distinctive folds of a person’s outer ear can also reveal mismatches between a face and a head, whereas irregularities in the teeth can reveal edited lip-sync videos in which a person’s mouth was digitally manipulated to say something that the subject didn’t say. AI-generated photos also present a thorny challenge — and a moving target. In 2019, Luisa Verdoliva, a media-forensics specialist at the University Federico II of Naples, Italy, helped to develop FaceForensics++, a tool for spotting faces manipulated by several widely used software packages⁶. But image-forensic methods are subject- and software-specific, and generalization is a challenge. “You cannot have one single universal detector — it’s very difficult,” she says.
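The watermarking idea mentioned above, hidden signals embedded in a model's output, can be sketched at its simplest. The snippet below is an illustrative toy rather than any real provenance scheme: it hides a repeating bit pattern in the least-significant bits of made-up grayscale ‘pixels’ and detects it by counting bit matches.

```python
import random

random.seed(1)

# Toy watermark: a fixed bit pattern, repeated across the image.
WATERMARK = [1, 0, 1, 1, 0, 0, 1, 0]

def embed(pixels):
    """Overwrite each pixel's lowest bit with the repeating watermark.
    The visual change is at most 1 intensity level per pixel."""
    return [(p & ~1) | WATERMARK[i % len(WATERMARK)]
            for i, p in enumerate(pixels)]

def detect(pixels):
    """Fraction of pixels whose lowest bit matches the watermark."""
    hits = sum((p & 1) == WATERMARK[i % len(WATERMARK)]
               for i, p in enumerate(pixels))
    return hits / len(pixels)

generated = [random.randrange(256) for _ in range(1000)]  # fake model output
marked = embed(generated)

print(detect(marked))     # prints 1.0: watermark present
print(detect(generated))  # near 0.5: chance level on unmarked pixels
```

Real schemes are far more robust, spreading the signal so that it survives compression, cropping and re-encoding, but the detection logic is the same in spirit: a statistical test for a pattern that honest media would not contain.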
And then there’s the challenge of implementation. The US Defense Advanced Research Projects Agency’s Semantic Forensics (Semafor) programme has developed a useful toolbox for deepfake analysis but, as reported in Nature (see Nature 621, 676–679; 2023), major social-media sites are not routinely employing it. Broadening access to such tools could help to fuel uptake, and to this end Lyu’s team has developed the DeepFake-o-meter, a centralized public repository of algorithms that can analyse video content from different angles to sniff out deepfake content. Such resources will be helpful, but the battle against AI-generated misinformation is likely to persist for years to come.