Popular Mechanics (USA)
How America’s Top Supercomputer Can Beat COVID-19 for Good
Are 200 quadrillion operations per second enough to beat the virus?
Marti Head, Ph.D., had a bad feeling. It was mid-February of 2020. She’d just returned to her home in Tennessee from a work trip. Somewhere, probably in an airport, she’d picked up what she thought was just a cold.
Sure, physically she felt crappy. But the bad feeling, which she described as “itchy,” came from the news coming out of China. A career spent working with infectious diseases had given Head all the info she needed on what, exactly, the novel coronavirus might be capable of.
And so, when her nose started running and her throat got scratchy, Head quarantined herself and her husband. Instead of watching trashy TV in bed to recover, she tucked into her quarantine cocoon of a home office—with tissues and tea at hand—and started hunting.
Head is a drug hunter. A computational chemist by training, Head uses complex computer simulations to search for molecules that can gum up the gears of a virus hell-bent on infecting human cells. She focuses on therapeutics—the things doctors rely on to treat disease. Head spent decades at a major drug company searching for drugs that would combat diseases, including viruses like HIV. But in February of 2020 she was working at the Oak Ridge National Laboratory, in Oak Ridge, Tennessee. Moving to the public sector meant Head had an obligation to find something, anything, that might serve the public good in this time of crisis. It also meant that she had access to one of the most powerful supercomputers in the world.
While the most visible war against COVID-19 was being fought in ICUs and emergency rooms across the country, another line of defense was assembling in Tennessee. The nation’s brightest computer scientists were pressing Summit, a 200-petaflop supercomputer, into service as a Covid-fighting machine. The only problem? They had to keep the machine—which demands constant monitoring from a crew of on-site technicians—running, despite a global pandemic.
SOMEONE HAS TO do it, Bronson Messer, Ph.D., told himself in January of 2020 as he decided to return to his old job at Oak Ridge. Messer is a computational astrophysicist, who prefers to spend his days noodling out questions like, Where did all the uranium in the universe come from? From 2010 to 2011, Messer had pushed aside his own research while he served as the Director of Science at the Oak Ridge Leadership Computing Facility. It’s a job that required, in large part, helping other researchers achieve their data goals on Summit at the expense of his own work. Messer says he missed the research life and stepped away. But in late 2019 the job opened again. Messer knew how to do it and allowed himself to be pulled back in. What he did not know, though, was just how chaotic—and high-stakes—this role would become in two months.
On March 22, 2020, President Trump announced the formation of the COVID-19 High Performance Computing Consortium, an effort to pair researchers working on Covid-related solutions with time on 16 of the nation’s supercomputers, including Summit. Within days, Messer had on his desk a pile of projects—like finding how Covid attacks the body, and searches for drugs that might save lives. These researchers needed time on Summit. It was up to Messer to make sure the brightest minds and the worthiest projects rose to the top of the pile. By April, three to four of his workdays per week were spent just allocating time to researchers requesting a turn on the machine. See you later, supernovas, he thought.
As proposals crashed into Messer’s inbox, Paul Abston was trying to figure out just how, if the pandemic got as bad as predicted, he would keep Summit running. Abston is the group leader for infrastructure and operations at the Oak Ridge Leadership Computing Facility. Keeping the lights on and the computer working is his responsibility. With the formation of the High Performance Computing Consortium, Summit was designated as critical infrastructure—right alongside America’s power grids and water pipes—and effectively ordered to stay online. Every employee who worked on Summit was now essential. Come a power outage, or water leak, or Covid outbreak in the facility, Abston was going to have to find a way to keep it humming.
Humming is not hyperbole. Supercomputers are more like a high-powered telescope than a laptop. Summit is a collection of 9,468 CPUs (central processing units, the processing systems your home computer runs on) and 27,756 GPUs (graphics processing units, what your gaming systems run on). They’re stored in refrigerator-sized cabinets, lined up in rows like recruits at boot camp, standing shoulder-to-shoulder and ready to take orders. Inside each cabinet are 18 nodes, or drawers. Each node contains two CPUs and six GPUs. One hundred eighty-five miles of high-speed cable connects all those CPUs and GPUs. Pipes jut in and out of the ceiling, bringing water to cool down the cabinets, which draw up to 13 megawatts of power—enough to run more than 10,000 homes. Stepping into the building that houses Summit is an auditory experience, like standing next to the ocean.
Calculations can be pushed through Summit remotely. But Summit is a machine, and things break. At least weekly there’s a communication issue or a storage failure where someone’s work doesn’t get saved, Abston says. And those are just software problems. The 4,000 gallons of water running through the room to cool the machine could become a nightmare if a pipe sprang a leak. So, too, could a cyber attack. Or an attack on the power grid. There was a lot that Abston was tasked with protecting—and a bevy of workers he’d need to do it. At least some would have to be on site, which meant Abston needed to do everything in his power to stop a Covid outbreak before it could even begin.
First he reviewed exactly how few people he could get away with having in the building at a time. Then he reviewed his employees’ workstations. If someone had tasks that had to be performed in a tight space, he’d try to move them to shifts where they worked alone. Then he considered testing. Thankfully, Oak Ridge National Laboratory got its own testing facility up and running at the beginning of the pandemic.
And it would be needed. By April, Tennessee was seeing hundreds of new cases every day. By fall, they were in the thousands. Over the winter, the situation was dire, peaking at more than 10,000 new cases and more than 100 deaths daily. Still: Abston kept the lights on. He juggled schedules as colleagues needed to quarantine after they were exposed by a spouse or child. Sometimes he just came in and filled the gap. But, whatever happened, Abston simply could not allow an outbreak of Covid among his staff to stop Summit’s steady march toward progress.
THERE’S A STORY that Ray Smith likes to tell about how 60,000 acres of Appalachian farmland became a secret hub for American science. In 1939, Albert Einstein wrote to President Roosevelt warning of fission chain reactions utilizing uranium that could likely produce large amounts of power, and his belief that Germany was pursuing the research. “They were worried that they were going to build a bomb,” says Smith, historian for the City of Oak Ridge.
Roosevelt knew America needed to act. Smith says that Roosevelt went to Senator Kenneth McKellar, then head of the Senate Appropriations Committee. “He said, ‘Senator, I need to put a large amount of money against the war effort. And I can’t let the press or anyone know how much it is or what it’s being used for. Can you help me with that?’”
The good senator from Tennessee responded that he could help with that—and where in Tennessee was it going to go?
By 1943, Clinton Engineer Works, which would later become the Oak Ridge National Laboratory, was up and running under the Manhattan Project to produce weapons-grade plutonium. Scientists from around the country were soon reporting for duty in a town which was too new to exist on maps.
After the war, Oak Ridge continued to be a hub for science. In recent years, it’s become known for hosting America’s most powerful supercomputers. Traditionally, we’ve talked about supercomputers mostly by how fast they can do calculations. That term is a “FLOP,” or a floating point operation, says Jeff Nichols, the associate laboratory director for computing and computational sciences at Oak Ridge. A floating point operation is just an addition or a multiplication, and when we rate supercomputers, we add up how many operations they can do per second. A million per second, that’s a megaflop. A billion is a gigaflop, and a trillion is a teraflop.
Summit is a 200-petaflop machine, meaning it can do 200 quadrillion operations per second. But back around 2009, supercomputer builders were stymied by how to continue to expand FLOP capacity without making these machines into monster energy guzzlers. The supercomputer then at Oak Ridge, named Jaguar, drew up to about 8.2 megawatts of power, says Nichols. “We knew that if we were going to double the computing, we were going to double the power, and we couldn’t do that anymore,” he says.
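The prefixes Nichols describes are simple powers of ten, and a few lines of Python make the scale concrete. This is an illustrative sketch only: the `PREFIXES` table and the `ops_per_second` helper are names invented here, and the only figure taken from the story is Summit's 200-petaflop rating.

```python
# Metric prefixes used when rating supercomputers, as powers of ten.
PREFIXES = {
    "mega": 10**6,   # a million
    "giga": 10**9,   # a billion
    "tera": 10**12,  # a trillion
    "peta": 10**15,  # a quadrillion
}


def ops_per_second(value, prefix):
    """Convert a rating such as 200 'peta' flops into operations per second."""
    return value * PREFIXES[prefix]


# Summit's rating, written out as a plain number of operations per second.
summit = ops_per_second(200, "peta")
print(f"Summit: {summit:,} operations per second")

# How many one-teraflop machines would it take to match that rating?
print(summit // ops_per_second(1, "tera"))
```

Written out, 200 petaflops is a 2 followed by 17 zeros, or the combined rating of 200,000 one-teraflop machines.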
Looking for a solution, supercomputer designers wondered if they could use gaming processors to boost their energy efficiency. GPUs can be 10 times as powerful as CPUs, says Nichols. The problem, however, is that they were not as accurate. If Superman’s foot doesn’t quite hit the edge of the building when he jumps, our imaginations can close that gap. If a supercomputer misses a calculation when doing crucial drug research, that supercomputer is useless.
Nichols says that the team building Summit approached NVIDIA, a Santa Clara, California–based GPU manufacturer, and asked if it could build a GPU with the accuracy of a CPU. By altering the type of silicon used in the chip, NVIDIA was able to pull it off: They created a GPU that was both efficient with power and accurate with its calculations. The first supercomputer at Oak Ridge to be built with GPUs was named Titan. It was 10 times more powerful than Jaguar. In 2017, Titan was replaced by Summit, which was, again, 10 times more powerful than its predecessor.
Of course, power is good, but it’s not the only thing that matters. What researchers like Head and Dan Jacobson, Ph.D., really need is a smart supercomputer. Artificial intelligence—Summit’s biggest advantage over Titan—allows supercomputer users to build a model, and then tell the machine to look for patterns that might be like that model. Without this machine learning, you can only send a computer off to look for exact matches. That doesn’t help when you’re seeking molecules that may dock up with a virus. If there’s no exact match, your search will come up empty, when really, something that might have been close enough to work was overlooked. And machine learning allows researchers to be extremely specific in what results they do and don’t want returned to them. If the computer isn’t giving you what you want, you can teach it to do better.
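The gap between an exact-match search and a "close enough" search can be shown with a toy example. The sketch below is not machine learning and is not the method used on Summit; it simply scores candidates by Tanimoto similarity (shared features divided by total features) against a target profile. The feature labels, candidate names, and the 0.5 threshold are all invented for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical target profile and candidate library.
target = {"ring", "amide", "hydroxyl", "small"}
library = {
    "candidate_A": {"ring", "amide", "hydroxyl", "small"},
    "candidate_B": {"ring", "amide", "hydroxyl", "medium"},
    "candidate_C": {"chain", "sulfate", "large"},
}

# Exact matching only finds candidates identical to the target.
exact = [name for name, feats in library.items() if feats == target]

# Similarity scoring also keeps candidates that are close to the target.
threshold = 0.5
close = sorted(
    (name for name, feats in library.items() if tanimoto(feats, target) >= threshold),
    key=lambda name: -tanimoto(library[name], target),
)

print("exact matches:", exact)
print("close enough: ", close)
```

Here the exact search returns only `candidate_A`, while the similarity search also keeps `candidate_B`, which differs from the target in a single feature.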
Thanks to a special type of processor core, called a tensor core, Summit became both extremely fast and a quick study when it came to machine learning. Tensor cores allow computers to group and compare related data to identify connections and see how they interact. A normal core knocks out operations as they come, but a tensor core can also compare that operation to another that it’s been told is related.
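In hardware terms, a tensor core is usually described as a unit that multiplies two small matrices and adds a third in a single step, taking low-precision inputs and keeping the running sums at higher precision. The pure-Python sketch below emulates that idea on a 4-by-4 tile, with half-precision inputs and single-precision sums. The tile size, the precisions, and every function name here are assumptions made for illustration and do not come from the article; real tensor cores do this in silicon, not in a loop.

```python
import random
import struct


def to_half(x):
    """Round a Python float to the nearest half-precision (16-bit) value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to the nearest single-precision (32-bit) value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def tile_mma(a, b, c):
    """Emulate d = a x b + c: half-precision inputs, single-precision sums."""
    n = len(a)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_single(c[i][j])
            for k in range(n):
                # Each input is rounded to half precision before multiplying;
                # the running sum is rounded to single precision after each add.
                acc = to_single(acc + to_half(a[i][k]) * to_half(b[k][j]))
            d[i][j] = acc
    return d


def exact_mma(a, b, c):
    """The same operation in ordinary double precision, for comparison."""
    n = len(a)
    return [
        [c[i][j] + sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def make_tile(n=4):
    """Build an n-by-n tile of random values between -1 and 1."""
    return [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]


random.seed(0)
a, b, c = make_tile(), make_tile(), make_tile()
d = tile_mma(a, b, c)
ref = exact_mma(a, b, c)
max_error = max(abs(d[i][j] - ref[i][j]) for i in range(4) for j in range(4))
print("largest difference from double precision:", max_error)
```

The printed difference shows the trade being made: the low-precision inputs cost a little accuracy on each tile in exchange for much cheaper arithmetic.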
A RANGE OF scientists from all over the country applied and got time on Summit for Covid-related projects. But perhaps two of the most important queries on the computer attacked the virus from opposite ends of the scientific spectrum. One wanted to know how Covid attacked the body, so we could better understand the disease. The other wanted to discover how we could stop the virus in its tracks.
The lab’s own Jacobson was charged with writing the code that would get answers on exactly why Covid was behaving in ways doctors had never seen before. Jacobson is a computational biologist. His work is specifically in systems biology, which involves deciphering the interconnected complexity of living organisms at the cellular level—whether that’s in plants destined for biofuels or in the human brain, unwinding the causes of various neuropsychiatric conditions like Alzheimer’s and autism.
Jacobson was watching the pandemic well before the rest of us. Through another project, he had contacts working in the Beijing embassy when the first cases in Wuhan were reported. He instantly understood the trouble mankind might be in. “There were a few of those ruh roh moments, where we said, ‘Yeah, this could go really quite poorly,’” Jacobson says.
Jacobson looks for patterns in data that reveal what exactly is happening in the molecular relationships within and between cells. At first, there wasn’t much data to work with. But then, as so many scientists across the globe put their other research on hold to work on Covid-related projects, it was like a firehose, and Jacobson wanted all of it. He approaches biology holistically, using huge amounts of data from all types of inquiries to look for patterns and interesting interactions between systems. When it came to Covid, he hoarded everything: gene expression information, immune system information, physiology data, genetics data, protein structural data, electronic health records, environmental data, microbiome data, and autopsy data. The goal was to look for patterns that changed when people became infected, were sick, and then recovered from Covid. Looking at everything all at once “allows us to find things that often are missed otherwise. If you’re just looking at one thing at a time, you’re taking a very traditional approach,” he says. And you may find that one thing you’re looking for, but “you’ll overlook important things because you’re focused very narrowly.”
Marti Head wanted her turn, too. Before joining the Oak Ridge National Laboratory, Head spent part of her two decades at pharmaceutical giant GlaxoSmithKline hunting for drugs that would attack bacteria. Fighting Covid was going to be markedly harder. “Bacteria are alive, so you can kill them. They fight back, but you can kill them,” she says. “Viruses aren’t really alive, and it’s much harder to kill something that’s not really alive.”
Instead of going for the kill, Head’s drug-hunting hopes rested on finding molecules that could, essentially, throw a wrench in how the virus worked. In one case, she and her colleagues started looking at the main protease, an enzyme that essentially cuts the protein chain found in a cell infected with Covid into little tiny protein bits that then go off and do the virus’s bidding. Head needed a molecule that was exactly the right size and right shape to dock with a small groove they’d identified on the main protease. Step one was writing an algorithm that would essentially search for molecules that could possibly be the right size and shape to dock with the virus.
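A first-pass screen of that kind can be pictured as a simple filter over a library of candidates. The sketch below is a deliberately crude stand-in, not Head's algorithm: it reduces each molecule to two invented dimensions and keeps those that fit inside a groove without being much smaller than it. The `Molecule` class, the groove measurements, the `min_fill` cutoff, and the candidate values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    name: str
    length: float  # longest dimension, in angstroms (invented value)
    width: float   # widest cross-section, in angstroms (invented value)


def fits_groove(mol, groove_length, groove_width, min_fill=0.7):
    """True if the molecule fits inside the groove and fills at least min_fill of it."""
    return (
        min_fill * groove_length <= mol.length <= groove_length
        and min_fill * groove_width <= mol.width <= groove_width
    )


# Hypothetical groove size and candidate library.
GROOVE_LENGTH, GROOVE_WIDTH = 12.0, 6.0
library = [
    Molecule("mol_1", 11.0, 5.5),  # snug fit
    Molecule("mol_2", 15.0, 5.0),  # too long for the groove
    Molecule("mol_3", 6.0, 3.0),   # fits, but far too small to fill it
    Molecule("mol_4", 9.0, 4.5),   # smaller, still within the cutoff
]

shortlist = [m.name for m in library if fits_groove(m, GROOVE_LENGTH, GROOVE_WIDTH)]
print(shortlist)
```

With these made-up numbers the filter keeps `mol_1` and `mol_4`. A real screen works from three-dimensional structures and chemistry rather than two numbers, which is where a machine like Summit earns its keep.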
But it’s not enough just for the two parts to fit, says Head. “Proteins are not just sitting there waiting for us in a static way to do something. They’re constantly moving as part of what they are, and so we need to understand those motions.”
A SUPERCOMPUTER IS only as super as the people writing code for it. A misconception, says Messer, is that you log onto Summit and can simply click on programs that help you run your query. For the vast majority of calculations on Summit, someone has to write all the algorithms. Usually, that someone is actually a group of someones. The researcher writes some of the code, but Messer adds that the graduate students doing code development are the lifeblood of Summit.
What makes writing code for these projects hard is that there’s rarely a single answer you’re seeking. An if-then algorithm won’t work, because you don’t want just one answer. “When I run an astrophysics code, there’s no answer at the end,” says Messer. Instead, he watches as a stream of data is produced that might point him toward possible answers. “And then I have to climb inside all the data that are generated to be able to infer some scientific insight,” says Messer.
To crack exactly why Covid was making so many people so sick, Jacobson was going to have to crawl inside a whole mess of data, too.
Jacobson started at the beginning, focusing on how the virus hooks onto cells. This he already knew: Covid goes after the ACE2 protein, which isn’t a typical receptor for a virus to latch onto. When he began looking at data from other coronaviruses—like the ones that cause the common cold—he realized that many of them target proteins in the renin-angiotensin system (RAS) as entry points into cells. The RAS is partially responsible for regulating blood pressure and fluid and electrolyte balance. Jacobson figured he’d start there.
Covid previously had seemed like purely a respiratory disease. So targeting the RAS was a little unexpected. His next step was to use Summit to evaluate gene expression in lung tissue samples from infected and uninfected patients. Summit went searching, plowing through 2.5 billion calculations. The analysis coughed up a trove of data on exactly how genes are normally regulated and how those regulatory patterns were dramatically altered by SARS-CoV-2 infection.
And then: “I had that eureka moment. Not many times in my career can I go back to a discovery where there was a single eureka moment,” says Jacobson. But it was right there in the data: Covid was causing a massive dysregulation in the RAS.
Back to Summit Jacobson went. Because of the computer’s massive computational abilities, Jacobson was able to see changes in many cellular functions, ranging from inflammatory and permeability responses, to hyaluronic acid synthesis and degradation, to electrolyte balance and coagulation, that connected in some way to the RAS. From that resulting data set, it became clear that something strange was happening at the intersection between the RAS and the kallikrein-kinin (bradykinin) system, which both play roles in inflammatory responses. “We then dived into the clinical literature of what happens when you dysregulate those systems,” he says. “You look at those predictive symptoms in different parts of the body and, wow, they match up really well with what’s going on in Covid-19.”
This research helped reframe Covid as being as much a vascular disease as a respiratory one. Dysregulation of the bradykinin system can cause blood vessels to essentially leak—which could explain why doctors were seeing patients with so much fluid in their lungs. Thanks to Summit and Jacobson’s research, and that of similar groups, clinicians began thinking about whether vitamin D, a known regulator for the RAS, might help some patients. While just going outside and standing in the sunshine certainly won’t prevent Covid, there is evidence that it could reduce the severity of infections.
Likewise, the bradykinin hypothesis brought icatibant, a drug that acts as a bradykinin B2 receptor antagonist, into clinical trials. Though these drugs are not a cure-all for Covid, the bradykinin hypothesis is helping doctors understand what they’re seeing.
While Jacobson was discovering what was causing severe disease, Head was working the other side of the equation, hunting for a drug to beat back that severity.
Drug hunting takes a lot of patience. While Head has numerous patents and has taken several molecules fairly far in the drug testing process, she had yet to find a molecule that got to market as an efficacious drug. So much can go wrong in the development process: Maybe the molecule only docks with the protein in the lab. Or maybe it works when injected into mice, but won’t survive the acid of a stomach when swallowed in capsule form.
“We need it to be that one-in-a-million,” she says, describing the odds of finding a molecule that does it all.
Thanks to Summit, Head has a lead on that one-in-a-million. It’s called MCULE-5948770040, and it both binds and inhibits the main protease. In late March, she published a pre-print paper on her team’s finding. That research is currently undergoing peer review. New variants, meanwhile, have made her work even more important. So far, vaccines appear to be effective against the new variants, but should that change, therapeutics will again become a most precious tool in the fight against Covid. Highlighting the importance of the development of effective Covid drugs, in June, the Biden administration announced $3 billion in funding for drug development projects like Head’s.
But Head is thinking well beyond the variants, too. What she’s truly hoping to build is code that’s a starting point for fighting the next pandemic. Because there will be a next pandemic. “We want those platforms ready to go, so we can respond quickly to the next Zika, Ebola, influenza, and coronavirus,” she says. “When, heaven help us, SARS-CoV-3 comes along, as long as we have the will to stay invested and vigilant, we will have the data, the platforms, and the people around the globe who are going to respond.”