Bangkok Post

Big data won’t save you from coronavirus

David Fickling is a Bloomberg Opinion columnist.

How often do you see a piece of economic or financial information revised upward by 45%? And how reliable would you regard a data set that’s subject to such adjustments? This is the problem confronting epidemiologists trying to make sense of the novel coronavirus spreading from China’s Hubei province. On Thursday, the tally there surged by 45%, or 14,480 cases. The revision was largely due to health authorities adding patients diagnosed on the basis of lung scans to a previous count, which was mostly limited to those whose swab tests came back positive.

The medical data emerging from hospitals and clinics around the world is invaluable in determining how this outbreak will evolve, but the picture painted by the information is changing almost as fast as the disease itself, and isn’t always of impeccable provenance. Just as novel infections exploit weaknesses in the body’s immune defences, epidemics have an unnerving habit of spotting the vulnerabilities of the data-driven society we’ve built for ourselves.

That’s not a comforting thought. We live in an era where everything seems quantifiable, from our daily movements to our internet search habits and even our heartbeats. At a time when people are scared and seeking certainty, it’s alarming that the knowledge we have on this most important issue is at best an approximate guide to what’s happening.

“It’s so easy these days to capture data on anything, but to make meaning of it is not easy at all,” said John Carlin, a professor at the University of Melbourne specialising in medical statistics and epidemiology. “There’s genuinely a lot of uncertainty, but that’s not what people want to know. They want to know it’s under control.”

That’s most visible in the contradictory information we’re seeing around how many people have been infected, and what share of them have died. While those figures are essential for getting a handle on the situation, as we’ve argued, they’re subject to errors in sampling and measurement that are compounded in high-pressure, strained circumstances. The physical capacity to do timely testing and diagnosis can’t be taken for granted either, as my colleague Max Nisen has written.

Early case fatality rates for Severe Acute Respiratory Syndrome were often 40% or higher before settling down to figures in the region of 15% or less. The age of patients, whether they get sick in the community or in a hospital, and doctors’ capacity and experience in offering treatment can all affect those numbers dramatically.

Even the way that coronavirus cases are defined and counted has changed several times, said Professor Raina MacIntyre, head of the University of New South Wales’s Biosecurity Research Programme: From “pneumonia of unknown cause” in the early days, through laboratory-confirmed cases once a virus was identified, to the current standard that includes lung scans. That’s a common phenomenon during outbreaks, she said.

Those problems are exacerbated by the fact that China’s government has already shown itself willing to suppress medical information for political reasons. While you’d hope the seriousness of the situation would have changed that instinct, the fact casts a shadow of doubt over everything we know.

How should the world respond amid this fog of uncertainty?

While every piece of information is subject to revision and the usual statistical rule of garbage-in, garbage-out, epidemiologists have ways to make better sense of what is going on.

Well-established statistical techniques can be used to clean up messy data. A study this week by Imperial College London used screening of passengers flying to Japan and Germany to estimate that the fatality rate for all cases was about 1%, below the 2.7% of confirmed ones found in Hubei province but higher than the 0.5% seen for the rest of the world.

When studies from different researchers using varying techniques start to converge toward common conclusions, that’s also a strong if not faultless indication that we’re on the right track. The number of new infections caused by each coronavirus case has now been estimated at around 2.2 or 2.3 by several separate studies, for instance, although that number itself can change as people quarantine themselves and self-segregate to prevent infection.

The troubling truth, though, is that in a society that expects to know everything, this most crucial piece of knowledge is still uncertain.

Google can track my every move and tell me where I ate lunch last week, but viruses don’t carry phones. The facts about this disease are hidden in the activity of billions of nanometer-scale particles, spreading through the cells of tens of thousands of humans and the environments we traverse. Big data can barely scratch the surface of solving that problem.
