The Guardian Australia

Covid: how Excel may have caused loss of 16,000 test results in England

- Alex Hern, UK technology editor

A million-row limit on Microsoft’s Excel spreadsheet software may have led to Public Health England misplacing nearly 16,000 Covid test results, it is understood.

The data error, which led to 15,841 positive tests being left off the official daily figures, means that 50,000 potentially infectious people may have been missed by contact tracers and not told to self-isolate.

PHE was responsible for collating the test results from public and private labs, and publishing the daily updates on case count and tests performed.

But the rapid development of the testing programme has meant that much of the work is still done manually, with individual labs sending PHE spreadsheets containing their results. Although the system has improved from the early days of the pandemic, when some of the work was performed with phone calls, pens and paper, it is still far from automated.

In this case, the Guardian understands, one lab had sent its daily test report to PHE in the form of a CSV file – the simplest possible database format, just a list of values separated by commas. That report was then loaded into Microsoft Excel, and the new tests at the bottom were added to the main database.

But while CSV files can be any size, Microsoft Excel files can only be 1,048,576 rows long – or, in older versions which PHE may have still been using, a mere 65,536. When a CSV file longer than that is opened, the bottom rows get cut off and are no longer displayed. That means that, once the lab had performed more than a million tests, it was only a matter of time before its reports failed to be read by PHE.
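The failure described above can be caught before a file is ever opened in a spreadsheet. The following is a minimal sketch in Python, not a description of PHE's actual process: the function names and the sample report are hypothetical, and only the two row limits come from the figures quoted above.

```python
import csv
import io

XLSX_MAX_ROWS = 1_048_576  # row limit of a modern Excel worksheet
XLS_MAX_ROWS = 65_536      # row limit of the older Excel format


def count_csv_rows(stream):
    """Count the records in a CSV stream without loading them all into memory."""
    return sum(1 for _ in csv.reader(stream))


def rows_lost(row_count, limit=XLSX_MAX_ROWS):
    """Return how many rows would fall beyond the given worksheet limit."""
    return max(0, row_count - limit)


# Hypothetical lab report: one header row plus 70,000 results.
report = io.StringIO(
    "id,result\n" + "".join(f"{i},negative\n" for i in range(70_000))
)
total = count_csv_rows(report)
lost = rows_lost(total, XLS_MAX_ROWS)
print(f"{total} rows in file, {lost} beyond the older Excel limit")
```

A check of this kind would let a pipeline reject or split an oversized report rather than import a silently truncated one.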

Microsoft’s spreadsheet software is one of the world’s most popular business tools, but it is regularly implicated in errors which can be costly, or even dangerous, because of the ease with which it can be used in situations it was not designed for.

In 2013, an Excel error at JPMorgan masked the loss of almost $6bn (£4.6bn), after a cell mistakenly divided by the sum of two interest rates, rather than the average. The news led James Kwak, a professor of law at the University of Connecticut, to warn that Excel is “incredibly fragile”.

“There is no way to trace where your data comes from, there’s no audit trail (so you can overtype numbers and not know it), and there’s no easy way to test spreadsheets, for starters. The biggest problem is that anyone can create Excel spreadsheets – badly. Because it’s so easy to use, the creation of even important spreadsheets is not restricted to people who understand programming and do it in a methodical, well-documented way,” Kwak wrote.

Errors from the spreadsheet software have even changed the very foundations of human genetics. The names of 27 genes have been changed over the past year by the Human Gene Nomenclature Committee, after Microsoft’s program continually misformatted them. The genes SEPT1 and MARCH1, for instance, have been changed to SEPTIN1 and MARCHF1 after they were repeatedly turned into dates, while symbols that were common words have been altered so that grammar tools didn’t autocorrect them: WARS is now WARS1, for instance.
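To illustrate why the old symbols were at risk and the new ones are not, here is a rough Python heuristic that flags text made of a month name followed by a day number. It is an assumption-laden sketch: the pattern and the function name are invented for illustration and do not reproduce Excel's actual date-detection rules.

```python
import re

# Month-like prefixes that a spreadsheet might read as the start of a date.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "MAY", "JUN",
    "JUL", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)
DATE_LIKE = re.compile(
    r"^(%s)-?(\d{1,2})$" % "|".join(MONTH_PREFIXES), re.IGNORECASE
)


def looks_like_date(symbol):
    """Return True if the symbol is a month name followed by a plausible day."""
    match = DATE_LIKE.match(symbol)
    return bool(match) and 1 <= int(match.group(2)) <= 31


for symbol in ["SEPT1", "MARCH1", "SEPTIN1", "MARCHF1", "WARS1"]:
    print(symbol, looks_like_date(symbol))
```

Under this heuristic the old symbols SEPT1 and MARCH1 are flagged, while the renamed SEPTIN1 and MARCHF1 are not, because the extra letters break the month-plus-number pattern.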
