The Guardian

Web shutdown was caused by single user

Alex Hern, Technology editor

An internet blackout that knocked out some of the world’s biggest websites on Tuesday was ultimately caused by a single customer innocently updating their settings, the infrastructure provider Fastly has revealed.

A bug in Fastly’s code introduced in mid-May had lain dormant until Tuesday morning, according to Nick Rockwell, the company’s head of engineering and infrastructure.

When the unidentified customer updated their settings, it triggered the flaw, which ultimately took down 85% of the company’s network.

“On 12 May, we began a software deployment that introduced a bug that could be triggered by a specific customer configuration under specific circumstances,” Rockwell said. “Early 8 June, a customer pushed a valid configuration change that included the specific circumstances that triggered the bug, which caused 85% of our network to return errors.

“We detected the disruption within one minute, then identified and isolated the cause, and disabled the configuration. Within 49 minutes, 95% of our network was operating as normal.”

Rockwell added: “Even though there were specific conditions that triggered this outage, we should have anticipated it. We provide mission-critical services, and we treat any action that can cause service issues with the utmost sensitivity and priority. We apologise to our customers and those who rely on them for the outage and sincerely thank the community for its support.”

The content delivery network (CDN) operated by Fastly is one of the largest on the internet, along with similar networks operated by Akamai, Cloudflare and Amazon’s CloudFront. All of them operate on the same principle: that the internet is faster and more stable if users can connect to servers physically close to them, optimised for handling lots of traffic.

In normal times, doing so not only cuts loading times but also allows the CDN operators, with their expertise in running internet infrastructure, to take on the burden of handling security threats, unexpected traffic spikes and high bandwidth bills.

But the outage highlighted the risks associated with concentrating critical internet infrastructure in the hands of just a few companies.

Counterintuitively, the outage and recovery led to a 12% rise in Fastly’s share price over the course of Tuesday. The increase may have been because the company had demonstrated an effective incident response plan, or simply because the outage had served to make investors more aware of the scale of Fastly’s business and the size of its customer base.

The effects will not have been quite so rosy for Fastly’s customers. At Amazon alone, for instance, the outage could have lost the company $32m (£23m) in sales, according to a calculation by the digital marketing agency Reboot. “Although it seems they weren’t down for long, the impact it would have had will be huge, especially on e-commerce sites,” said Naomi Aharony, the agency’s managing director.

“With our research estimating Amazon could have potentially lost $6,803 every second it was down, it’s clear an investigation will want to be made to find out what happened.”

Few Fastly customers were able to switch over to a backup system in time to recover from the outage, in part because doing so is typically considered riskier than simply waiting for the provider to fix problems.

For instance, according to public documents, the gov.uk site has a backup contract with Amazon to provide CDN services, but switching over to it requires manual intervention.

