THE WAYBACK MACHINE
WHEN WEB PAGES CAN CHANGE WITH A CLICK, THIS COMPANY SAFEGUARDS OUR DIGITAL HISTORY
Mark Graham fears that valuable pieces of history are being wiped out before our eyes.
As the director of the Wayback Machine, a website that records how individual web pages have changed over time, he’s acutely aware of how important it is to keep a record of what’s being posted — and where.
He has seen changes ranging from benign spelling corrections to edits to government websites to the dismantling of media outlets by dictators.
“If we want future generations to have the opportunity to learn from history, then it is imperative that history be available to them,” Graham says. “Over the past few decades, almost all of human communication has been digital. And while that has afforded a dramatic increase in the volume and the frequency, it has also brought with it a fragility.”
The Wayback Machine, which stores 525 billion web pages, does exactly what its name suggests. It’s a time machine for the web. Without it, entire web pages documenting our collective history can be wiped out with a single click.
“Most societies place importance on preserving artifacts of their culture and heritage,” the company says. “Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form.”
The Wayback Machine aims to preserve those artifacts and create an internet library for researchers.
More recently, its parent company, the Internet Archive, has been thrust into the spotlight for archiving Parler, a social network that has been taken offline and has been accused of hosting posts used to plan the insurrection at the U.S. Capitol on Jan. 6. Amazon Web Services terminated Parler’s web hosting contract for failing to adequately moderate violent content. Parler’s record on the Internet Archive helped digital sleuths track down users who had shared photos of the riots and insurrection.
But while the digital trail from the riots has certainly been important for investigators hunting for links to U.S. white supremacists, Graham argues it is critical to archive as much of the web as possible.
Companies and individuals have become less concerned with hosting their own services and are instead using third parties such as Microsoft, Amazon, Google or IBM. Domain registrations are also provided by private companies like GoDaddy and Bluehost. If they terminate contracts, whether willingly or not, those websites could simply vanish.
More pertinent is the lack of transparency over how pages are edited, changed or deleted. The average life expectancy of a web page is about 100 days before it is changed or deleted, Graham says.
As Joe Biden was inaugurated as U.S. president, for example, U.S. government websites received a total overhaul, with no public change log. The White House archives its website material, but links expire, making it difficult to navigate the additional sources it uses.
Graham thinks it is reasonable for web pages to experience minor alterations to correct a mistake. “But what about a material erasure like when there was a failed coup in Turkey where 150 media organizations were taken down by the government?”
In the 1930s, Kremlin propagandists would carefully erase discredited leaders from historic photos in which they were pictured standing alongside Josef Stalin. These days, modifying records is far easier.
Internet Archive founder Brewster Kahle began archiving web pages in 1996. The archive now holds more than 28 million books and texts, six million films and videos and 600,000 software programs.
In 2001, it launched the Wayback Machine (named for the time machine Mr. Peabody the dog and his boy Sherman travel with in segments from the Rocky & Bullwinkle cartoons of the 1960s), which Graham took over as director in 2015. It is funded by a mixture of government and private grants. Funders include Alexa Internet, a web traffic analysis company owned by Amazon, as well as the U.S. Library of Congress.
The public can upload material, but the Internet Archive has its own web crawlers that work to preserve as much as possible. The fragility of the web has spurred an industry of digital accountability. Web-tracking companies like Wachete and Visualping send email alerts to customers letting them know if any element of a web page has changed.
This can be useful for a job hunter monitoring a company’s recruitment page for new roles, or for someone tracking Uber’s stock price. Journalists and investigators often use such tools to track edits or U-turns.
YouTube’s removal of several Donald Trump videos and Twitter’s wiping of his account have sparked a debate similar to the one over whether it is right to remove offensive figures from history books or town squares.
It has become common for verified account holders on Twitter to delete their tweets, something that once seemed unorthodox, almost like an admission of guilt. But now, keeping tweets up after clarifying a point or apologizing after finding new information is controversial.
In a similar way, people who wish to remove unflattering pieces about themselves from the web can request that Google remove pages under a law in Europe that grants people the right to be forgotten. Publicity companies offer services cleaning up clients’ digital presence and editing Wikipedia pages.
As efforts to scrub the web of uncomfortable truths become more intense, the Internet Archive will become more important than ever. “We take it for granted that digital material is there and it’s not going to go away,” Graham says. “But the reality is anything but.”