San Francisco Chronicle

Digital archive turns 20

Nonprofit organization tracks, preserves history of the Internet that reflects our changing culture

By Benny Evangelista

When the Internet Archive was created 20 years ago, few envisioned how a small galaxy of about 500,000 websites would evolve into the center of human communication and culture.

Now, the nonprofit San Francisco organization, which celebrated the milestone with a party Wednesday night, curates a vast digital archive that includes more than 370 million websites and 273 billion pages, many captured before they disappeared forever.

It’s more than an archive of Internet sites. The organization, founded by computer scientist and entrepreneur Brewster Kahle, now has a virtual storehouse ranging from digitally converted books and historic film to funny memes and audio recordings of Grateful Dead concerts.

Future scholars will be able to search through an archive of news talk shows and political advertising to better understand the twists and turns of this year’s presidential election season.

“When Brewster started this, a lot of people thought he was crazy or irrelevant,” said Rick Prelinger, a film archivist and associate professor of film and digital media at UC Santa Cruz.

“First off, who thought the Web was anything that needed to be saved back in 1996, ’97 or ’98?” he said. “It was just screens you looked at. I don’t think anybody anticipated that our culture would move online so rapidly. He did see that. He’s got a good instinct for that kind of thing.”

About 600 people turned out for the party in the Internet Archive’s neoclassic, Greek-columned home, the former Christian Science church on Funston Avenue in the Richmond District. Guests included early tech entrepreneur Marc Canter, co-founder of what would become Macromedia, early Apple employee Dan Kottke, and Washington journalist Kathy Kiely.

The crowd included past and present Internet Archive employees, and others who volunteered their time or money to help the organization over the years.

Kahle’s goal was to create the digital version of the Great Library of Alexandria, the lost repository of the ancient world. He believed that preserving the particularly ephemeral World Wide Web, as it was then called, would be key for future historians to be able to understand the contexts of this time.

“The Web is a sharing extravaganza of people trusting each other with who they are, and making it public,” Kahle said. “We wanted to make that permanent.”

About the same time, he co-founded Alexa Internet, a Web research and information company that also derived its name from that ancient library. Amazon bought Alexa in 1999 in a deal worth about $250 million.

Kahle “had a reasonable vision of the future and a path to getting there,” said Ronna Tanenbaum, Alexa’s former head of design. “He was trying to protect, preserve and create universal access to knowledge.”

The Internet Archive has survived through community donations and by working with about 1,000 libraries around the world that pay the group to help digitize books and other material. But the site itself remains free.

“It is an organization that gives things away,” Kahle said. “Who does that? The interesting thing is that ‘free’ works so well on the Web.”

The archive is best known for the Wayback Machine, which uses computer algorithms to crawl the Web and constantly save snapshots of sites. Online visitors use it to compare how websites, like SFGate.com, have changed over the years. It’s also a repository of Internet firms that went belly-up long ago, like Pets.com.

The site’s 3 million to 4 million visitors a day show that “people want old stuff, they want to remember,” Kahle said.

Last week, the archive released an easier way to search the Wayback Machine, which has also helped repair 1 million broken citation links on Wikipedia.

In another section is an archive of about 3 million hours of TV news broadcasts. That includes a searchable database of political ads captured during the current election season, which political scientists 100 years from now might use to figure out what we were thinking during this “craziest, most disruptive election since the Civil War,” said Roger Macdonald, the TV archive’s director.

Or future scholars might find something else in the archive more indicative of our moment in time.

“The most important value of the archive in the future inherently can’t be anticipated now,” Macdonald said.

Kahle said the archive has digitized about 2.5 million books, although it’s still a long way from its goal of 10 million books by 2020, and from the size of the Library of Congress’ collection. But archive visitors can now search through books it has digitized, a feat not possible until recently.

“Today, our whole world is the Internet and digital material,” said San Francisco historian Woody LaBounty. “And they’re taking on this ridiculous, massive job trying to capture just a small part of it and saving it for posterity. It’s a daunting task.”

Photos by Carlos Avila Gonzalez / The Chronicle
Alexis Rossi chats with John Hauser of Community Media Archive outside the Internet Archive during the 20th anniversary event.
Guests assess their scorecards at the celebration, which offered a gift for visiting several of the exhibits and checking them off.
Above: Larry Dieterich (left) checks out a machine that digitizes books, demonstrated by Tim Bigelow, at the 20th anniversary celebration. Left: Brewster Kahle, founder of the Internet Archive, enjoys an old video of himself.
