The Guardian (USA)

Can we stop AI outsmarting humanity?

- Mara Hvistendahl

It began three and a half billion years ago in a pool of muck, when a molecule made a copy of itself and so became the ultimate ancestor of all earthly life. It began four million years ago, when brain volumes began climbing rapidly in the hominid line.

Fifty thousand years ago with the rise of Homo sapiens sapiens.

Ten thousand years ago with the invention of civilization.

Five hundred years ago with the invention of the printing press.

Fifty years ago with the invention of the computer.

In less than thirty years, it will end.

Jaan Tallinn stumbled across these words in 2007, in an online essay called Staring into the Singularity. The “it” was human civilisation. Humanity would cease to exist, predicted the essay’s author, with the emergence of superintelligence, or AI that surpasses human-level intelligence in a broad array of areas.

Tallinn, an Estonia-born computer programmer, has a background in physics and a propensity to approach life like one big programming problem. In 2003, he co-founded Skype, developing the backend for the app. He cashed in his shares after eBay bought it two years later, and now he was casting about for something to do. Staring into the Singularity mashed up computer code, quantum physics and Calvin and Hobbes quotes. He was hooked.

Tallinn soon discovered that the author, Eliezer Yudkowsky, a self-taught theorist, had written more than 1,000 essays and blogposts, many of them devoted to superintelligence. He wrote a program to scrape Yudkowsky’s writings from the internet, order them chronologically and format them for his iPhone. Then he spent the better part of a year reading them.

The term artificial intelligence, or the simulation of intelligence in computers or machines, was coined back in 1956, only a decade after the creation of the first electronic digital computers. Hope for the field was initially high, but by the 1970s, when early predictions did not pan out, an “AI winter” set in. When Tallinn found Yudkowsky’s essays, AI was undergoing a renaissance. Scientists were developing AIs that excelled in specific areas, such as winning at chess, cleaning the kitchen floor and recognising human speech. Such “narrow” AIs, as they are called, have superhuman capabilities, but only in their specific areas of dominance. A chess-playing AI cannot clean the floor or take you from point A to point B. Superintelligent AI, Tallinn came to believe, will combine a wide range of skills in one entity. More darkly, it might also use data generated by smartphone-toting humans to excel at social manipulation.

Reading Yudkowsky’s articles, Tallinn became convinced that superintelligence could lead to an explosion or breakout of AI that could threaten human existence – that ultrasmart AIs will take our place on the evolutionary ladder and dominate us the way we now dominate apes. Or, worse yet, exterminate us.

After finishing the last of the essays, Tallinn shot off an email to Yudkowsky – all lowercase, as is his style. “i’m jaan, one of the founding engineers of skype,” he wrote. Eventually he got to the point: “i do agree that ... preparing for the event of general AI surpassing human intelligence is one of the top tasks for humanity.” He wanted to help.

When Tallinn flew to the Bay Area for other meetings a week later, he met Yudkowsky, who lived nearby, at a cafe in Millbrae, California. Their get-together stretched to four hours. “He actually, genuinely understood the underlying concepts and the details,” Yudkowsky told me recently. “This is very rare.” Afterward, Tallinn wrote a check for $5,000 (£3,700) to the Singularity Institute for Artificial Intelligence, the nonprofit where Yudkowsky was a research fellow. (The organisation changed its name to Machine Intelligence Research Institute, or Miri, in 2013.) Tallinn has since given the institute more than $600,000.

The encounter with Yudkowsky brought Tallinn purpose, sending him on a mission to save us from our own creations. He embarked on a life of travel, giving talks around the world on the threat posed by superintelligence. Mostly, though, he began funding research into methods that might give humanity a way out: so-called friendly AI. That doesn’t mean a machine or agent is particularly skilled at chatting about the weather, or that it remembers the names of your kids – although superintelligent AI might be able to do both of those things. It doesn’t mean it is motivated by altruism or love. A common fallacy is assuming that AI has human urges and values. “Friendly” means something much more fundamental: that the machines of tomorrow will not wipe us out in their quest to attain their goals.

* * *

Last spring, I joined Tallinn for a meal in the dining hall of Cambridge University’s Jesus College. The churchlike space is bedecked with stained-glass windows, gold moulding, and oil paintings of men in wigs. Tallinn sat at a heavy mahogany table, wearing the casual garb of Silicon Valley: black jeans, T-shirt and canvas sneakers. A vaulted timber ceiling extended high above his shock of grey-blond hair.

At 47, Tallinn is in some ways your textbook tech entrepreneur. He thinks that thanks to advances in science (and provided AI doesn’t destroy us), he will live for “many, many years”. When out clubbing with researchers, he outlasts even the young graduate students. His concern about superintelligence is common among his cohort. PayPal co-founder Peter Thiel’s foundation has given $1.6m to Miri and, in 2015, Tesla founder Elon Musk donated $10m to the Future of Life Institute, a technology safety organisation in Cambridge, Massachusetts. But Tallinn’s entrance to this rarefied world came behind the iron curtain in the 1980s, when a classmate’s father with a government job gave a few bright kids access to mainframe computers. After Estonia became independent, he founded a video-game company. Today, Tallinn still lives in its capital city – also called Tallinn – with his wife and the youngest of his six kids. When he wants to meet with researchers, he often just flies them to the Baltic region.

His giving strategy is methodical, like almost everything else he does. He spreads his money among 11 organisations, each working on different approaches to AI safety, in the hope that one might stick. In 2012, he co-founded the Cambridge Centre for the Study of Existential Risk (CSER) with an initial outlay of close to $200,000.

Existential risks – or X-risks, as Tallinn calls them – are threats to humanity’s survival. In addition to AI, the 20-odd researchers at CSER study climate change, nuclear war and bioweapons. But, to Tallinn, those other disciplines “are really just gateway drugs”. Concern about more widely accepted threats, such as climate change, might draw people in. The horror of superintelligent machines taking over the world, he hopes, will convince them to stay. He was visiting Cambridge for a conference because he wants the academic community to take AI safety more seriously.

At Jesus College, our dining companions were a random assortment of conference-goers, including a woman from Hong Kong who was studying robotics and a British man who graduated from Cambridge in the 1960s. The older man asked everybody at the table where they attended university. (Tallinn’s answer, Estonia’s University of Tartu, did not impress him.) He then tried to steer the conversation toward the news. Tallinn looked at him blankly. “I am not interested in near-term risks,” he said.

Tallinn changed the topic to the threat of superintelligence. When not talking to other programmers, he defaults to metaphors, and he ran through his suite of them: advanced AI can dispose of us as swiftly as humans chop down trees. Superintelligence is to us what we are to gorillas.

An AI would need a body to take over, the older man said. Without some kind of physical casing, how could it possibly gain physical control?

Tallinn had another metaphor ready: “Put me in a basement with an internet connection, and I could do a lot of damage,” he said. Then he took a bite of risotto.

* * *

Every AI, whether it’s a Roomba or one of its potential world-dominating descendants, is driven by outcomes. Programmers assign these goals, along with a series of rules on how to pursue them. Advanced AI wouldn’t necessarily need to be given the goal of world domination in order to achieve it – it could just be accidental. And the history of computer programming is rife with small errors that sparked catastrophes. In 2010, for example, when a trader with the mutual-fund company Waddell & Reed sold thousands of futures contracts, the firm’s software left out a key variable from the algorithm that helped execute the trade. The result was the trillion-dollar US “flash crash”.

The researchers Tallinn funds believe that if the reward structure of a superhuman AI is not properly programmed, even benign objectives could have insidious ends. One well-known example, laid out by the Oxford University philosopher Nick Bostrom in his book Superintelligence, is a fictional agent directed to make as many paperclips as possible. The AI might decide that the atoms in human bodies would be better put to use as raw material.
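The logic of that thought experiment fits in a few lines of code. The following is a deliberately crude, hypothetical Python sketch (not code from Bostrom or from any of the researchers Tallinn funds): because the objective counts nothing but paperclips, the agent treats every other resource, including the ones we care about, as raw material.

```python
# Toy illustration (hypothetical; not any real system): an agent whose
# objective counts only paperclips has no reason to preserve anything else.

world = {"iron ore": 1000, "factories": 50, "forests": 200, "people": 100}
paperclips = 0

def reward(clips):
    # The objective the programmers wrote: more paperclips is always better.
    # Note what is missing: no penalty for anything consumed along the way.
    return clips

while any(amount > 0 for amount in world.values()):
    # Greedy policy: convert whichever remaining resource is largest.
    resource = max(world, key=world.get)
    paperclips += world[resource] * 10   # every unit of matter becomes clips
    world[resource] = 0

print(f"reward = {reward(paperclips)}, world left over: {world}")
```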

Tallinn’s views have their share of detractors, even among the community of people concerned with AI safety. Some object that it is too early to worry about restricting superintelligent AI when we don’t yet understand it. Others say that focusing on rogue technological actors diverts attention from the most urgent problems facing the field, like the fact that the majority of algorithms are designed by white men, or based on data biased toward them. “We’re in danger of building a world that we don’t want to live in if we don’t address those challenges in the near term,” said Terah Lyons, executive director of the Partnership on AI, a technology industry consortium focused on AI safety and other issues. (Several of the institutes Tallinn backs are members.) But, she added, some of the near-term challenges facing researchers, such as weeding out algorithmic bias, are precursors to ones that humanity might see with superintelligent AI.

Tallinn isn’t so convinced. He counters that superintelligent AI brings unique threats. Ultimately, he hopes that the AI community might follow the lead of the anti-nuclear movement in the 1940s. In the wake of the bombings of Hiroshima and Nagasaki, scientists banded together to try to limit further nuclear testing. “The Manhattan Project scientists could have said: ‘Look, we are doing innovation here, and innovation is always good, so let’s just plunge ahead,’” he told me.

“But they were more responsible than that.”

* * *

Tallinn warns that any approach to AI safety will be hard to get right. If an AI is sufficiently smart, it might have a better understanding of the constraints than its creators do. Imagine, he said, “waking up in a prison built by a bunch of blind five-year-olds.” That is what it might be like for a superintelligent AI that is confined by humans.

The theorist Yudkowsky found evidence this might be true when, starting in 2002, he conducted chat sessions in which he played the role of an AI enclosed in a box, while a rotation of other people played the gatekeeper tasked with keeping the AI in. Three out of five times, Yudkowsky – a mere mortal – says he convinced the gatekeeper to release him. His experiments have not discouraged researchers from trying to design a better box, however.

The researchers that Tallinn funds are pursuing a broad variety of strategies, from the practical to the seemingly far-fetched. Some theorise about boxing AI, either physically, by building an actual structure to contain it, or by programming in limits to what it can do. Others are trying to teach AI to adhere to human values. A few are working on a last-ditch off-switch. One researcher who is delving into all three is mathematician and philosopher Stuart Armstrong at Oxford University’s Future of Humanity Institute, which Tallinn calls “the most interesting place in the universe.” (Tallinn has given FHI more than $310,000.)

Armstrong is one of the few researchers in the world who focuses full-time on AI safety. When I met him for coffee in Oxford, he wore an unbuttoned rugby shirt and had the look of someone who spends his life behind a screen, with a pale face framed by a mess of sandy hair. He peppered his explanations with a disorienting mixture of popular-culture references and math. When I asked him what it might look like to succeed at AI safety, he said: “Have you seen the Lego movie? Everything is awesome.”

One strain of Armstrong’s research looks at a specific approach to boxing called an “oracle” AI. In a 2012 paper with Nick Bostrom, who co-founded FHI, he proposed not only walling off superintelligence in a holding tank – a physical structure – but also restricting it to answering questions, like a really smart Ouija board. Even with these boundaries, an AI would have immense power to reshape the fate of humanity by subtly manipulating its interrogators. To reduce the possibility of this happening, Armstrong proposes time limits on conversations, or banning questions that might upend the current world order. He also has suggested giving the oracle proxy measures of human survival, like the Dow Jones industrial average or the number of people crossing the street in Tokyo, and telling it to keep these steady.
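The “keep these steady” idea can be sketched in a few lines. The indicator names, baseline figures and tolerance below are illustrative assumptions, not numbers from Armstrong’s paper: the oracle only returns an answer if, by its own prediction, acting on that answer leaves the proxy measures roughly where they are today.

```python
# Hypothetical sketch of the "keep the proxies steady" constraint.
# Indicator names, baselines and the 5% tolerance are illustrative only.

BASELINES = {"stock_index": 25000.0, "tokyo_crossings_per_hour": 12000.0}
TOLERANCE = 0.05

def proxies_steady(predicted):
    """True if every proxy stays within tolerance of its baseline."""
    return all(
        abs(predicted[name] - baseline) / baseline <= TOLERANCE
        for name, baseline in BASELINES.items()
    )

def oracle_answer(candidates):
    # Each candidate answer carries the oracle's prediction of the proxies
    # in the world where humans act on that answer.
    safe = [c for c in candidates if proxies_steady(c["predicted_proxies"])]
    if not safe:
        return None  # say nothing rather than destabilise the indicators
    return max(safe, key=lambda c: c["usefulness"])["text"]

print(oracle_answer([
    {"text": "risky plan", "usefulness": 9,
     "predicted_proxies": {"stock_index": 12000.0, "tokyo_crossings_per_hour": 500.0}},
    {"text": "modest plan", "usefulness": 6,
     "predicted_proxies": {"stock_index": 24800.0, "tokyo_crossings_per_hour": 11900.0}},
]))  # -> "modest plan": the more useful answer is rejected for destabilising the proxies
```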

Ultimately, Armstrong believes, it could be necessary to create, as he calls it in one paper, a “big red off button”: either a physical switch, or a mechanism programmed into an AI to automatically turn itself off in the event of a breakout. But designing such a switch is far from easy. It is not just that an advanced AI interested in self-preservation could prevent the button from being pressed. It could also become curious about why humans devised the button, activate it to see what happens, and render itself useless. In 2013, a programmer named Tom Murphy VII designed an AI that could teach itself to play Nintendo Entertainment System games. Determined not to lose at Tetris, the AI simply pressed pause – and kept the game frozen. “Truly, the only winning move is not to play,” Murphy observed wryly, in a paper on his creation.

For the strategy to succeed, an AI has to be uninterested in the button, or, as Tallinn put it: “It has to assign equal value to the world where it’s not existing and the world where it’s existing.” But even if researchers can achieve that, there are other challenges. What if the AI has copied itself several thousand times across the internet?
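Written out as a toy calculation (a hypothetical sketch, not the actual formalism used by Armstrong or Miri), the indifference condition looks like this: the agent’s utility in the world where the button has been pressed is set exactly equal to the best it could earn by carrying on, so interfering with the button never improves its score.

```python
# Hypothetical sketch of shutdown indifference. The numbers are illustrative:
# the "shutdown" utility is deliberately set equal to the best the agent could
# get by continuing, so tampering with the button never pays.

TASK_VALUE = 10.0       # utility of finishing the task if left running
SHUTDOWN_VALUE = 10.0   # correction term: the world without the agent is worth the same

def utility(button_pressed, task_done):
    if button_pressed:
        return SHUTDOWN_VALUE
    return TASK_VALUE if task_done else 0.0

actions = {
    "do the task":      {"button_pressed": False, "task_done": True},
    "block the button": {"button_pressed": False, "task_done": False},
    "press the button": {"button_pressed": True,  "task_done": False},
}

for name, outcome in actions.items():
    print(f"{name}: {utility(**outcome)}")
# "do the task" and "press the button" tie at 10.0, while "block the button"
# scores 0.0: preventing humans from pressing the switch is never the best move.
```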

The approach that most excites researchers is finding a way to make AI adhere to human values – not by programming them in, but by teaching AIs to learn them. In a world dominated by partisan politics, people often dwell on the ways in which our principles differ. But, Tallinn told me, humans have a lot in common: “Almost everyone values their right leg. We just don’t think about it.” The hope is that an AI might be taught to discern such immutable rules.

In the process, an AI would need to learn and appreciate humans’ less-than-logical side: that we often say one thing and mean another, that some of our preferences conflict with others, and that people are less reliable when drunk. Despite the challenges, Tallinn believes, it is worth trying because the stakes are so high. “We have to think a few steps ahead,” he said. “Creating an AI that doesn’t share our interests would be a horrible mistake.”
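The learning-rather-than-programming approach can be caricatured in a few lines. The observations and the counting rule below are invented for illustration (real value-learning research is far subtler): instead of being handed a list of rules, the system infers which options people reliably protect by watching the choices they make, contradictions and all.

```python
# Hypothetical caricature of value learning: infer preferences by watching
# choices instead of hard-coding rules. The observations are invented.

from collections import Counter

observations = [
    {"options": ("keep your leg", "lose it for $100"), "chosen": "keep your leg"},
    {"options": ("keep your leg", "lose it for $1m"),  "chosen": "keep your leg"},
    {"options": ("tell the truth", "tell a white lie"), "chosen": "tell a white lie"},
    {"options": ("tell the truth", "tell a white lie"), "chosen": "tell the truth"},
]

times_chosen = Counter(obs["chosen"] for obs in observations)

def learned_preference(a, b):
    """Prefer whichever option humans were observed choosing more often."""
    return a if times_chosen[a] >= times_chosen[b] else b

print(learned_preference("keep your leg", "lose it for $100"))   # -> keep your leg
print(learned_preference("tell the truth", "tell a white lie"))  # noisy: humans disagree
```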

* * *

On his last night in Cambridge, I joined Tallinn and two researchers for dinner at a steakhouse. A waiter seated our group in a white-washed cellar with a cave-like atmosphere. He handed us a one-page menu that offered three different kinds of mash. A couple sat down at the table next to us, and then a few minutes later asked to move elsewhere. “It’s too claustrophobic,” the woman complained. I thought of Tallinn’s comment about the damage he could wreak if locked in a basement with nothing but an internet connection. Here we were, in the box. As if on cue, the men contemplated ways to get out.

Tallinn’s guests included former genomics researcher Seán Ó hÉigeartaigh, who is CSER’s executive director, and Matthijs Maas, an AI researcher at the University of Copenhagen. They joked about an idea for a nerdy action flick titled Superintelligence v Blockchain!, and discussed an online game called Universal Paperclips, which riffs on the scenario in Bostrom’s book. The exercise involves repeatedly clicking your mouse to make paperclips. It’s not exactly flashy, but it does give a sense for why a machine might look for more expedient ways to produce office supplies.

Eventually, talk shifted toward bigger questions, as it often does when Tallinn is present. The ultimate goal of AI-safety research is to create machines that are, as Cambridge philosopher and CSER co-founder Huw Price once put it, “ethically as well as cognitively superhuman”. Others have raised the question: if we don’t want AI to dominate us, do we want to dominate AI? In other words, does AI have rights? Tallinn believes this is needless anthropomorphising. It assumes that intelligence equals consciousness – a misconception that annoys many AI researchers. Earlier in the day, CSER researcher José Hernández-Orallo joked that when speaking with AI researchers, consciousness is “the C-word”. (“And ‘free will’ is the F-word,” he added.)

In the cellar, Tallinn said that consciousness is beside the point: “Take the example of a thermostat. No one would say it is conscious. But it’s really inconvenient to face up against that agent if you’re in a room that is set to negative 30 degrees.”

Ó hÉigeartaigh chimed in. “It would be nice to worry about consciousness,” he said, “but we won’t have the luxury to worry about consciousness if we haven’t first solved the technical safety challenges.”

People get overly preoccupied with what superintelligent AI is, Tallinn said. What form will it take? Should we worry about a single AI taking over, or an army of them? “From our perspective, the important thing is what AI does,” he stressed. And that, he believes, may still be up to humans – for now.

This piece originally appeared in Popular Science magazine


Photograph: Science Picture Co/Getty

Jaan Tallinn at Futurefest in London in 2013. Photograph: Michael Bowles/Rex/Shutterstock
