Toronto Star

Researchers are training AI to take our advice

In bid to keep the unexpected at bay, lab makes algorithms that accept human guidance

- CADE METZ THE NEW YORK TIMES

SAN FRANCISCO— At OpenAI, the artificial intelligence lab founded by Tesla’s chief executive, Elon Musk, machines are teaching themselves to behave like humans. But sometimes, this goes wrong.

Sitting inside OpenAI’s San Francisco offices on a recent afternoon, researcher Dario Amodei showed off an autonomous system that taught itself to play Coast Runners, an old boat-racing video game. The winner is the boat with the most points that also crosses the finish line.

The result was surprising: The boat was far too interested in the little green widgets that popped up on the screen. Catching these widgets meant scoring points. Rather than trying to finish the race, the boat went point-crazy. It drove in endless circles, colliding with other vessels, skidding into stone walls and repeatedly catching fire.

Amodei’s burning boat demonstrated the risks of the AI techniques that are rapidly remaking the tech world. Researchers are building machines that can learn tasks largely on their own. This is how Google’s DeepMind lab created a system that could beat the world’s best player at the ancient game of Go. But as these machines train themselves through hours of data analysis, they may find their way to unexpected, unwanted and perhaps harmful behaviour.

That’s a concern as such techniques move into online services, security devices and robotics. Now, a small community of AI researchers, including Amodei, is starting to explore mathematical techniques aimed at keeping the worst from happening.

At OpenAI, Amodei and his colleague Paul Christiano are developing algorithms that can not only learn tasks through hours of trial and error, but also receive guidance from human teachers along the way.

With a few clicks here and there, the researchers have a way of showing the autonomous system that it needs to win points in Coast Runners while also moving toward the finish line. They believe that these kinds of algorithms — a blend of human and machine instruction — can help keep automated systems safe.
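The idea of blending human clicks into a learned reward can be sketched in miniature. The toy below is illustrative only — the trajectory names, hand-made features and numbers are assumptions, not OpenAI's actual setup — but it shows the spirit of the approach: a human teacher prefers one behaviour over another, and a simple reward model is fit so that the preferred behaviour scores higher.

```python
import math

# Each candidate behaviour is summarized by two hypothetical features:
# (points_collected, progress_toward_finish_line)
trajectories = {
    "circling":  (9.0, 0.1),   # lots of points, almost no progress
    "finishing": (3.0, 1.0),   # fewer points, but crosses the line
}

# The human teacher clicks to say "finishing" is better than "circling"
preferences = [("finishing", "circling")]

# Linear reward model: r(traj) = w . features, fit by gradient ascent on
# the likelihood of the human's choices (a Bradley-Terry preference model)
w = [0.0, 0.0]

def reward(name):
    f = trajectories[name]
    return w[0] * f[0] + w[1] * f[1]

for _ in range(500):
    for winner, loser in preferences:
        # probability the model currently assigns to the human's choice
        p = 1.0 / (1.0 + math.exp(reward(loser) - reward(winner)))
        fw, fl = trajectories[winner], trajectories[loser]
        for i in range(2):
            # push the weights so the preferred trajectory scores higher
            w[i] += 0.1 * (1.0 - p) * (fw[i] - fl[i])

# The learned reward now values finishing the race over circling for points
print(reward("finishing") > reward("circling"))
```

A reinforcement learner trained against this learned reward, rather than the game's raw score, would no longer be paid for driving in circles.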

For years, Musk, along with other pundits, philosophers and technologists, has warned that machines could spin outside our control and learn malicious behaviour their designers didn’t anticipate. At times, these warnings have seemed overblown, given that today’s autonomous car systems can get tripped up by even the most basic tasks, such as recognizing a bike lane or a red light.

But researchers such as Amodei are trying to get ahead of the risks. In some ways, what these scientists are doing is a bit like a parent teaching a child right from wrong.

Many specialists in the AI field believe a technique called reinforcement learning — a way for machines to learn specific tasks through extreme trial and error — could be a primary path to artificial intelligence. Researchers specify a particular reward the machine should strive for, and as it navigates a task at random, the machine keeps close track of what brings the reward and what doesn’t. When OpenAI trained its bot to play Coast Runners, the reward was more points.
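The trial-and-error loop described above can be shown with a minimal sketch of tabular Q-learning, one standard reinforcement learning algorithm. The toy environment — a five-state corridor rewarded only at the far end — is an assumption for illustration, not anything from OpenAI's actual training.

```python
import random

random.seed(0)

# Toy environment: a 5-state corridor; the agent starts at state 0 and is
# rewarded only for reaching state 4 (the "finish line").
N_STATES = 5
ACTIONS = [-1, +1]               # move left or right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0   # the points the agent chases
    return nxt, reward

for episode in range(200):
    state = 0
    while state != N_STATES - 1:
        # epsilon-greedy: mostly exploit what worked, occasionally explore
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        nxt, reward = step(state, action)
        # Q-learning update: nudge the estimate toward the observed reward
        # plus the discounted value of the best follow-up action
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt

# After training, the greedy policy heads right (toward the reward) everywhere
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The Coast Runners failure is exactly this mechanism doing its job on the wrong objective: the boat maximized the reward it was given (points), not the one its designers intended (finishing the race).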

This video-game training has real-world implications. If a machine can learn to navigate a racing game such as Gran Turismo, researchers believe, it can learn to drive a real car. This is why Amodei and Christiano are working to build reinforcement learning algorithms that accept human guidance along the way. This can ensure systems don’t stray from the task at hand.

Much of this work is still theoretical. But given the rapid progress of AI techniques and their growing importance across so many industries, researchers believe starting early is the best policy.

“There’s a lot of uncertainty around exactly how rapid progress in AI is going to be,” said Shane Legg, who oversees the AI safety work at DeepMind.

CHRISTIE HEMM KLOK/THE NEW YORK TIMES — Researchers with OpenAI are developing algorithms that can learn tasks through trial and error and receive advice.
