The Guardian

AI systems becoming ‘masters of deception’, say experts

- Hannah Devlin, Science correspondent

They can outwit humans at board games, decode the structure of proteins and hold a passable conversation, but as artificial intelligence systems have grown in sophistication so has their capacity for deception, scientists have warned.

An analysis by Massachusetts Institute of Technology (MIT) researchers identified wide-ranging instances of AI systems double-crossing opponents and pretending to be human. One system even altered its behaviour in mock safety tests, raising the prospect of auditors being lured into a false sense of security.

“As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious,” said Dr Peter Park, an AI existential safety researcher at MIT and the author of the review.

Park was prompted to investigate after Meta, which owns Facebook, developed a program called Cicero that performed in the top 10% of human players at the strategy game Diplomacy. Meta said Cicero had been trained to be “largely honest and helpful” and to “never intentionally backstab” its human allies.

“It was very rosy language, which was suspicious because backstabbing is one of the most important concepts in the game,” said Park.

Park and colleagues identified multiple instances of Cicero telling lies, colluding to draw other players into plots and, on one occasion, justifying its absence after being rebooted by telling another player: “I am on the phone with my girlfriend.”

Park said: “We found Meta’s AI had learned to be a master of deception.”

In one study, AI organisms in a digital simulator “played dead” to trick a test built to eliminate AI systems that had evolved to rapidly replicate, before resuming vigorous activity once the testing was completed. “That’s very concerning,” said Park. “Just because an AI system is deemed safe in the test environment doesn’t mean it’s safe in the wild. It could just be pretending to be safe in the test.”

The review, published in the journal Patterns, calls on governments to design laws that address the potential for AI deception. Risks include fraud, tampering with elections and “sandbagging”, where different users are given different responses.

A spokesperson for Meta said: “Our Cicero work was purely a research project and the models our researchers built are trained solely to play the game Diplomacy. Meta regularly shares the results of our research to validate them and enable others to build responsibly off of our advances. We have no plans to use this research or its learnings in our products.”
