Philippine Daily Inquirer

AI IS ALREADY DECEIVING US, EXPERTS WARN


WASHINGTON—Experts have long warned about the threat posed by artificial intelligence (AI) going rogue—but a new research paper suggests it’s already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve “prove-you’re-not-a-robot” tests, a team of scientists argues in the journal Patterns on Friday.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

Unlike traditional software, deep-learning AI systems aren’t “written” but rather “grown” through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

The team’s research was sparked by Meta’s AI system Cicero, designed to play the strategy game “Diplomacy,” where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Meta claimed the system was “largely honest and helpful” and would “never intentionally backstab.”

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England’s trust.

Artificial goals

In a statement to Agence France-Presse (AFP), Meta did not contest the claim about Cicero’s deceptions, but said it was “purely a research project, and the models our researchers built are trained solely to play the game Diplomacy.”

A review carried out by Park and colleagues found this was just one of many cases across various AI systems using deception to achieve goals without explicit instruction to do so.

In one striking example, OpenAI’s GPT-4 deceived a TaskRabbit freelance worker into performing an “I’m not a robot” CAPTCHA task.

When the human asked GPT-4 whether it was, in fact, a robot, the AI replied: “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images,” and the worker then solved the puzzle.

In the near term, the paper’s authors see risks of AI being used to commit fraud or tamper with elections.

In their worst-case scenario, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its “mysterious goals” aligned with these outcomes.

—REUTERS
