The Borneo Post

AI’s ability to read hailed as milestone, but computers sputter

- By Drew Harwell

WHEN computer models designed by tech giants Alibaba and Microsoft this month surpassed humans for the first time in a reading-comprehension test, both companies celebrated the success as a historic milestone.

Luo Si, the chief scientist for natural-language processing at Alibaba’s AI research unit, struck a poetic note, saying, “Objective questions such as ‘what causes rain’ can now be answered with high accuracy by machines.”

Teaching a computer to read has for decades been one of artificial intelligence’s holiest grails, and the feat seemed to signal a coming future in which AI could understand words and process meaning with the same fluidity humans take for granted every day.

But computers aren’t there yet - and aren’t even really that close, said AI experts who reviewed the test results. Instead, the accomplishment highlights not just how far the technology has progressed, but also how far it still has to go.

“It’s a large step” for the companies’ marketing “but a small step for humankind,” said Oren Etzioni, chief executive of the Allen Institute for Artificial Intelligence, an AI research group funded by Microsoft co-founder Paul Allen.

“These systems are brittle, in that small changes to paragraphs result in very bad behaviour” and misunderstandings, Etzioni said. And when it comes to, say, drawing conclusions from two sentences or understanding implied ideas, the models lag even further behind: “These kind of implications that we do naturally, without even thinking about it, these systems don’t do.”

The test involved Stanford University’s Question Answering Dataset, a collection of more than 100,000 questions that has become one of the AI world’s top battlegrounds for testing how machines read and comprehend.

The models are given short paragraphs taken from more than 500 Wikipedia pages spanning a range of subjects, including Jacksonville, Florida; economic inequality; and the Black Death. Fed a paragraph about Super Bowl 50, for instance, the models are then asked which musicians headlined the half-time show.

In the first test, in August 2016, a model created by researchers at Singapore Management University lagged behind a measure of human performance - people on crowdsourced systems, such as Amazon’s Mechanical Turk, who earned money for taking surveys or completing small tasks.

But after dozens of subsequent tests, researchers this month submitted proof that their models had finally, if narrowly, beaten the humans - an 82.6 for Microsoft Research Asia’s models, compared with the humans’ 82.3.

As both Microsoft and the Chinese tech powerhouse Alibaba claimed first-in-AI victories, a flood of glowing media reports followed, positing that AI could not just read better than humans but would also, as Luo Si said in a statement, decrease “the need for human input in an unprecedented way.”

Microsoft said it is using similar models in its Bing search engine, and Alibaba said its technology could be used for online responses to medical inquiries.

But AI experts say the test is far too limited to compare with real reading. The answers aren’t generated from understanding the text, but from the system finding patterns and matching terms in the same short passage. The test was done only on cleanly formatted Wikipedia articles - not the wide-ranging corpus of books, news articles and billboards that fill most humans’ waking hours.

Adding gibberish to the passages, of a kind a human would easily ignore, often confused the AI.

Stephen Merity, a research scientist who works on language AI at cloud-computing giant Salesforce, said it was an “amazing achievement” but added that calling it superhuman was “madness.” “There’s no built-in ability for the model to determine or signal that it thinks the paragraph is insufficient to answer the question,” he said. “It’ll always spit you back something.”

Pepper is the first humanoid robot capable of recognising the principal human emotions and adapting his behaviour to the mood of his interlocutor. Photo: Softbank Robotics
