Bangkok Post

Outsourcing language

Finally, a machine that can finish your sentences

- CADE METZ NYT NEWS SERVICE

In August, researchers from the Allen Institute for Artificial Intelligence, a lab based in Seattle, unveiled an English test for computers. It examined whether machines could complete sentences like this one:

On stage, a woman takes a seat at the piano. She
a) sits on a bench as her sister plays with the doll.
b) smiles with someone as the music plays.
c) is in the crowd, watching the dancers.
d) nervously sets her fingers on the keys.

For you, that would be an easy question. But for a computer, it was pretty hard. While humans answered more than 88% of the test questions correctly, the lab’s AI systems hovered around 60%. Among experts — those who know just how difficult it is to build systems that understand natural language — that was an impressive number.

Then, two months later, a team of Google researchers unveiled a system called Bert. Its improved technology answered those questions just as well as humans did — and it was not even designed to take the test.

Bert’s arrival punctuated a significant development in artificial intelligence. Over the past several months, researchers have shown that computer systems can learn the vagaries of language in general ways and then apply what they have learned to a variety of specific tasks.

Built in quick succession by several independent research organisations, including Google and the Allen Institute, these systems could improve technology as diverse as digital assistants like Alexa and Google Home, and software that automatically analyses documents in law firms, hospitals, banks and other businesses.

“Each time we build new ways of doing something close to human-level, it allows us to automate or augment human labour,” said Jeremy Howard, founder of Fast.ai, an independent lab based in San Francisco that is among those at the forefront of this research. “This can make life easier for a lawyer or a paralegal. But it can also help with medicine.”

It may even lead to technology that can — finally — carry on a decent conversation.

But there is a downside. On social-media services like Twitter, this new research could also lead to more convincing bots designed to fool us into thinking they are human, Howard said.

Researchers have shown that rapidly improving AI techniques can facilitate the creation of fake images that look real. As these kinds of technologies move into the language field as well, Howard said, we may need to be more sceptical than ever about what we encounter online.

These new language systems learn by analysing millions of sentences written by humans. A system built by OpenAI, a lab based in San Francisco, analysed thousands of self-published books, including romance novels, science fiction and more. Google’s Bert analysed these same books plus the length and breadth of Wikipedia.

Each system learned a particular skill by analysing all that text. OpenAI’s technology learned to guess the next word in a sentence. Bert learned to guess missing words anywhere in a sentence. But in mastering these specific tasks, they also learned about how language is pieced together.

If Bert can guess the missing words in millions of sentences (such as “The man walked into a store and bought a ____ of milk”), it can also understand many of the fundamental relationships between words in the English language, said Jacob Devlin, the Google researcher who oversaw the creation of Bert. (Bert is short for Bidirectional Encoder Representations from Transformers.)
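For readers who want to see that word-guessing game in action, here is a minimal sketch. It assumes the open-source Hugging Face “transformers” library and its public “bert-base-uncased” checkpoint, neither of which is mentioned in the article; the example sentence is the one quoted above.

```python
# A minimal sketch of Bert's masked-word guessing, using the open-source
# Hugging Face "transformers" library (an assumption; the article describes
# Google's own release, not this wrapper).
from transformers import pipeline

# Load a pretrained BERT model with its original masked-word objective.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model to guess the blanked-out word from the article's example.
for guess in fill_mask("The man walked into a store and bought a [MASK] of milk."):
    print(guess["token_str"], round(guess["score"], 3))
```

Typical top guesses are words like “gallon” or “bottle”, which is the kind of fundamental word-to-word knowledge Devlin describes.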

The system can apply this knowledge to other tasks. If researchers provide Bert with a bunch of questions and their answers, it learns to answer other questions on its own. Then, if they feed it news headlines that describe the same event, it learns to recognise when two sentences are similar. Usually, machines can recognise only an exact match.
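As an illustration of that fine-tuning step, the sketch below attaches a small classification head to a pretrained Bert model and encodes two headlines as a single pair. The library, model name and headlines are assumptions added for illustration; the head here is untrained and would, in practice, be fitted on labelled pairs of examples, as the article describes.

```python
# A minimal sketch of fine-tuning Bert for sentence-pair similarity, using the
# open-source Hugging Face "transformers" library (an assumption).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # two labels: "same event" or "different"
)

# Encode two hypothetical headlines as one sentence pair, the input format
# used when Bert is fine-tuned for similarity tasks.
inputs = tokenizer(
    "Google unveils new language system",
    "New AI model from Google matches humans on reading test",
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits  # raw scores for the two labels (untrained here)
print(logits)
```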

Bert can handle the “common sense” test from the Allen Institute. It can also handle a reading comprehension test wherein it answers questions about encyclopaedia articles. What is oxygen? What is precipitation? In another test, it can judge the sentiment of a movie review. Is the review positive or negative?

This kind of technology is “a step toward a lot of still-faraway goals in AI, like technologies that can summarise and synthesise big, messy collections of information to help people make important decisions”, said Sam Bowman, a professor at New York University who specialises in natural-language research.

In the weeks after the release of OpenAI’s system, outside researchers applied it to conversation. An independent group of researchers used OpenAI’s technology to create a system that leads a competition, organised by several top labs including the Facebook AI lab, to build the best chatbot. And this month, Google “open-sourced” its Bert system, so others can apply it to additional tasks. Devlin and his colleagues have already trained it in 102 languages.

Sebastian Ruder, a researcher based in Ireland who collaborates with Fast.ai, sees the arrival of systems like Bert as a “wake-up call” for him and other AI researchers because they had assumed language technology had hit a ceiling. “There is so much untapped potential,” he said.

The complex mathematical systems behind this technology are called neural networks. In recent years, this type of machine learning has accelerated progress in subjects as varied as face-recognition technology and driverless cars. Researchers call this “deep learning”.

Bert succeeded in part because it leaned on enormous amounts of computer-processing power that was not available to neural networks in years past. It analysed all those Wikipedia articles over the course of several days using dozens of computer processors built by Google specifically for training neural networks.

The ideas that drive Bert have been around for years, but they started to work because modern hardware could juggle much larger amounts of data, Devlin said.

But there is reason for scepticism that this technology can keep improving quickly, because researchers tend to focus on the tasks they can make progress on and avoid the ones they can’t, said Gary Marcus, a New York University psychology professor who has long questioned the effectiveness of neural networks. “These systems are still a really long way from truly understanding running prose,” he said.


Jacob Devlin, the Google researcher who oversaw the creation of Bert, a system that understands natural language, at the Google offices in Seattle.
