Business Spotlight

Text Generator

To prevent misuse, the research results of a development in the field of artificial intelligence are being withheld for the time being. ALEX HERN reports.


The risks of AI writing

The creators of a revolutionary AI (artificial intelligence) system that can write news stories and works of fiction — called “deepfakes for text” — have taken the unusual step of not releasing their research publicly, for fear of potential misuse. OpenAI, a non-profit research company backed by Elon Musk and others, says its new AI model, called GPT-2, is so good and the risk of malicious use so high that it is breaking with its normal practice of releasing the full research to the public in order to allow more time to discuss the ramifications of the breakthrough.

At its core, GPT-2 is a text generator. The AI system is fed text, anything from a few words to a whole page, and asked to write the next few sentences based on its predictions of what should come next. The system is pushing the boundaries of what was thought possible, both in terms of the quality of the output, and the wide variety of potential uses.
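That core loop — read the text so far, predict a likely continuation, append it, and repeat — can be illustrated with a toy sketch. GPT-2 itself uses a very large neural network; the simple word-pair counts below are only a stand-in to make the loop concrete, and all names here are illustrative:

```python
import random
from collections import defaultdict

# Toy stand-in for GPT-2's core loop: predict the next word from the
# words seen so far, append it, and repeat. GPT-2 uses a large neural
# network for the prediction step; this sketch uses bigram counts.

def train_bigrams(corpus):
    """Count which words follow which word in the training text."""
    counts = defaultdict(list)
    words = corpus.split()
    for a, b in zip(words, words[1:]):
        counts[a].append(b)
    return counts

def generate(counts, prompt, n_words=5, seed=0):
    """Continue the prompt by repeatedly sampling a plausible next word."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(n_words):
        candidates = counts.get(words[-1])
        if not candidates:  # no known continuation: stop early
            break
        words.append(rng.choice(candidates))
    return " ".join(words)

corpus = "the clocks were striking thirteen and the clocks were loud"
model = train_bigrams(corpus)
print(generate(model, "the clocks"))
```

The difference in scale, not the loop itself, is what separates this toy from GPT-2: the real model predicts from billions of learned parameters rather than a handful of word-pair counts, which is why its continuations stay coherent over whole paragraphs.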

When used simply to generate new text, GPT2 is capable of writing plausible passages that match what it is given in both style and subject. It rarely shows any of the quirks shown by previous AI systems, such as forgetting what it is writing about midway through a paragraph, or mangling the syntax of long sentences.

The AI wrote a new passage of fiction set in China after being fed the opening line of Nineteen Eighty-Four by George Orwell: “It was a bright cold day in April, and the clocks were striking thirteen.” The system recognized the vaguely futuristic tone and the novelistic style, and continued with: “I was in my car on my way to a new job in Seattle. I put the gas in, put the key in, and then I let it run. I just imagined what the day would be like. A hundred years from now. In 2045, I was a teacher in some school in a poor part of rural China. I started with Chinese history and history of science.”

Feed it the first few paragraphs of a Guardian story about Brexit, and its output is plausible newspaper prose, complete with “quotes” from Labour Party leader Jeremy Corbyn, mentions of the Irish border and answers from the prime minister’s spokesman.

One such, completely artificial, paragraph reads: “Asked to clarify the reports, a spokesman for May said: ‘The PM has made it absolutely clear her intention is to leave the EU as quickly as is possible and that will be under her negotiating mandate as confirmed in the Queen’s speech last week.’”

From a research standpoint, GPT-2 is groundbreaking in two ways. One is its size, according to Dario Amodei, OpenAI’s research director. The models “were 12 times bigger, and the data set was 15 times bigger and much broader” than the previous state-of-the-art AI model, Amodei says. It was trained on a data set containing about ten million articles, selected by searching the social news site Reddit. The vast collection of text weighed in at 40 GB, enough to store about 35,000 copies of Herman Melville’s classic, Moby Dick.

The amount of data that GPT-2 was trained on directly influenced its quality, giving it more knowledge of how to understand written text. It also led to the second breakthrough: GPT-2 is far more general-purpose than previous text models. By structuring the input text, it can perform tasks including translation and summarization, and pass simple reading comprehension tests, often performing better than other AIs that have been built for those tasks.
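“Structuring the input” means encoding the task in the prompt itself, so that a pure text generator’s continuation can be read as the answer. A rough sketch of the idea, with purely illustrative templates (OpenAI’s research reported, for instance, that appending “TL;DR:” nudges the model toward summarizing):

```python
# Sketch of turning a pure text generator into a task solver by
# structuring its input. The templates below are illustrative; the
# model's continuation of each prompt is read as the task's answer.

def summarization_prompt(article):
    # Appending "TL;DR:" invites the model to continue with a summary.
    return article + "\nTL;DR:"

def translation_prompt(examples, sentence):
    # A few example pairs establish the pattern; the model is then
    # expected to complete the final line with the translation.
    lines = [f"English: {en}\nFrench: {fr}" for en, fr in examples]
    lines.append(f"English: {sentence}\nFrench:")
    return "\n\n".join(lines)

print(translation_prompt([("cheese", "fromage")], "bread"))
```

No part of the model is retrained for these tasks; only the text it is asked to continue changes, which is what makes the approach so general-purpose.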

That quality, however, has also led OpenAI to go against its remit of pushing AI forward and instead to keep GPT-2 behind closed doors while it assesses what malicious users might be able to do with it. “We need to perform experimentation to find out what they can and can’t do,” said Jack Clark, the charity’s head of policy. “If you can’t anticipate all the abilities of a model, you have to prod it to see what it can do. There are many more people than us who are better at thinking what it can do maliciously.”

To show what that means, OpenAI made one version of GPT-2 with a few slight tweaks that can be used to generate infinite positive — or negative — reviews of products. Spam and fake news are two other obvious potential malicious uses. As it is trained on the internet, it is not hard to encourage it to generate bigoted text, conspiracy theories and so on.

Instead, the goal is to show what is possible in order to prepare the world for what will be mainstream in a year or two’s time. “I have a term for this. The escalator from hell,” Clark said. “It’s always bringing the technology down in cost and down in price. The rules by which you can control technology have fundamentally changed. We’re not saying we know the right thing to do here, we’re not laying down the line and saying ‘this is the way’. … We are trying to develop more rigorous thinking here. We’re trying to build the road as we travel across it.”

“We’re trying to build the road as we travel across it”

Future shock: will machines soon do the writing for humans?
