The Daily Telegraph

Silicon Valley’s mimicry machines are trying to erase authors – we must not allow it

The march of generative AI has put copyright of original reporting and writing in the spotlight

- Andrew Orlowski

Silicon Valley reacts to criticism like a truculent toddler throwing its toys out of the pram. But acquiring a bit of humility and self-discipline may be just what the child needs most. So the US tech industry should regard a lawsuit filed last week as a great learning experience.

The New York Times filed a copyright infringement case against Microsoft and OpenAI. The evidence presented alleges that ChatGPT created near-identical copies of The Times’ stories on demand, without the user first paying a subscription or seeing any advertising on The Times’ site. ChatGPT “recites Times content verbatim, closely summarises it, and mimics its expressive style”, the lawsuit explains.

In other words, the value of the material that the publisher generates is entirely captured by the technology company, which has invested nothing in creating it. This was exactly the situation that led to the creation of copyright in the Statute of Anne in 1710, which first established the legal right to copyright for an author. Then, it was the printing monopoly that was keeping all the dosh. The concept of an author, a subjective soul who viewed the world in a unique way, really arrived with the Enlightenment.

Now, the nerds of Silicon Valley want to erase it again. Attempts to do just that have already made them richer than anything a Stationers’ Guild member could imagine. “Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalisation by trillions of dollars in the past year alone,” the lawsuit notes, adding that OpenAI’s value has shot from zero to $90bn (£70bn). With OpenAI’s ChatGPT models now built into so many Microsoft products, this is a mimicry engine built on a global scale.

More ominously, the lawsuit also offers an abundance of evidence that “these tools wrongly attribute false information to The Times”. The bots introduce errors that weren’t there in the first place, it claims. They “hallucinate”, to use the Cambridge Dictionary’s word of the year.

Publishers who are anxious about the first concern – unauthorised reproduction – should be even more concerned about the second. Would a publisher be happy to see their outlet’s name next to a ChatGPT News response that confidently asserts, for example, that Iran has just launched cruise missiles at US destroyers? Or at London? These are hypotheticals, but being the newspaper that accidentally starts a third world war is not good for the brand in the long run.

Some midwit pundits and academics portrayed the lawsuit merely as a tactical licensing gambit. This year both Associated Press and the German giant Axel Springer have cut licensing deals with OpenAI. The New York Times is just sabre-rattling in pursuit of a better deal, so the argument goes.

In response to the lawsuit, OpenAI insisted it respects “the rights of content creators and owners and [is] committed to working with them to ensure they benefit from AI technology and new revenue models”.

However, the industry is worried about much more than money. Take, for example, the fact that the models that underpin ChatGPT need only to hear a couple of seconds of your child’s voice to clone it. AI does not need to return the next day to perfect it. After that, it has a free hand to do what it will with its newfound ability. So, the economic value of a licensing deal is impossible to estimate beforehand. And once done, it cannot be undone. As one publishing technology executive puts it, “you can’t unbake the cake”.

Previous innovations in reproduction, from the photocopier to Napster, were rather different beasts, as the entrepreneur and composer Ed Newton-Rex noted this week. Past breakthroughs were purely mechanical or technological changes. But this new generation of AI tools marries technology with knowledge. “They only work because their developers have used that copyrighted content to train on,” he wrote on Twitter, since rebranded as X.

Publishers and artists are entitled to think that without their work, AI would be nothing. This is why the large AI operations – and the investors hoping to make a killing from them – should be getting very nervous now.

They have been negligent in ignoring the issue until now. “Until recently, AI was a research community that enjoyed benign neglect from copyright holders who felt it was bad form to sue academics,” veteran AI journalist Timothy B Lee wrote recently on Twitter. “This gave a lot of AI researchers the mistaken impression that copyright law didn’t apply to them. It doesn’t seem out of the question that AI companies could lose these cases catastrophically and be forced to pay billions to plaintiffs and rebuild their models from scratch.”

Would wipe-and-rebuild be such a bad thing? Today’s generative AI is just a very early prototype. Many more prototypes may be developed and thrown away until a satisfactory design emerges. A ground-up rebuild can in some cases be the best thing that can happen to a technology product.

There’s certainly plenty of room for improvement with this new generation of AI models. A Stanford study of ChatGPT’s reliability in medicine found that less than half (41pc) of its responses in clinical conditions agreed with the known answer according to a consensus of physicians. The AI gave lethal advice 7pc of the time.

A functioning democracy needs original reporting and writing so that we all benefit from economic incentives for creativity. We must carry on that Enlightenment tradition of original expression. Some may find such arguments pompous. But there are bigger issues at stake.

A society that gives up on respect for individual expression, and chooses to worship a mimicry machine instead, probably deserves the fate that inevitably awaits.