‘Cat-and-mouse game’ creates believable fake photos
The woman in the photo seems familiar. She looks like Jennifer Aniston, the “Friends” actress, or Selena Gomez, the child star turned pop singer. But not exactly.
She appears to be a celebrity, one of the beautiful people photographed outside a movie première or an awards show. And yet you cannot quite place her.
That’s because she’s not real. She was created by a machine.
The image is one of the faux celebrity photos generated by software under development at Nvidia, the big-name computer chipmaker that is investing heavily in research involving artificial intelligence.
At a lab in Finland, a small team of Nvidia researchers recently built a system that can analyze thousands of (real) celebrity snapshots, recognize common patterns and create new images that look much the same — but are still a little different. The system can also generate realistic images of horses, buses, bicycles, plants and many other common objects.
The project is part of a vast and varied effort to build technology that can automatically generate convincing images — or alter existing images in equally convincing ways. The hope is that this technology can significantly accelerate and improve the creation of computer interfaces, games, movies and other media, eventually allowing software to create realistic imagery in moments rather than the hours — if not days — it can now take human developers.
In recent years, thanks to a breed of algorithm that can learn tasks by analyzing vast amounts of data, companies like Google and Facebook have built systems that can recognize faces and common objects with an accuracy that rivals the human eye. Now, these and other companies, alongside many of the world’s top academic AI labs, are using similar methods to both recognize and create.
Nvidia’s images can’t match the resolution of images produced by a top-of-the-line camera, but when viewed on even the largest smartphones, they are sharp, detailed and, in many cases, remarkably convincing.
Like other prominent AI researchers, the Nvidia team believes the techniques that drive this project will continue to improve in the months and years to come, generating significantly larger and more complex images.
“We think we can push this further, generating not just photos but 3-D images that can be used in computer games and films,” said Jaakko Lehtinen, one of the researchers behind the project.
As it built a system that generates new celebrity faces, the Nvidia team went a step further in an effort to make them far more believable. It set up two neural networks — one that generated the images and another that tried to determine whether those images were real or fake. These are called generative adversarial networks, or GANs. In essence, one system does its best to fool the other — and the other does its best not to be fooled.
“The computer learns to generate these images by playing a cat-and-mouse game against itself,” Lehtinen said.
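For readers who want to see that tug-of-war in miniature, here is a rough sketch in Python. It is not Nvidia's system, which trains deep neural networks on photographs. As a stand-in, the "real" data here are just numbers clustered around 4, the generator is a two-parameter formula that turns random noise into a number, and the discriminator is a two-parameter formula that guesses the odds a number is real. The function name, the parameter names (a, b, w, c) and every setting are assumptions made up for illustration.

```python
import numpy as np


def sigmoid(t):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-t))


def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    """Toy one-dimensional GAN (illustrative assumption, not Nvidia's model).

    Generator:      fake = a * noise + b
    Discriminator:  p_real = sigmoid(w * x + c)
    """
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0  # generator starts out producing numbers near 0
    w, c = 0.0, 0.0  # discriminator starts out guessing 50/50

    for _ in range(steps):
        real = rng.normal(4.0, 1.0, batch)   # stand-in for real photos
        noise = rng.normal(0.0, 1.0, batch)  # random input to the generator
        fake = a * noise + b

        # Discriminator's turn: get better at telling real from fake by
        # lowering the loss -log D(real) - log(1 - D(fake)).
        p_real = sigmoid(w * real + c)
        p_fake = sigmoid(w * fake + c)
        grad_w = np.mean((p_real - 1.0) * real) + np.mean(p_fake * fake)
        grad_c = np.mean(p_real - 1.0) + np.mean(p_fake)
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator's turn: change the fakes so the updated discriminator
        # rates them as more likely real, lowering the loss -log D(fake).
        p_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - p_fake) * w  # how the loss changes per fake sample
        a -= lr * np.mean(grad_fake * noise)
        b -= lr * np.mean(grad_fake)

    return a, b, w, c
```

Calling train_toy_gan(steps=1) shows the first exchange: the discriminator learns that larger numbers look more real, and the generator responds by shifting its output upward. Over longer runs, a toy setup like this may circle the target rather than settle on it, one reason real systems need far more careful engineering.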
A second team of Nvidia researchers recently built a system that can automatically alter a street photo taken on a summer’s day so that it looks like a snowy winter scene. Researchers at the University of California, Berkeley, have designed another that learns to convert horses into zebras and Monets into van Goghs. DeepMind, a London-based AI lab owned by Google, is exploring technology that can generate its own videos. And Adobe is fashioning similar machine learning techniques with an eye toward pushing them into products like Photoshop, its popular image design tool.
But new concerns come with the power to create this kind of imagery.
With so much attention on fake media these days, we could soon face an even wider range of fabricated images than we do today.
“The concern is that these techniques will rise to the point where it becomes very difficult to discern truth from falsity,” said Tim Hwang, who previously oversaw AI policy at Google and is now director of the Ethics and Governance of Artificial Intelligence Fund, an effort to fund ethical AI research. “You might believe that accelerates problems we already have.”
Researchers are also using a wide range of other machine learning techniques to edit video in more convincing — and sometimes provocative — ways.
In August, a group at the University of Washington made headlines when it built a system that could put new words into Barack Obama’s mouth in a video. Others, including Pinscreen, a California startup, and iFlyTek of China, are developing similar techniques using images of President Donald Trump.
The results are not completely convincing. But the rapid progress of GANs and other techniques points to a future where it becomes easier for anyone to generate faux images or doctor the real thing. That is cause for real concern among experts like Hwang.
Many of us still put a certain amount of trust in photos and videos that we don’t necessarily put in text or word of mouth. Hwang believes the technology will evolve into a kind of AI arms race pitting those trying to deceive against those trying to identify the deception.
Lehtinen agrees that as time goes on, we may have to rethink the very nature of imagery. “We are approaching some fundamental questions,” he said.