Deepfakes: the quest for real digital people
In Hollywood, technological advances help filmmakers but could lead to a glut of doctored videos involving politicians or pornography.
Every Hollywood actor is desperate to cling to their youth. Now, Will Smith, the star of Independence Day and Men in Black, can be 23 forever. But unlike his Botoxed peers, the secret of Smith’s fresh face is a new breed of digital doppelgänger, offering unprecedented realism.
In Gemini Man, his latest blockbuster, the 51-year-old actor plays a retired assassin whose younger clone is sent to kill him. The 23-year-old Smith clone, known in the movie as Junior, is not the real actor hidden under layers of make-up or prosthetics. Instead, he is a completely digital recreation, constructed from his skeleton to the tips of his eyelashes by New Zealand-based visual effects studio Weta Digital.
Hollywood insiders estimate that the Junior character alone cost tens of millions of dollars to make — perhaps twice as much as hiring the real Will Smith. Yet just a few weeks before Gemini Man’s premiere, another, far cheaper, digital clone of Will Smith appeared in a reboot of 1999’s hit science fiction movie The Matrix. In a two-minute YouTube video, Smith took the place of Keanu Reeves to play Matrix hero Neo, taking the red pill and pausing bullets in midair.
The clip was made without Gemini Man’s $138m budget. Instead its creator, a YouTuber known only as Sham00k, employed free software called DeepFaceLab to superimpose Smith’s face onto Reeves’s within The Matrix footage. “Deepfakes” like these have been used to turn comedian Jordan Peele into Barack Obama, or actor Bill Hader into Tom Cruise, with each clip more believable than the last.
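The basic idea can be sketched in a few lines of Python. The snippet below is a deliberately crude illustration of the face-swap concept — find a face in each image, then blend one onto the other — and is not DeepFaceLab’s actual code, which additionally trains neural networks on thousands of frames of each face. The file names are placeholders.

```python
# A crude face-swap sketch using only OpenCV. Tools such as DeepFaceLab
# go much further, training neural networks on each face; this shows
# just the detect, resize and blend step.
import cv2
import numpy as np

# OpenCV ships a classic Haar-cascade face detector with the library.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def largest_face(image):
    """Return the bounding box (x, y, w, h) of the biggest face found."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        raise ValueError("no face found")
    return max(faces, key=lambda f: f[2] * f[3])

source = cv2.imread("source_face.jpg")    # the face to superimpose
target = cv2.imread("target_frame.jpg")   # a frame of the original footage

sx, sy, sw, sh = largest_face(source)
tx, ty, tw, th = largest_face(target)

# Resize the source face to fit the target face, then blend it in with
# Poisson ("seamless") cloning so the edges match the surrounding skin.
face = cv2.resize(source[sy:sy + sh, sx:sx + sw], (tw, th))
mask = np.full(face.shape[:2], 255, dtype=np.uint8)
centre = (tx + tw // 2, ty + th // 2)
result = cv2.seamlessClone(face, target, mask, centre, cv2.NORMAL_CLONE)
cv2.imwrite("swapped_frame.jpg", result)
```

Run frame by frame over a clip, even this naive paste produces a recognisable — if unconvincing — swap; it is the learned neural network step that makes modern deepfakes believable.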
Deepfakes and the high-end effects seen in Gemini Man offer two paths to manipulating people in videos. But as the two techniques converge, the cost of a fully digital human is plunging. The “uncanny valley” is finally being bridged, prompting some in Silicon Valley to wonder when virtual assistants, such as Alexa, will no longer be just disembodied voices.
“The price of realism has dropped dramatically in the last 20 years,” says Paul Franklin, co-founder and creative director at award-winning visual effects studio DNEG. “Things that were the domain of companies like DNEG can now be done with off-the-shelf software. It’s inevitable [that] the kinds of techniques in Gemini Man will be stock-in-trade in the next 10 years.”
That it has never been easier for Weta wannabes to insert people into short videos has led to warnings from politicians, privacy activists and Hollywood itself. Convincing fake videos could be used to manipulate electorates, defraud companies or bully individuals — even if for now deepfakers’ principal hobby is to insert unwitting celebrities into pornography.
A report in September from Deeptrace Labs, a cybersecurity start-up whose technology detects manipulated videos, found that the number of deepfakes posted online had almost doubled in the past six months to 14,678. Of those, 96% were classified as pornography.
“It’s definitely evolving fast,” says Katja Bego, a data scientist who is researching deepfakes at Nesta, a tech-focused nonprofit organisation.
Facebook, Google and Microsoft have driven efforts to improve deepfake detection, hoping to prevent misleading videos from spreading across their networks.
Creating realistic digital people the traditional Hollywood way is still a daunting task. Bringing Junior to life took “hundreds of hours of painstaking animators’ and modellers’ time”, says Stuart Adcock, head of facial motion at Weta, which was founded by Lord of the Rings director Peter Jackson. “At times it felt more like we were making a real human from the ground up than a visual effect.”
But with advances in machine learning and processing power available on smartphones and cloud computing systems, some predict that Gemini Man-style effects could one day become as accessible as selfie-retouching smartphone apps such as Facetune are today.
“Deepfakes are the next step in a long chain of the democratisation of media production,” says Peter Rojas, a venture capital investor at Betaworks Ventures. “Deepfakes are the democratisation of computer-generated imagery. It’s not that different to what blogging did for publishing.”
Deepfakes are barely two years old, but the biggest change in recent months is the amount of input data required to create a convincing video. In September Chinese app Zao caused a viral sensation by allowing users to trade places with Leonardo DiCaprio in a selection of scenes from movies such as Titanic. Because Zao’s range of clips is limited and preselected, the process takes just a few seconds and requires only a single photograph of the face-swapper.
“Before, it was easy to do this for celebrities and politicians because you have a ton of moving footage for them [on the internet],” Bego says. “Now you just need one picture of a normal person.”
Despite the pace of deepfakes’ progress, traditional Hollywood effects studios such as Weta see little application for the technology in today’s blockbusters.
Deepfakes may be popping up on smartphones in YouTube clips and Facebook feeds, but in Gemini Man, Junior’s digital face is shown in lingering close-ups across a vast Imax screen. While the effect is more convincing in scenes set in dark catacombs than in bright sunlight, it has nonetheless been hailed as a breakthrough for human realism. The difference is obvious even from the effects of two or three years ago, such as Princess Leia’s brief appearance in the 2016 Star Wars spin-off, Rogue One.
“The really tricky thing is the way the human face moves, that has been a holy grail for visual effects forever,” says Franklin, whose studio has worked on films including The Avengers series and Ex Machina.
“We are all experts in what faces look like,” he says. “If something is even slightly off — if the muscles around the mouth don’t move correctly or the eyes don’t look in the right direction — we all know about it instantly.” That is why many deepfakes are still easy to spot.
Achieving the level of quality seen in Gemini Man or Avengers: Endgame’s “Smart Hulk” is costly and time-consuming. “In high-end visual effects, we price it out in millions of dollars per minute,” says Franklin. “It’s incredibly labour intensive.” Even on television shows and video games, where budgets are typically more constrained, a “virtual human” effect might come with a six-figure bill.
In Hollywood, that investment pays off if audiences flock to see it on the big screen. Jesse Sisgold, president and COO at Skydance Media, one of the production companies behind Gemini Man, says the film’s “revolutionary technology establishes a new benchmark for the theatrical experience”.
The first year of production on Gemini Man at Weta was spent building a digital version of Will Smith as he is now. It included a model of his skull, a photogrammetric map of his skin pores and face lines, and just the right mix of digital oil and water for his eyes to look real. Then, Weta’s Adcock explains, his team compared that model to a 23-year-old “skin double”, as well as drawing on footage from Smith’s 1990s movies and photos of him as young as eight years old, to determine how features such as his nose, chin and jaw had aged.
Adding to the challenge, Adcock says, was living up to the audience’s memories of Smith, who has been a familiar face to millions, especially in the US, since The Fresh Prince of Bel-Air first aired in 1990. For one shot, director Ang Lee asked the Weta team to make Junior look as though he was a “ruthless assassin” but sympathetic enough that the audience would still want to “sit down and enjoy a nice warm bowl of chicken soup” with the character.
“We wrestled with that concept for a while before finally landing on the recipe,” says Adcock. Weta made a “small tweak” to the epicanthic fold, where the upper eyelid meets the inner corner of the eye, and put “more softness” in the eyes.
“Technically it’s a huge challenge but there are also many creative choices at play to make shots work,” he says. “It’s a balance of art and science. We can’t just have one-click solutions.”
The process used by Weta and other effects studios is at odds with the idea of fully automated deepfakes — and points to a broader challenge with artificial intelligence systems. Deep learning and neural networks are “black boxes” that take data as input and spit out a result, without explaining what happens in between.
“Deepfakes allow you to get a result that is convincing in some cases but imagine art directing eye behaviour from one frame to the next,” says Adcock. “That is the level of control we need.”
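The point is easy to demonstrate. In the toy Python sketch below — illustrative only, not Weta’s or any production pipeline’s code — a miniature autoencoder squeezes a face image down to 128 numbers and reconstructs it, but none of those intermediate numbers is a dial labelled “gaze” or “mouth shape” that an artist could adjust.

```python
# Toy illustration of the "black box" problem -- not production code.
# A miniature autoencoder compresses a face image to 128 numbers and
# reconstructs it; the result can look right, but no single number in
# the middle corresponds to a feature an artist could direct.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU())
decoder = nn.Sequential(nn.Linear(128, 64 * 64), nn.Sigmoid())

frame = torch.rand(1, 64, 64)       # stand-in for a 64x64 face crop
latent = encoder(frame)             # 128 opaque numbers
reconstruction = decoder(latent)    # a face comes out the other end

# Nothing in `latent` is labelled "eye direction" or "mouth corner":
# to change the eyes frame by frame, an artist has no control to turn,
# which is exactly the level of control Adcock describes needing.
print(latent.shape, reconstruction.shape)   # (1, 128), (1, 4096)
```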
Suranga Chandratillake, a tech investor at Balderton Capital, says today’s deepfake creation systems are fragmented and incomplete. Despite the promise of instant fakery, the best-quality examples still require a lot of manual fine-tuning to ensure a convincing clip.
“When you read the hyperbolic stuff that the world is going to change [due to deepfakes], that depends on it being really good and instant. That just can’t be done,” Chandratillake says. “I’m not sure the current approach will ever get you there.”
This “man behind the curtain” problem affects other artificial intelligence-led systems, such as self-driving cars, he adds. Automation can get you 90% of the way there, but manual intervention is still required to reach the desired destination safely.
That adds to another challenge for deepfake producers today: their almost complete lack of a business model or corporate sponsorship. “The interesting hurdle [to overcome] would be if there is progress in commercialising this,” says Bego. “There is not that much money being pumped into making these much better.”
That may be starting to change. Despite many in the visual effects industry dismissing deepfakes as a gimmick, the first Hollywood movie to incorporate the technique was released earlier in 2019 — without audiences even noticing.
Deepfakery shaved several years off British actor Bill Nighy in Pokémon Detective Pikachu, according to Tim Webber, chief creative officer of Framestore, the visual effects group that worked on the movie adaptation of the video-game franchise.
“The reason we ended up using deepfake was partly a wish to experiment with it,” says Webber. “We had played around with it before, not terribly seriously, and it hadn’t worked.”
Just as models in fashion magazines might be given the Photoshop treatment, “de-ageing” techniques are widely used (though not often advertised) to digitally airbrush Hollywood stars. De-ageing plays a prominent role in Netflix’s forthcoming crime drama The Irishman, to make Robert De Niro and Al Pacino look younger in flashback scenes. In most cases, unlike the fully digital Junior in Gemini Man, de-ageing involves computer-generated image models being melded with, or pasted on top of, standard camera footage of the actors.
In Pokémon Detective Pikachu, though, Framestore’s deepfake tinkering made it to the big screen. Nighy’s character, Howard Clifford, is de-aged for just a few seconds, when his younger self is shown in a low-resolution archival news clip in the opening sequences.
“We were only doing de-ageing on a few shots so it wasn’t worth us building a full computer-generated model of an actor’s face,” Webber says. “We could use a younger picture of that actor to train the deepfake model.”
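In the roughest outline, the approach Webber describes looks like the toy Python below: fit a small reconstruction model to a handful of archive stills rather than hand-building a full CG face. This is a hedged illustration, not Framestore’s pipeline; `young_photos` stands in for real scanned stills, and a production system would use a far larger network and far more data.

```python
# Toy version of the idea: train a small model to reproduce archive
# photos of the younger actor, instead of building a full CG face.
# Illustrative only -- not Framestore's actual pipeline.
import torch
import torch.nn as nn

model = nn.Sequential(                     # miniature encoder-decoder
    nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU(),
    nn.Linear(128, 64 * 64), nn.Sigmoid())
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

young_photos = torch.rand(16, 64, 64)      # placeholder for 16 old stills

for step in range(200):
    recon = model(young_photos)            # try to reproduce each still
    loss = loss_fn(recon, young_photos.flatten(1))
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

# Overfitting is fine here: the model only ever needs to render this
# one face, which is why a few shots' worth of photos can be enough.
```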
While development has been “incredibly rapid”, Webber says, it is “a little hard to predict how things will progress”. The free, open-source nature of deepfake software — and the underground community who use it — is holding back commercialisation, he says.
That could be where Silicon Valley comes in. Apple, Google and Facebook, as well as games developers such as Fortnite maker Epic, have been hiring talent from California-based visual effects companies such as Industrial Light & Magic, founded by Star Wars creator George Lucas, and Pixar, the Disney-owned computer animation pioneer.
Last week it emerged that Apple had acquired UK-based iKinema, which specialises in “full body” motion capture for games and films.
That has the visual effects industry speculating about what this might lead to — from upgrades to avatars, such as Apple’s personalised Memoji or Snap’s Bitmoji, to full visual embodiments of digital assistants such as Alexa and Siri.
The tech companies “sit in the middle” between big-budget Hollywood-style effects and the DIY feel of deepfakes, says Steve Caulkin, chief technical officer at Cubic Motion, which works on digital animation for video games, TV and films. “They potentially have the means to create pretty high-end digital humans.”
Combining Silicon Valley’s vast data troves and artificial intelligence expertise with Hollywood visual effects could mean that one day, every smartphone owner has their own private version of Gemini Man’s Junior — a realistic avatar to represent them in the digital world.
“What I’m excited about,” Smith joked at a preview screening of the film earlier in 2019, “is there’s a completely digital 23-year-old version of myself I can make movies with now.”