What do we know about Sora, OpenAI’s new text-to-video generator?

2024-02-19

THE MAKER of ChatGPT is now diving into AI-generated video.

Meet Sora — OpenAI’s new text-to-video generator. The tool, which the San Francisco company unveiled Thursday, uses generative artificial intelligence to instantly create short videos based on written commands.

Sora isn’t the first to demonstrate this kind of technology. But industry analysts point to the high quality of the tool’s videos displayed so far, and note that its introduction marks a significant leap for both OpenAI and the future of text-to-video generation overall.

Still, as with all things in the rapidly growing AI space today, such technology also raises fears about potential ethical and societal implications. Here’s what you need to know.

WHAT IS SORA? CAN I USE IT YET?

Sora is a text-to-video generator — creating videos up to 60 seconds long based on written prompts using generative AI. The model can also generate video from an existing still image.

Generative AI is a branch of AI that can create something new. Examples include chatbots, like OpenAI’s ChatGPT, and image generators such as DALL-E and Midjourney. Getting an AI system to generate videos is newer and more challenging, but relies on some of the same technology.

Sora isn’t available for public use yet (OpenAI says it’s engaging with policymakers and artists before officially releasing the tool) and there’s a lot we still don’t know. But since Thursday’s announcement, the company has shared a handful of examples of Sora-generated videos to show off what it can do.

OpenAI CEO Sam Altman also took to X, the platform formerly known as Twitter, to ask social media users to send in prompt ideas. He later shared realistically detailed videos that responded to prompts like “two golden retrievers podcasting on top of a mountain” and “a bicycle race on ocean with different animals as athletes riding the bicycles with drone camera view.”

While Sora-generated videos can depict complex, incredibly detailed scenes, OpenAI notes that there are still some weaknesses, including trouble with some spatial and cause-and-effect elements. For example, OpenAI adds on its website, “a person might take a bite out of a cookie, but afterwards, the cookie may not have a bite mark.”

ARE THERE OTHER AI-GENERATED VIDEO TOOLS?

OpenAI’s Sora isn’t the first of its kind. Google, Meta and the startup Runway ML are among companies that have demonstrated similar technology.

Still, industry analysts stress the apparent quality and impressive length of Sora videos shared so far. Fred Havemeyer, head of US AI and software research at Macquarie, said that Sora’s launch marks a big step forward for the industry.

“Not only can you do longer videos, I understand up to 60 seconds, but also the videos being created look more normal and seem to actually respect physics and the real world more,” Havemeyer said. “You’re not getting as many ‘uncanny valley’ videos or fragments on the video feeds that look ... unnatural.”

While there has been “tremendous progress” in AI-generated video over the last year — including Stable Video Diffusion’s introduction last November — Forrester senior analyst Rowan Curran said such videos have required more “stitching together” for character and scene consistency.

The consistency and length of Sora’s videos, however, represent “new opportunities for creatives to incorporate elements of AI-generated video into more traditional content, and now even to generate full-blown narrative videos from one or a few prompts,” Curran told The Associated Press via email Friday.

WHAT ARE THE POTENTIAL RISKS?

Although Sora’s abilities have astounded observers since Thursday’s launch, anxiety over the ethical and societal implications of AI-generated video remains.

Havemeyer points to the substantial risks in 2024’s potentially fraught election cycle, for example. Having a “potentially magical” way to generate videos that may look and sound realistic presents a number of issues within politics and beyond, he added — pointing to fraud, propaganda and misinformation concerns.

“The negative externalities of generative AI will be a critical topic for debate in 2024,” Havemeyer said. “It’s a substantial issue that every business and every person will need to face this year.”

Tech companies are still calling the shots when it comes to governing AI and its risks as governments around the world work to catch up. In December, the European Union reached a deal on the world’s first comprehensive AI rules, but the act won’t take effect until two years after final approval.

On Thursday, OpenAI said it was taking important safety steps before making Sora widely available.

“We are working with red teamers — domain experts in areas like misinformation, hateful content, and bias — who will be adversarially testing the model,” the company wrote. “We’re also building tools to help detect misleading content such as a detection classifier that can tell when a video was generated by Sora.”

OpenAI’s Vice President of Global Affairs Anna Makanju reiterated this when speaking Friday at the Munich Security Conference, where OpenAI and 19 other technology companies pledged to voluntarily work together to combat AI-generated election deepfakes. She noted the company was releasing Sora “in a manner that is quite cautious.”

At the same time, OpenAI has revealed limited information about how Sora was built. OpenAI’s technical report did not disclose what imagery and video sources were used to train Sora — and the company did not immediately respond to a request for further comment Friday.

The Sora release also arrives amid the backdrop of lawsuits against OpenAI and its business partner Microsoft by some authors and The New York Times over the use of copyrighted works of writing to train ChatGPT. OpenAI pays an undisclosed fee to the AP to license its text news archive.

Photo (AP): The OpenAI logo is displayed on a cell phone with an image on a computer monitor generated by ChatGPT’s DALL-E text-to-image model.
