Unveiling the Strengths and Limitations of AI-Generated Video: Insights from the Creators of a Sora-Powered Short
OpenAI’s video generation tool Sora surprised the AI community in February with strikingly natural and realistic output, far surpassing competing models. However, the debut left out a great deal about how Sora actually works in practice, details that have now been shared by a filmmaker granted early access to use Sora in the creation of a short film.
Shy Kids, a digital production group based in Toronto, was selected by OpenAI as one of a few groups producing short films primarily for OpenAI’s promotional purposes. Still, the team was given a good deal of creative freedom in making “air head.” In an interview with the visual effects news outlet fxguide, post-production artist Patrick Cederberg described his experience of using Sora in his work.
An important takeaway is this: while OpenAI’s post featuring the short films implies they were produced more or less directly by Sora, they were in fact professional productions, complete with extensive storyboarding, editing, colour correction, and post work such as rotoscoping and VFX. Much as Apple’s “shot on iPhone” line omits the professional setup, lighting, and post-production work behind those ads, the Sora post only talks about what the model lets people do, not how they actually did it.
Cederberg’s interview is intriguing and fairly non-technical, so if you are keen to learn more, wander over to fxguide and give it a read. But a few enlightening facts about working with Sora stand out, suggesting that, though impressive, the model may be less revolutionary than initially perceived.
Control remains a highly coveted yet elusive thing at this juncture. As Cederberg put it: “The best solution we have found is to be extremely specific in our prompts. Describing character attire and the nature of the balloon allowed us to maintain some consistency, as there are no inherent mechanics for ensuring continuity from one shot to the next.”
Tasks that filmmakers find straightforward, such as choosing the colour of a character’s outfit, require intricate workarounds in a generative system, because each shot is generated independently of the others. That could change, of course, but it undoubtedly adds a layer of complexity at the moment.
Sora’s outputs also had to be scrutinized for unwanted elements. Cederberg noted recurring issues with the generated footage, which often included an unwanted face on the balloon serving as the protagonist’s head, or a stray string dangling down the front. When prompting couldn’t eliminate these, they had to be removed in post-production, another laborious task.
Achieving exact timings and character or camera movements poses another challenge: “There’s a slight degree of control regarding when actions occur during the generation but it’s far from precise…it’s somewhat of a gamble,” Cederberg admitted.
For instance, timing a gesture like a wave is a very approximate, suggestion-driven process, unlike in manual animation. And a shot such as a pan upward on the character’s body might not convey what the filmmaker intended; in one case, the team rendered a shot composed in portrait orientation and performed a crop pan in post-production. The generated clips were also often in slow motion for no discernible reason.
[Image: The shot as it initially came out of Sora, and how it eventually appeared in the short. Credit: Shy Kids]
Interestingly, the standard language of filmmaking, terms like “panning right” or “tracking shot,” was generally inconsistent in its effects, according to Cederberg, which the team found quite surprising.
“Before they reached out to artists to experiment with the tool, the researchers hadn’t been thinking from a filmmaker’s perspective,” he commented.
The team ran hundreds of generations, each 10 to 20 seconds long, but only a handful were actually used. Cederberg put the ratio at approximately 300:1, a figure that would likely catch many off guard when compared with a conventional shoot.
For anyone interested, there’s also a brief behind-the-scenes video covering some of the challenges the team faced. As with a lot of AI-adjacent content, the comments are largely critical of the entire project, though not quite as harsh as those aimed at the AI-assisted advertisement that was recently thrashed.
Another intriguing aspect relates to copyright: ask Sora to generate a “Star Wars” clip and it will decline. Attempt to bypass the restriction with, say, “robed man with a laser sword on a retro-futuristic spaceship,” and it will decline that too, somehow recognising your intention. It even rejected an “Aronofsky type shot” and a “Hitchcock zoom.”
Although the refusal is logical in many ways, it does beg the question: If Sora is aware of these genres, does it indicate that the system was trained using similar content to help it identify potential copyright infringements? OpenAI, which is quite secretive about its training data (to the point of absurdity, as with the interview CTO Mira Murati had with Joanna Stern), will most likely keep us in the dark.
As for Sora and its application in filmmaking: it clearly stands as a potent and useful tool in its place, but that place is not “creating films from scratch.” As a competitor once notably said, “that will follow.”