Meta has unveiled Movie Gen, a suite of AI models for creating and editing video, audio, and images, positioning it as a tool for personal creativity that the company says outperforms existing video-synthesis systems.
On Friday, Meta previewed Movie Gen, a suite of advanced AI models capable of generating and editing video, audio, and images. The system can produce realistic videos from a single photograph of a person, and the company says the models significantly outperform other current video-synthesis technologies in human evaluations.
Meta has not specified when or how these capabilities will become publicly available, but it positions Movie Gen as a tool to boost personal creativity rather than to supplant traditional artists and animators. The company envisions applications such as simplifying the creation and editing of “day in the life” videos for social media and producing personalized animated birthday messages.
Building upon Meta’s earlier forays into video synthesis with projects like 2022’s Make-A-Video generator and the Emu image-synthesis model, Movie Gen uses text prompts to produce bespoke videos complete with sound. The system not only generates videos but also lets users edit existing ones and transform static images of individuals into dynamic, realistic videos.
In benchmarking tests with human participants, Meta reports that Movie Gen surpasses competitors including Google’s Veo, released in May, as well as OpenAI’s Sora, Runway Gen-3, and Kling, a Chinese video model. Movie Gen generates 1080p high-definition videos up to 16 seconds long at 16 frames per second from text descriptions or image inputs. Meta asserts that the model adeptly handles challenging elements such as object motion, interactions between subjects and objects, and varied camera movements.
An example of Movie Gen’s capabilities includes a video generated from the prompt: “A ghost in a white bedsheet faces a mirror. The ghost’s reflection can be seen in the mirror. The ghost is in a dusty attic, filled with old beams, cloth-covered furniture. The attic is reflected in the mirror. The light is cool and natural. The ghost dances in front of the mirror.” This highlights the model’s ability to synthesize complex visual scenes.
Despite the promising advancements, the quality of outputs from AI video generators like Movie Gen can vary and depends heavily on the dataset of example videos used during training. Moreover, achieving coherent results often requires multiple attempts, and publicly showcased samples are often cherry-picked and may not reflect the median output quality.
Meta’s ongoing exploration in video synthesis reflects a broader trend in AI-driven creativity tools, positioning Movie Gen as a noteworthy addition to the continually evolving landscape of digital multimedia creation.
Source: Noah Wire Services