After successfully launching text-to-image models, a controversial text-to-music model, and a largely unnoticed text generation model, Stability AI has announced the release of Stable Video Diffusion, a text-to-video tool that aims to carve out a share of the nascent generative video space.
“Stable Video Diffusion [is] a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation,” Stability AI explains in the model's research paper. In its official announcement, the company framed the release as part of its commitment to amplifying human intelligence.
This adaptability, coupled with open-source technology, paves the way for many applications in advertising, education, and entertainment. Stable Video Diffusion, currently available as a research preview, is capable of “outperforming image-based methods at a fraction of their compute budget,” the researchers said.
Stable Video Diffusion's technical capabilities are impressive. “Human preference studies reveal that the resulting model outperforms state-of-the-art image-to-video models,” the research paper states. Stability AI is clearly confident in the model's ability to turn static images into dynamic video content, saying that it beats closed models in user preference studies.
Stability AI has developed two models under the Stable Video Diffusion umbrella: SVD and SVD-XT. The SVD model converts still images into 14-frame videos at 576×1024 resolution, while SVD-XT uses the same architecture but extends the output to 25 frames. Both models can generate video at frame rates between three and 30 frames per second, placing them at the cutting edge of open-source text-to-video technology.
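For a sense of scale, those frame counts translate into very short clips. Below is a minimal Python sketch of the arithmetic; the helper name clip_duration_seconds is purely illustrative and is not part of any Stability AI tool.

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return how long a clip of num_frames lasts when played back at fps."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


# SVD's 14-frame output at both ends of the three-to-30 fps range.
for fps in (3, 30):
    print(f"14 frames at {fps} fps: {clip_duration_seconds(14, fps):.2f} s")
```

At three frames per second, a 14-frame clip lasts a little under five seconds; at 30 frames per second, it lasts under half a second.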
In the fast-growing field of AI video generation, Stable Video Diffusion competes with models developed by Pika Labs, Runway, and Meta. Meta's recently announced Emu Video, which offers similar text-to-video capabilities, shows great potential with its distinctive approach to image editing and video creation, although it is currently limited to videos at 512×512 pixel resolution.
Despite its technological achievements, Stability AI faces challenges, including ethical questions about the use of copyrighted data in AI training. The company emphasizes that the model is “not intended for real-world or commercial applications at this stage,” and says it is focusing on refining it based on community feedback and safety concerns.
Judging by the success of SD 1.5 and SDXL, the most powerful open-source models for image generation, this new venture into video generation hints at a future in which the lines between imagination and reality are not only blurred, but beautifully redrawn.
Edited by Ryan Ozawa.