Zeroscope: An Innovative Free and Open-Source Text-to-Video Framework
In Brief
Zeroscope stands out as a groundbreaking open-source text-to-video framework that enhances the original Modelscope’s capabilities by delivering superior resolution and adopting a 16:9 aspect ratio.
It comes in two distinct versions and was trained with offset noise to better capture the data distribution, enabling an array of lifelike video outputs.
A fresh competitor has stepped onto the scene of text-to-video innovation, completely free and open source. Zeroscope, a challenger to Runway's Gen-2, seeks to transform textual input into engaging visual narratives.

Building on the foundation laid by Modelscope, Zeroscope brings noteworthy advancements. With enhanced resolution and a more cinematic 16:9 aspect ratio, Zeroscope elevates video creation to a more polished and professional level, all without the downside of watermarked content.
The model is available in two configurations. Zeroscope_v2 576w is tailored for fast content generation at a resolution of 576x320 pixels and requires 7.9 GB of VRAM, making it feasible on many standard graphics cards. Zeroscope_v2 XL excels at upscaling, delivering high-definition video at 1024x576.

Zeroscope's development centered on adding offset noise to thousands of video clips with labeled frames during training. This strategy improves the model's grasp of the data distribution, enabling it to produce a wider variety of realistic videos that accurately reflect their text descriptions.
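The offset-noise idea mentioned above can be sketched in a few lines: instead of drawing pure per-pixel Gaussian noise during diffusion training, a small constant offset is added per sample and per channel, which lets the model learn overall brightness shifts it would otherwise struggle with. This is a minimal NumPy illustration under assumptions of my own (the function name, the 0.1 strength, and the latent shape are hypothetical, not Zeroscope's actual training code):

```python
import numpy as np

def offset_noise(shape, offset_strength=0.1, rng=None):
    """Gaussian noise plus a per-sample, per-channel constant offset.

    `shape` is assumed to be (batch, channels, height, width), as for the
    latent tensors of a video or image diffusion model.
    """
    rng = np.random.default_rng() if rng is None else rng
    base = rng.standard_normal(shape)                 # ordinary i.i.d. noise
    offset = rng.standard_normal(shape[:2] + (1, 1))  # one value per channel
    # The offset broadcasts across the spatial dims, shifting each channel
    # of each sample by a constant amount.
    return base + offset_strength * offset
```

Because the offset is constant over height and width, each channel's mean is nudged away from zero, which is the point of the technique.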
The developer, known as "Cerspense," views Zeroscope as a direct challenge to Runway ML's Gen-2, the leading commercial text-to-video model. By refining its capabilities and eliminating watermarks, Zeroscope presents itself as a compelling open-source alternative, completely free to the public. While Runway's Gen-2 continues to lead among commercial options, Zeroscope represents the first high-caliber open-source model of its kind.
Disclaimer
In line with the Trust Project guidelines.

Damir leads the team as product manager and editor at Metaverse Post, covering AI/ML, AGI, LLMs, the Metaverse, and Web3. His writing reaches an audience of more than a million readers each month. He has a decade of experience in SEO and digital marketing, and his insights have been featured in Mashable, Wired, Cointelegraph, The New Yorker, Inside.com, Entrepreneur, BeInCrypto, and other publications. As a digital nomad, he travels between the UAE, Turkey, Russia, and the CIS. He holds a bachelor's degree in physics, which he credits for the critical thinking skills needed to navigate the ever-changing internet landscape.