Another video-generating AI has emerged in China to challenge OpenAI’s ‘Sora’. This time it comes from Zhipu AI, one of China’s leading AI startups.
The South China Morning Post (SCMP) reported on the 27th that Zhipu had released the ‘Ying’ model, which can generate a 6-second video in 30 seconds from text and image prompts.
According to the report, Ying offers various options, including styles such as 3D animation, cinematic, and oil painting, and emotional themes such as tension, liveliness, and loneliness.
In particular, Zhipu said that Ying can be used immediately and without restriction on its official website and mobile app. Businesses and developers can access it through the API. However, the company said that the free version may have long waiting times when usage is high.
The model was explained to be based on the company’s own technologies, such as the video models ‘CogVideo’ and ‘Relay Diffusion’, which it has been developing since 2021. The base model is ‘CogVideoX’, an upgraded version of CogVideo.
To address the content consistency problem, the company developed a 3D VAE (Variational Autoencoder) architecture that compresses video data to 2% of its original size, significantly reducing training cost and time.
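As a rough illustration of what a figure like 2% implies, the sketch below computes the size of a latent video relative to the raw video. The downsampling factors and channel counts are assumptions chosen for illustration (4x in time, 8x8 in space, 3 RGB channels in, 16 latent channels out); the article itself does not state them.

```python
def latent_fraction(in_channels, latent_channels, t_factor, h_factor, w_factor):
    """Return the latent size as a fraction of the original video size.

    All arguments are illustrative assumptions, not figures from the article.
    """
    original_per_block = in_channels * t_factor * h_factor * w_factor
    return latent_channels / original_per_block


# Assumed configuration: RGB input, 16 latent channels,
# 4x temporal and 8x8 spatial downsampling.
fraction = latent_fraction(in_channels=3, latent_channels=16,
                           t_factor=4, h_factor=8, w_factor=8)
print(f"latent size is {fraction:.2%} of the original")
```

Under these assumed settings the latent is about 1/48 of the original, which is in the same range as the 2% quoted above.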
In particular, CEO Zhang Peng said, “We were inspired by the ‘diffusion transformer (DiT)’ architecture used by OpenAI’s Sora,” adding, “The inference speed has been improved, enabling faster video generation.”
It is a transformer architecture that fuses the text, time, and space dimensions into a single 3D representation, and it aligns the text and video modalities by discarding the conventional cross-attention module. The company also explained that interaction between the modalities is optimized through a full attention mechanism.
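The idea of replacing cross-attention with full attention can be sketched as follows: text tokens and flattened video patch tokens (time × height × width) are concatenated into one sequence, and a single self-attention pass lets every token attend to every other. This is a minimal NumPy sketch, not Zhipu’s implementation; the function names, the single attention head, and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def full_attention(text_tokens, video_tokens, w_q, w_k, w_v):
    """Single-head self-attention over the joint text + video sequence.

    No separate cross-attention module is used: both modalities share
    one sequence, so text-to-video and video-to-text interactions are
    computed in the same attention matrix.
    """
    tokens = np.concatenate([text_tokens, video_tokens], axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    out = weights @ v
    n_text = text_tokens.shape[0]
    return out[:n_text], out[n_text:], weights


# Hypothetical sizes: 4 text tokens, a 2-frame video of 3x3 patches.
rng = np.random.default_rng(0)
d = 8
frames, height, width = 2, 3, 3
text = rng.normal(size=(4, d))
video = rng.normal(size=(frames * height * width, d))  # flattened 3D patches
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

text_out, video_out, weights = full_attention(text, video, w_q, w_k, w_v)
print(text_out.shape, video_out.shape, weights.shape)
```

Each row of `weights` spans both text and video positions, which is what lets the two modalities interact without a dedicated cross-attention layer.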
He added that the technology is being improved to generate longer videos.
The videos released on YouTube and X (Twitter) do not appear to be of particularly high quality compared with recently introduced tools. Nevertheless, they are receiving positive responses for their immediate usability and short generation time.
Zhipu is a startup backed by Meituan, China’s largest food delivery company. Last year it released ‘ChatGLM’, which became a hot topic after being evaluated as superior to chatbots from big tech firms such as Baidu, ByteDance, and Tencent. It also earned the nickname ‘China’s OpenAI’ after receiving 460 billion won in investment from Tencent, Alibaba, and others.
Meanwhile, Kuaishou, which previously released the popular video AI ‘Kling’, announced on the 24th that it would launch a paid service.
Under the plan, free users can create six videos per day, while paid plans allow users to create up to 60 and 800 videos per day for 396 yuan (about 75,500 won) and 3,996 yuan (about 762,000 won) per year, respectively.
Reporter Im Dae-jun ydj@aitimes.com