Meta has released ‘VFusion3D’, a model that generates high-quality 3D content from a single image or text description. Its distinguishing feature is that it is trained on synthetic data generated by an artificial intelligence (AI) video model, which addresses the shortage of 3D training data.
On the 9th (local time), VentureBeat reported on ‘VFusion3D’, a 3D generation model that researchers from Meta and the University of Oxford built using a fine-tuned video generation AI model. The paper has been posted on arXiv.
VFusion3D presents a novel method for constructing scalable 3D generative models by leveraging pre-trained video diffusion models.
The main obstacle to developing 3D generative foundation models is the scarcity of 3D data. Unlike text, images, and videos, 3D data is difficult to acquire.

To address this problem, the researchers used a video diffusion model trained on an enormous amount of text, images, and videos as a source of 3D data.
They fine-tuned an existing video diffusion model to generate multi-view video sequences showing an object from various angles, and trained the VFusion3D model on 3 million of those synthetic sequences.
As a result, the model can generate a 3D object from a single image in seconds. In human preference tests against state-of-the-art 3D generation models, evaluators also preferred its results about 90% of the time.
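To make the idea of multi-view sequences concrete, the sketch below computes camera positions evenly spaced on a circular orbit around an object, which is one common way to define the viewpoints of such a sequence. It is only an illustration under stated assumptions: the function name `orbit_camera_positions`, the radius, and the elevation angle are hypothetical choices and are not taken from the VFusion3D paper or code.

```python
import math


def orbit_camera_positions(num_views, radius=2.0, elevation_deg=15.0):
    """Return camera positions evenly spaced on a circular orbit around the origin.

    All parameters are illustrative assumptions, not values from VFusion3D.
    """
    if num_views <= 0:
        raise ValueError("num_views must be positive")

    elevation = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        # Spread the azimuth angles uniformly over a full 360-degree turn.
        azimuth = 2.0 * math.pi * i / num_views
        # Convert spherical coordinates (radius, azimuth, elevation) to x, y, z.
        x = radius * math.cos(elevation) * math.cos(azimuth)
        y = radius * math.cos(elevation) * math.sin(azimuth)
        z = radius * math.sin(elevation)
        positions.append((x, y, z))
    return positions


# Example: 16 viewpoints around an object placed at the origin.
views = orbit_camera_positions(num_views=16)
```

Each returned position lies at the same distance from the object, so frames rendered or generated from these viewpoints show it from all sides at a constant scale.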

The research team emphasized the ‘scalability’ of VFusion3D.
“As more powerful video AI models are developed and more 3D data becomes available for fine-tuning, the performance of VFusion3D will continue to improve rapidly,” the team said.
Meanwhile, generating 3D assets from 2D images or text prompts is a field that many companies have entered. Demand is high in areas such as game production and e-commerce, so the technology can generate revenue quickly.
In March, NVIDIA introduced ‘Latte3D’, which can generate 3D objects and animals in near real time, in about one second, from a simple text prompt. Stability AI also released an AI model capable of producing 360-degree 3D rendering videos.
Companies have also recently been competing to speed up 3D object creation.
Last month, Meta introduced ‘3D Gen’, an AI tool that generates high-quality 3D assets from text descriptions in just one minute. Then, earlier this month, Stability AI unveiled ‘Stable Fast 3D (SF3D)’, a model that generates 3D assets in just 0.5 seconds.
Reporter Park Chan cpark@aitimes.com