Autonomous vehicle (AV) stacks are evolving from a hierarchy of discrete building blocks to end-to-end architectures built on foundation models. This transition demands an AV data flywheel to generate synthetic data, augment sensor datasets, address coverage gaps, and, ultimately, build a validation toolchain to safely develop and deploy autonomous vehicles.
In this blog post, we highlight the latest NVIDIA Omniverse and NVIDIA Cosmos workflows, models, and datasets that developers can use to kickstart data pipelines.
Specifically, this post will cover:
- Processing massive datasets for AV testing and validation
- Neural reconstruction for AV simulation
- Diversifying simulation data with world models
- Integrating neural reconstruction and world models into simulation pipelines
Processing massive datasets for AV testing and validation
The reasoning vision-language-action (VLA) models that power the AV stack require massive amounts of driving data for both pre-training and post-training. As the stack matures, the data must become more targeted to address edge cases and weaknesses.
Data collected in the real world is at the core of these training and post-training datasets. To help kickstart development, NVIDIA has released one of the world’s largest multi-modal AV datasets, featuring over 1,700 hours of camera, radar, and lidar data in 20-second clips, covering urban driving scenarios across more than 2,500 cities and 25 countries. The scenes cover a wide range of traffic densities, weather conditions, and times of day, along with infrastructure elements such as tunnels, bridges, roundabouts, railway crossings, toll booths, inclines, and more.
This data, which was used to develop NVIDIA’s internal reasoning VLA for autonomous driving, can be used for training and post-training, as well as scaled into larger datasets using the synthetic data generation workflows described in this blog.
Once data has been collected and generated, it must be processed into useful clips. Using Cosmos Reason, an open reasoning vision-language model (VLM), Cosmos Curator quickly filters, annotates, and deduplicates large amounts of sensor data. Cosmos Reason is available as an NVIDIA NIM, which offers secure, easy-to-use microservices for deploying high-performance generative AI across any environment.
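As a minimal sketch of the annotation step, the snippet below sends one sampled frame from a clip to a Cosmos Reason NIM and asks for a scene description that can drive filtering and tagging. It assumes a locally deployed NIM exposing an OpenAI-compatible chat completions endpoint; the port, model name, and payload shape are illustrative and should be checked against the NIM documentation.

```python
import base64
import requests

# Illustrative endpoint for a locally deployed Cosmos Reason NIM
# (NIMs expose an OpenAI-compatible API; adjust host, port, and model
# name to match your deployment).
NIM_URL = "http://localhost:8000/v1/chat/completions"


def annotate_frame(jpeg_path: str) -> str:
    """Ask the VLM to describe a sampled frame so clips can be filtered and tagged."""
    with open(jpeg_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()

    payload = {
        "model": "nvidia/cosmos-reason1-7b",  # illustrative model name
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe the driving scene: weather, time of day, "
                         "road type, and any notable ego or actor maneuvers."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
        "max_tokens": 256,
    }
    resp = requests.post(NIM_URL, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


print(annotate_frame("clip_000123/frame_000.jpg"))
```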
You can then create datasets containing targeted ego or actor behavior for specific post-training tasks, such as left turns at busy intersections, in a matter of seconds using Cosmos Dataset Search (CDS), a GPU-accelerated vector search workflow that quickly embeds and searches video datasets.
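CDS ships as a ready-made workflow, but the embed-and-search pattern it accelerates looks roughly like the sketch below, which uses an open CLIP-style model to embed one representative frame per clip and rank clips against a text query. The model choice and single-frame sampling are illustrative of the pattern, not how CDS itself is implemented.

```python
from pathlib import Path

from PIL import Image
from sentence_transformers import SentenceTransformer, util

# Open CLIP-style model that embeds images and text into a shared space
# (runs on GPU automatically when one is available).
model = SentenceTransformer("clip-ViT-B-32")

# Represent each clip by one sampled frame (illustrative; real pipelines
# sample multiple frames or use a dedicated video embedding model).
frames = sorted(Path("clips").glob("*/frame_000.jpg"))
clip_embs = model.encode([Image.open(p) for p in frames], convert_to_tensor=True)

# Embed the behavior query and rank clips by cosine similarity.
query_emb = model.encode(
    "ego vehicle making a left turn at a busy intersection",
    convert_to_tensor=True)
scores = util.cos_sim(query_emb, clip_embs)[0]

# Print the ten best-matching clips for review or export.
for score, path in sorted(zip(scores.tolist(), frames), reverse=True)[:10]:
    print(f"{score:.3f}  {path.parent.name}")
```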
Neural reconstruction for AV simulation
Using advanced neural reconstruction and rendering techniques, developers can turn real-world datasets into interactive, high-fidelity simulations.
NVIDIA Omniverse NuRec
NVIDIA Omniverse NuRec is a set of technologies for neural reconstruction and rendering. It enables developers to use their existing fleet data to reconstruct high-fidelity digital twins, simulate new events, and render sensor data from novel points of view. NuRec’s libraries, models, and tools enable developers to:
- Prepare and process sensor data for reconstruction
- Reconstruct sensor data into 3D representations
- Perform Gaussian-based rendering (illustrated in the toy sketch below)
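To give intuition for that last step, the toy function below performs the front-to-back alpha compositing at the heart of Gaussian-based rendering: depth-sorted Gaussians each contribute color weighted by their opacity and by the transmittance left over from those in front. This is a generic sketch of the technique, not NuRec’s implementation.

```python
import numpy as np


def composite_pixel(colors: np.ndarray, alphas: np.ndarray,
                    depths: np.ndarray) -> np.ndarray:
    """Front-to-back alpha compositing of Gaussians overlapping one pixel.

    colors: (N, 3) per-Gaussian RGB; alphas: (N,) opacity after evaluating
    each projected 2D Gaussian at the pixel; depths: (N,) view-space depths.
    """
    order = np.argsort(depths)      # nearest Gaussian first
    pixel = np.zeros(3)
    transmittance = 1.0             # fraction of light not yet absorbed
    for i in order:
        weight = alphas[i] * transmittance
        pixel += weight * colors[i]
        transmittance *= 1.0 - alphas[i]
        if transmittance < 1e-4:    # early exit once effectively opaque
            break
    return pixel


# Two overlapping Gaussians: a near, semi-transparent red over a far blue.
print(composite_pixel(np.array([[1, 0, 0], [0, 0, 1]], dtype=float),
                      np.array([0.6, 0.9]),
                      np.array([1.0, 2.0])))
```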
NuRec also includes generative AI models to enhance the quality of reconstructions for more robust simulation. NuRec Fixer is a transformer-based model, post-trained on AV datasets, that inpaints and resolves reconstruction artifacts. Developers can run Fixer during reconstruction or as a post-processing step during neural rendering to repair such artifacts. Fixer is based on the Difix3D+ paper released at CVPR 2025, and with it, novel view synthesis from reconstructed scenes becomes practical for open- and closed-loop simulation workflows.
Diversify with world models
We can further scale data and amplify variation in simulation using world models, which allow intelligent systems to simulate, predict, and interact with their environments.
NVIDIA Cosmos Predict and Transfer
The Cosmos Predict world foundation model generates new video states from text, image, or video inputs for robotics and autonomous vehicle simulation.
Cosmos Transfer is a multi-controlnet model built on Cosmos Predict that produces high-quality world simulations conditioned on spatial control inputs carrying details like road layout and object position and orientation. Users can prompt Cosmos Transfer to generate diverse weather, lighting, and terrain variations for any given scene.
The latest model releases, Cosmos Predict 2.5 and Cosmos Transfer 2.5, can generate up to 30 seconds of new video with camera-controllable multi-view outputs and better adherence to control signals to meet AV simulation needs.
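As a hedged sketch of what prompting Cosmos Transfer for variation can look like, the spec below pairs one set of spatial control inputs with several weather and lighting prompts to fan a single scene out into many conditions. The field names here are hypothetical placeholders; map them to the actual schema documented in the Cosmos Cookbook and the model’s reference scripts.

```python
# Illustrative Cosmos Transfer job spec: one controlled scene, many conditions.
# All field names are hypothetical; consult the Cosmos Cookbook for the
# real inference configuration.
base_controls = {
    "video": "scenes/intersection_042/rgb.mp4",          # source clip
    "depth": "scenes/intersection_042/depth.mp4",        # spatial control input
    "segmentation": "scenes/intersection_042/seg.mp4",   # spatial control input
    "control_weight": 0.8,  # how strictly outputs follow the control signals
}

prompts = [
    "the same street in heavy rain at dusk, wet reflective asphalt",
    "the same street in dense fog at dawn, low visibility",
    "the same street under bright midday sun with hard shadows",
    "the same street at night under sodium streetlights, light snow",
]

# One generation job per prompt, sharing the same spatial controls so the
# road layout and actor placement stay fixed while appearance varies.
jobs = [{**base_controls, "prompt": p, "seed": i} for i, p in enumerate(prompts)]
for job in jobs:
    print(job["prompt"])  # submit each job to your Cosmos Transfer deployment here
```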
Dive deeper with the Cosmos white paper for technical insights, and jumpstart your journey with the Cosmos Cookbook, a guide for building, customizing, and deploying Cosmos for autonomous systems in your own use cases.
Integrating neural reconstruction and world models into simulation pipelines
These models and workflows have been integrated into open-source and enterprise toolchains for easy adoption in existing simulation pipelines.
CARLA open source AV simulator
CARLA is one of the world’s most popular open-source simulation platforms, with more than 150,000 active developers, and serves as a testbed for AV research and development. NVIDIA is partnering with CARLA to integrate the latest NuRec rendering APIs and the Cosmos Transfer world foundation model. This enables developers to generate sensor data from Gaussian representations with ray tracing and amplify diversity with Cosmos WFMs.
Below is an example of a scene where CARLA orchestrates the motion of all agents, including the ego vehicle, and renders sensor data from the ego point of view using NuRec. By adding reconstructed scenes and simulating new events with CARLA’s APIs and traffic model integrations, we can create useful corner-case datasets.
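The CARLA side of that setup is standard Python API usage: connect to a running simulator, spawn an ego vehicle under autopilot, and attach a camera whose frames, in the new integration, can come from a NuRec reconstruction. The sketch below shows only the orchestration half with the stock CARLA API; enabling NuRec-backed rendering follows the integration’s own setup steps.

```python
import carla

# Connect to a running CARLA server.
client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# Spawn an ego vehicle and hand it to the traffic manager's autopilot.
blueprints = world.get_blueprint_library()
ego_bp = blueprints.filter("vehicle.*model3*")[0]
ego = world.spawn_actor(ego_bp, world.get_map().get_spawn_points()[0])
ego.set_autopilot(True)

# Attach an RGB camera at roughly windshield height on the ego vehicle.
cam_bp = blueprints.find("sensor.camera.rgb")
cam_bp.set_attribute("image_size_x", "1920")
cam_bp.set_attribute("image_size_y", "1080")
camera = world.spawn_actor(
    cam_bp, carla.Transform(carla.Location(x=1.5, z=2.4)), attach_to=ego)

# Stream frames to disk; with the NuRec integration these views are rendered
# from the Gaussian reconstruction rather than CARLA's built-in rasterizer.
camera.listen(lambda image: image.save_to_disk(f"out/{image.frame:06d}.png"))
```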
Cosmos Transfer integration with CARLA can then create variations of this scene for both training and testing, as seen below:
Novel view generation with NVIDIA Omniverse NuRec and Cosmos Transfer
When rendering a reconstructed scene from a novel view, there may be gaps in the reconstruction, which can lead to artifacts.
Developers can try out this pipeline using over 900 reconstructed scenes available in the NVIDIA Physical AI Open Datasets. With this new version of CARLA, developers can author completely new trajectories, reposition the camera, and simulate drives with this starter pack of reconstructed data.
CARLA developers using behavior-directable agent models like Imagining The Road Ahead (ITRA) from Inverted.AI, and AV developers using the Foretellix Foretify data-automation toolchain, pre-integrated with CARLA and NVIDIA Cosmos, can generate realistic variations in scenarios and behaviors and scale up behavioral diversity.
Voxel51 AV Simulation Data Pipeline
FiftyOne, from Voxel51, is a visual and multimodal AI data engine that enables physical AI developers to curate, annotate, and evaluate large datasets and models for training and testing. It integrates Cosmos Dataset Search (CDS), NuRec, and Cosmos Transfer to create high-quality, simulation-ready datasets, enhancing each stage of the simulation pipeline:
- CDS allows users to perform fast, high-recall semantic searches over petabyte-scale video data to create targeted datasets for various downstream needs (see the search sketch after this list).
- NuRec integration allows users to convert raw data streams into validated datasets in NuRec format and reconstruct scenes. Developers can ingest their datasets, evaluate the quality of their reconstructions, and create 3D digital twins for downstream simulation tasks.
- Cosmos Transfer integration enables users to directly apply style transfer to their data, increasing their datasets’ diversity.
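As a concrete example of the first item, the snippet below uses FiftyOne’s native brain methods to build a CLIP similarity index and pull a targeted view by text query. The CDS integration exposes this same curate-by-search pattern at larger scale, so treat the dataset name here as a placeholder for your own data.

```python
import fiftyone as fo
import fiftyone.brain as fob

# Load an existing FiftyOne dataset of driving frames (placeholder name).
dataset = fo.load_dataset("av-frames")

# Build a text-promptable similarity index with an open CLIP model
# from the FiftyOne model zoo.
fob.compute_similarity(
    dataset,
    model="clip-vit-base32-torch",
    brain_key="clip_sim",
)

# Pull the 50 samples that best match a targeted behavior query.
view = dataset.sort_by_similarity(
    "left turn at a busy intersection", k=50, brain_key="clip_sim")

# Inspect the curated view in the FiftyOne App before exporting it.
session = fo.launch_app(view)
session.wait()
```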
Stop by the Voxel51 booth (#411) at GTC DC to explore the workflow, and see a first-ever demo of this end-to-end data pipeline in the launch webinar on Nov 5, 2025 at 9 AM PT.
Start developing today
Stay up to date by subscribing to NVIDIA news and following NVIDIA Omniverse on Discord and YouTube.
Get started with developer starter kits to quickly develop and enhance your own applications and services.
Join us for Physical AI and Robotics Day at NVIDIA GTC Washington, D.C. on October 29, 2025 as we bring together developers, researchers, and technology leaders to learn how NVIDIA technologies are accelerating the next era of AI.
