How to Scale Data Generation for Physical AI with the NVIDIA Cosmos Cookbook

Building powerful physical AI models requires diverse, controllable, and physically grounded data at scale. Collecting large-scale, diverse real-world datasets for training can be expensive, time-intensive, and dangerous. NVIDIA Cosmos open world foundation models (WFMs) address these challenges by enabling scalable, high-fidelity synthetic data generation for physical AI and the augmentation of existing datasets.

The NVIDIA Cosmos Cookbook is a comprehensive guide for using Cosmos WFMs and tools. It includes step-by-step recipes for inference, curation, post-training, and evaluation. 

For scalable data-generation workflows, the Cookbook includes a wide range of recipes based on NVIDIA Cosmos Transfer, a world-to-world style transfer model. In this blog, we’ll sample Cosmos Transfer recipes to alter video backgrounds, add new environmental conditions to driving data, and generate data for use cases such as robotics navigation and urban traffic scenarios.

Augmenting video data

To scale existing, real datasets, developers often look to generate realistic variations of the same scene by modifying backgrounds, lighting, or object properties without breaking temporal consistency.

The Multi-Control Recipes section in the Cookbook demonstrates how to use various control modalities to perform guided video augmentations with Cosmos Transfer. The accompanying core concepts explain how strategically combining control modalities achieves high-fidelity, structurally consistent video results. Developers can use depth, edge, segmentation, and vis controls, together with text prompts, to precisely adjust video attributes such as background, lighting, object geometry, color, or texture while maintaining the temporal and spatial consistency of specified regions.

This recipe is particularly useful for robotics developers, for whom capturing human gestures (e.g., waving or greeting) across different environments and conditions is expensive and time-consuming.

Control modalities

  • Depth: Maintains 3D realism and spatial consistency by respecting distance and perspective.
  • Segmentation: Used to completely transform objects, people, or backgrounds.
  • Edge: Preserves the original structure, shape, and layout of the video.
  • Vis: By default, applies a smoothing/blur effect, keeping the underlying visual characteristics unchanged.

Technical overview

  • Control fusion: Combines multiple conditioning signals (edge, seg, vis) to balance geometric preservation and photorealistic synthesis.
  • Mask-aware editing: Binary or inverted masks define editable regions, ensuring localized transformations.
  • Parameterization: Each modality’s influence is tuned via control_weight in JSON configs, enabling reproducible control across editing tasks.

Core recipes

1. Background change: Replace the background with a realistic alternative using filtered_edge, seg (mask_inverted), and vis to preserve subject motion.

Figure 1. Background change using Cosmos Transfer

2. Lighting change: Modify illumination conditions (e.g., day to night, indoor to outdoor) using edge + vis.

Figure 2. Lighting change using Cosmos Transfer

3. Color/texture change: Alter surface appearance with pure edge control for stable structure retention. This preserves all other structures as defined by object edges.

Figure 3. Color and texture change using Cosmos Transfer

4. Object change: Transform object class or shape using low-weight edge, high-weight seg (mask), and moderate vis.

Figure 4. Object change using Cosmos Transfer

Example commands

Start with Cosmos Transfer 2.5 here. You’ll find the configurations for all the core recipes used in this tutorial here.
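As an illustration, here is a minimal sketch of what a background-change configuration (recipe 1) could look like, written in the same style as the JSON example shown later in this post for the AV workflow. The file paths, weight values, and the exact key names for filtered_edge and the inverted mask are assumptions for illustration only; refer to the linked configurations for the actual schema and tuned values.

{
    // Illustrative background-change configuration; values and key names are assumptions
    "seed": 1,
    "prompt_path": "assets/prompt_background_ocean.json",   // Prompt describing the new background (placeholder path)
    "video_path": "assets/person_waving_input.mp4",         // Input video (placeholder path)
    "guidance": 3,
    "filtered_edge": {
        "control_weight": 0.3        // Preserves the subject's structure and motion
    },
    "seg": {
        "control_weight": 1.0,
        "mask_inverted": true        // Restricts edits to the region outside the subject mask
    },
    "vis": {
        "control_weight": 0.2        // Light appearance guidance from the source video
    }
}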

Generating new environments for autonomous driving development

Figure 5. Cosmos Transfer output showcasing domain adaptation and synthetic data augmentation in autonomous driving use cases

This recipe collection demonstrates how Cosmos Transfer can be used for domain adaptation and synthetic data augmentation in autonomous vehicle (AV) research. By transforming real-world or simulated driving videos across diverse environmental conditions, developers can create rich datasets for training more robust perception or planning models.

Technical overview

  • Multi-control inference: The pipeline combines four control modalities (depth, edge, seg, and vis), each with tunable control_weight parameters to balance realism, structure, and semantic fidelity.
  • Prompt-conditioned generation: Text prompts define conditions such as “night with bright street lamps,” “winter with heavy snow,” or “sunset with reflective roads.”

Example command for base parameters

{
    // Update the parameter values for control weights, seed, and guidance in the JSON file below
    "seed": 5000,
    "prompt_path": "assets/prompt_av.json",           // Update the prompt within the json file accordingly
    "video_path": "assets/av_car_input.mp4",
    "guidance": 3,
    "depth": {
        "control_weight": 0.4
    },
    "edge": {
        "control_weight": 0.1
    },
    "seg": {
        "control_weight": 0.5
    },
    "vis": {
        "control_weight": 0.1
    }
}
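To produce the range of conditions shown in Figure 5, the same configuration is rerun with a different prompt and seed, and the control weights can be rebalanced depending on how much of the scene’s appearance should change. The variant below is only a sketch with assumed values (for example, a slightly stronger edge weight to hold lane and vehicle geometry under a heavy appearance change such as snow); it is not a tuned setting from the recipe.

{
    // Illustrative variant: same schema, new seed and prompt, rebalanced weights (assumed values)
    "seed": 7000,
    "prompt_path": "assets/prompt_av.json",           // Prompt updated to a new condition, e.g., "winter with heavy snow"
    "video_path": "assets/av_car_input.mp4",
    "guidance": 3,
    "depth": {
        "control_weight": 0.4
    },
    "edge": {
        "control_weight": 0.2
    },
    "seg": {
        "control_weight": 0.4
    },
    "vis": {
        "control_weight": 0.1
    }
}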

More example commands for this workflow can be found here.

Making robots more mobile with Sim2Real data augmentation

Figure 6. A GIF showing the input RGB video and segmentation mask (top) and the photorealistic output from Cosmos Transfer 1 (bottom).

Robotics navigation models often struggle to generalize from simulation to reality due to visual and physical domain gaps. The Sim2Real Data Augmentation recipe demonstrates how Cosmos Transfer improves Sim2Real performance for mobile robots by generating photorealistic, domain-adapted data from simulation.

Technical overview

The pipeline integrates with NVIDIA X-Mobility and Mobility Gen:

  • Mobility Gen: Built on Isaac Sim, it generates high-fidelity datasets with RGB, depth, and segmentation ground truth for wheeled and legged robots.
  • X-Mobility: Learns navigation policies from both on-policy and off-policy data.
  • Cosmos Transfer: Applies multimodal controls (edge: 0.3, seg: 1.0) to vary lighting, materials, and textures while preserving geometry, motion, and annotations.
Figure 7. Visual showing how the model trained with Cosmos-augmented data successfully identifies the transparent obstacle and navigates around it, demonstrating enhanced perception capabilities for challenging transparent objects.

Example command to prepare inputs for Cosmos Transfer

uv run scripts/examples/transfer1/inference-x-mobility/xmob_dataset_to_videos.py data/x_mobility_isaac_sim_nav2_100k data/x_mobility_isaac_sim_nav2_100k_input_videos
uv run scripts/examples/transfer1/inference-x-mobility/xmob_dataset_to_videos.py data/x_mobility_isaac_sim_random_160k data/x_mobility_isaac_sim_random_160k_input_videos
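With the input videos prepared, the Transfer run for this recipe uses the control weights listed above (edge: 0.3, seg: 1.0). A control configuration in the spirit of the JSON example earlier in this post might look like the sketch below; the exact schema used by the Transfer 1 X-Mobility scripts may differ, and the prompt and video paths are placeholders.

{
    // Illustrative Sim2Real augmentation configuration; paths and prompt are placeholders
    "seed": 0,
    "prompt_path": "assets/prompt_warehouse_variation.json",   // Prompt varying lighting, materials, and textures
    "video_path": "data/x_mobility_isaac_sim_nav2_100k_input_videos/example_episode.mp4",
    "guidance": 3,
    "edge": {
        "control_weight": 0.3        // Loose structural guidance so appearance can change
    },
    "seg": {
        "control_weight": 1.0        // Strong semantic control to preserve object identity and annotations
    }
}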

More example commands for this workflow can be found here.

Generating synthetic data for smart city applications

Figure 8. Synthetic data generation pipeline for smart city

Also included in the Cookbook is an end-to-end workflow that generates photorealistic synthetic data for urban traffic scenarios, accelerating the development of perception and vision-language models (VLMs) for smart city applications. The workflow simulates dynamic city traffic scenes in CARLA, which are then processed through Cosmos Transfer to produce high-quality, visually authentic videos and annotated datasets.

Figure 9. Synthetic video of a busy traffic intersection during daytime

Access the synthetic data generation workflow here.

In synthetic data generation, assessing the quality of generated content is crucial to ensure realistic and reliable results. Read this case study that demonstrates how Cosmos Reason, a reasoning vision language model, can be used to evaluate physical plausibility, assessing whether the interactions and movements in synthetic videos align with the fundamental laws and constraints of real-world physics.

How to use and contribute your own synthetic data generation recipe

To use the Cosmos Cookbook, start by exploring the inference or post-training recipes, which provide step-by-step instructions for tasks like video generation, sim-to-real augmentation, or model training. Each recipe outlines a workflow and points you to the relevant executable scripts in the scripts/ directory.

For deeper background on topics such as control modalities, data curation, or evaluation, see the concepts guides. All recipes include setup requirements and command examples to help you reproduce or adapt results.

As an open source community platform, the Cosmos Cookbook brings together NVIDIA engineers, researchers, and developers to share practical techniques and extend the ecosystem through collaboration. Contributors are welcome to add new recipes, refine workflows, and share insights to advance post-training and deployment best practices for Cosmos models. Follow the steps below to contribute to the main Cookbook repository.

1. Fork and set up

Fork the Cosmos Cookbook repository, then clone and configure:

  • git clone https://github.com/YOUR-USERNAME/cosmos-cookbook.git
  • cd cosmos-cookbook
  • git remote add upstream https://github.com/nvidia-cosmos/cosmos-cookbook.git
  • # Install dependencies
  • just install
  • # Verify setup
  • just serve-internal  # Visit http://localhost:8000

2. Create a branch

git checkout -b recipe/descriptive-name  # or docs/, fix/, etc.

3. Make changes

Add your content following the templates below, then test:

  • just serve-internal  # Preview changes
  • just test            # Run validation

4. Commit and push

  • git add .
  • git commit -m "Add Transfer weather augmentation recipe"
  • git push origin recipe/descriptive-name

5. Create pull request

Create a pull request and submit it for review.

6. Address feedback

Update your branch based on review comments:

  • git add .
  • git commit -m "Address review feedback"
  • git push origin recipe/descriptive-name

The PR updates automatically. Once approved, the team will merge your contribution.

7. Sync your fork

Before starting new work:

  • git checkout main
  • git fetch upstream
  • git merge upstream/main

More details on templates and guidelines can be found here.

Get started

Explore more recipes in the Cosmos Cookbook for your own use cases.

The Cosmos Cookbook is designed to create a dedicated space where the Cosmos team and community can openly share and contribute practical knowledge. We’d love to receive your patches and contributions to help build this valuable resource together. Learn more about how to contribute.

Learn more about NVIDIA Research at NeurIPS.

At the forefront of AI innovation, NVIDIA Research continues to push the boundaries of technology in machine learning, self-driving cars, robotics, graphics, simulation, and more. Explore the cutting-edge breakthroughs now.

Stay up to date by subscribing to NVIDIA news, following NVIDIA AI on LinkedIn, Instagram, X, and Facebook, and joining the NVIDIA Cosmos forum.




