Constructing truly photorealistic 3D environments for simulation is difficult. Even with advanced neural reconstruction methods such as 3D Gaussian Splatting (3DGS) and 3D Gaussian with Unscented Transform (3DGUT), rendered views can still contain artifacts such as blurriness, holes, or spurious geometry, especially from novel viewpoints. These artifacts significantly reduce visual quality and can impede downstream tasks.
NVIDIA Omniverse NuRec brings real-world sensor data into simulation and includes a generative model, called Fixer, to tackle this problem. Fixer is a diffusion-based model built on the NVIDIA Cosmos Predict world foundation model (WFM) that removes rendering artifacts and restores detail in under-constrained regions of a scene.
This post walks you through using Fixer to transform a noisy 3D scene into a crisp, artifact-free environment ready for autonomous vehicle (AV) simulation. It covers using Fixer both offline during scene reconstruction and online during rendering, using a sample scene from the NVIDIA Physical AI open datasets on Hugging Face.
Step 1: Download a reconstructed scene
To start, find a reconstructed 3D scene that exhibits some artifacts. The PhysicalAI-Autonomous-Vehicles-NuRec dataset on Hugging Face provides over 900 reconstructed scenes captured from real-world drives. First, log in to Hugging Face and accept the dataset license. Then download a sample scene, provided as a USDZ file containing the 3D environment. For example, using the Hugging Face CLI:
pip install "huggingface_hub[cli]"  # install the Hugging Face CLI if needed
hf auth login
# (After logging in and accepting the dataset license)
hf download nvidia/PhysicalAI-Autonomous-Vehicles-NuRec \
  --repo-type dataset \
  --include "sample_set/25.07_release/Batch0005/7ae6bec8-ccf1-4397-9180-83164840fbae/camera_front_wide_120fov.mp4" \
  --local-dir ./nurec-sample
This command downloads the scene’s preview video (camera_front_wide_120fov.mp4) to your local machine. Fixer operates on images, not USD or USDZ files directly, so the video frames provide a convenient set of images to work with.
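If you prefer to script the download instead of using the CLI, the huggingface_hub Python library offers the same functionality. The following is a minimal sketch that fetches the same preview video; the repo ID and file path are taken directly from the CLI command above.

# Optional: download the preview video with the huggingface_hub Python API instead of the CLI
from huggingface_hub import hf_hub_download

video_path = hf_hub_download(
    repo_id="nvidia/PhysicalAI-Autonomous-Vehicles-NuRec",
    repo_type="dataset",
    filename="sample_set/25.07_release/Batch0005/7ae6bec8-ccf1-4397-9180-83164840fbae/camera_front_wide_120fov.mp4",
    local_dir="./nurec-sample",
)
print(f"Downloaded to {video_path}")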
Next, extract frames with FFmpeg and use those images as input for Fixer:
# Create an input folder for Fixer
mkdir -p nurec-sample/frames-to-fix

# Extract frames at 30 fps from the downloaded preview video
ffmpeg -i "nurec-sample/sample_set/25.07_release/Batch0005/7ae6bec8-ccf1-4397-9180-83164840fbae/camera_front_wide_120fov.mp4" \
  -vf "fps=30" \
  -qscale:v 2 \
  "nurec-sample/frames-to-fix/frame_%06d.jpeg"
Video 1 is the preview video showcasing the reconstructed scene and its artifacts. In this case, some surfaces have holes or blurred textures due to limited camera coverage. These artifacts are exactly what Fixer is designed to address.
Step 2: Set up the Fixer environment
Next, set up the environment to run Fixer.
Before proceeding, make sure you have Docker installed and GPU access enabled. Then complete the following steps to set up the environment.
Clone the Fixer repository
This provides the scripts needed for the subsequent steps:
git clone https://github.com/nv-tlabs/Fixer.git
cd Fixer
Download the pretrained Fixer checkpoint
The pretrained Fixer model is hosted on Hugging Face. To fetch this, use the Hugging Face CLI:
# Create directory for the model
mkdir -p models/
# Download only the pre-trained model to models/
hf download nvidia/Fixer --local-dir models
This saves the files required for inference in Step 3 to the models/ folder.
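As a quick optional check, you can list what was downloaded and confirm the checkpoint sits where the inference command in Step 3 expects it (models/pretrained/pretrained_fixer.pkl, based on the --model path used below). A minimal sketch:

# Optional: verify the downloaded checkpoint files
from pathlib import Path

for path in sorted(Path("models").rglob("*")):
    if path.is_file():
        print(f"{path} ({path.stat().st_size / 1e6:.1f} MB)")

# The inference command in Step 3 expects this file to exist
assert Path("models/pretrained/pretrained_fixer.pkl").exists(), "pretrained_fixer.pkl not found"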
Step 3: Use online mode for real-time inference with Fixer
Online mode refers to using Fixer as a neural enhancer during rendering, fixing each frame throughout the simulation. Use the pretrained Fixer model for inference, which can run inside the Cosmos-Predict2 Docker container.
Note that Fixer enhances rendered images from your scene. Make sure your frames are exported (for instance, into examples/) and pass that folder to --input.
To run Fixer on all images in a directory, run the following commands:
# Build the container
docker build -t fixer-cosmos-env -f Dockerfile.cosmos .

# Run inference with the container
docker run -it --gpus=all --ipc=host \
  -v $(pwd):/work \
  -v /path/to/nurec-sample/frames-to-fix:/input \
  --entrypoint python \
  fixer-cosmos-env \
  /work/src/inference_pretrained_model.py \
  --model /work/models/pretrained/pretrained_fixer.pkl \
  --input /input \
  --output /work/output \
  --timestep 250
Details about this command include the following:
- The current directory is mounted into the container at /work, allowing the container to access the scripts and model files
- The directory containing the frames extracted from the sample video with FFmpeg is mounted at /input
- The script inference_pretrained_model.py (from the cloned Fixer repo src/ folder) loads the pretrained Fixer model from the given path
- --input is the folder of input images (for instance, examples/ contains some rendered frames with artifacts)
- --output is the folder where the enhanced images will be saved
- --timestep 250 represents the noise level the model uses for the denoising process
After running this command, the output/ directory will contain the fixed images. Note that the first few images may process more slowly while the model initializes, but inference will speed up for subsequent frames once the model is running.
Step 4: Evaluate the output
After applying Fixer to your images, you can evaluate how much it improved your reconstruction quality. This post reports Peak Signal-to-Noise Ratio (PSNR), a standard metric for measuring pixel-level accuracy. Table 1 provides an example before/after comparison of the sample scene.
| Metric | Without Fixer | With Fixer |
| --- | --- | --- |
| PSNR ↑ (accuracy) | 16.5809 | 16.6147 |
Note that if you try other NuRec scenes from the Physical AI open datasets, or your own neural reconstructions, you can measure the quality improvement from Fixer with the same metrics. Refer to the metrics documentation for instructions on how to compute these values.
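As a rough illustration of what the PSNR numbers in Table 1 represent, the sketch below computes PSNR between two aligned 8-bit images. It assumes you have ground-truth captures that correspond pixel-for-pixel to the rendered frames, and it is not a substitute for the official metrics documentation; the file names in the commented example are hypothetical.

# Minimal PSNR sketch for two aligned 8-bit RGB images (higher is better)
import numpy as np
from PIL import Image

def psnr(path_a, path_b):
    a = np.asarray(Image.open(path_a).convert("RGB"), dtype=np.float64)
    b = np.asarray(Image.open(path_b).convert("RGB"), dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(255.0 ** 2 / mse)

# Example (hypothetical file names): compare a ground-truth capture to a rendered frame
# print(psnr("ground_truth/frame_000001.jpeg", "output/frame_000001.jpeg"))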
In qualitative terms, scenes processed with Fixer look significantly more realistic. Surfaces that were previously smeared are now reconstructed with plausible details, fine textures such as road markings become sharper, and the improvements remain consistent across frames without introducing noticeable flicker.
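One simple way to inspect these qualitative differences yourself is to stitch the original and enhanced frames side by side. The following sketch assumes the enhanced images in output/ keep the same file names as the inputs; adjust the paths if your run differs.

# Build side-by-side comparisons of original (left) and Fixer-enhanced (right) frames
from pathlib import Path
from PIL import Image

in_dir = Path("nurec-sample/frames-to-fix")
out_dir = Path("output")
cmp_dir = Path("comparisons")
cmp_dir.mkdir(exist_ok=True)

for noisy_path in sorted(in_dir.glob("frame_*.jpeg")):
    fixed_path = out_dir / noisy_path.name  # assumes Fixer preserves input file names
    if not fixed_path.exists():
        continue
    noisy = Image.open(noisy_path).convert("RGB")
    fixed = Image.open(fixed_path).convert("RGB")
    canvas = Image.new("RGB", (noisy.width + fixed.width, max(noisy.height, fixed.height)))
    canvas.paste(noisy, (0, 0))
    canvas.paste(fixed, (noisy.width, 0))
    canvas.save(cmp_dir / noisy_path.name)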
Moreover, Fixer is effective at correcting artifacts when novel view synthesis is introduced. Video 3 shows the application of Fixer to a NuRec scene rendered from a novel viewpoint obtained by shifting the camera 3 meters to the left. When run on top of the novel view synthesis output, Fixer reduces view-dependent artifacts and improves the perceptual quality of the reconstructed scene.
Summary
This post walked you through downloading a reconstructed scene, setting up Fixer, and running inference to clean up rendered frames. The end result is a sharper scene with fewer reconstruction artifacts, enabling more reliable AV development.
To use Fixer with robotics NuRec scenes, download a reconstructed scene from the PhysicalAI-Robotics-NuRec dataset on Hugging Face and follow the steps presented in this post.
To learn more, check out how to:
