Autonomous vehicle (AV) research is undergoing a rapid shift. The field is being reshaped by the emergence of reasoning-based vision–language–action (VLA) models that bring human-like reasoning to AV decision-making. These models can be viewed as implicit world models operating in a semantic space, allowing AVs to solve complex problems step-by-step and to generate reasoning traces that mirror human thought processes. This shift extends beyond the models themselves: traditional open-loop evaluation is no longer sufficient to rigorously assess such models, and new evaluation tools are required.
Recently, NVIDIA introduced Alpamayo, a family of models, simulation tools, and datasets to enable development of reasoning-based AV architectures. Our goal is to provide researchers and developers with a versatile, fast, and scalable platform for evaluating, and ultimately training, modern reasoning-based AV architectures in realistic closed-loop settings.
In this blog, we introduce Alpamayo and how to get up and running with reasoning-based AV development:
- Part 1: Introducing NVIDIA Alpamayo 1, an open, 10B reasoning VLA model, as well as how to use the model to both generate trajectory predictions and review the corresponding reasoning traces.
- Part 2: Introducing the Physical AI dataset, one of the largest and most geographically diverse open AV datasets available, which enables training and evaluating these models.
- Part 3: Introducing NVIDIA AlpaSim, an open-source simulation tool designed for evaluating end-to-end models in closed loop.
These three key components provide the essential pieces needed to start building reasoning-based VLA models: a base model, large-scale data for training, and a simulator for testing and evaluation.
Part 1: Alpamayo 1, an open reasoning VLA for AVs
Get started with the Alpamayo reasoning VLA model in just three steps.
Step 1: Access Alpamayo model weights and code
The Hugging Face repository contains pretrained model weights, which can be loaded with the corresponding code on GitHub.
Step 2: Prepare your environment
The Alpamayo GitHub repository contains steps to set up your development environment, including installing uv (if not already installed) and creating a Python virtual environment.
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
# Setup the virtual environment
uv venv ar1_venv
source ar1_venv/bin/activate
# Install pip in the virtual environment (if missing)
./ar1_venv/bin/python -m ensurepip
# Install Jupyter notebook package
./ar1_venv/bin/python -m pip install notebook
uv sync --active
Finally, the model requires access to gated Hugging Face resources. Request access here, then authenticate with your Hugging Face token (you can get your token here).
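One way to authenticate is from Python via the huggingface_hub package (the Hugging Face CLI login command works just as well); the snippet below is a minimal sketch:
# Authenticate to Hugging Face so the gated Alpamayo resources can be downloaded.
from huggingface_hub import login

# Use the token created in your Hugging Face account settings; alternatively,
# set the HF_TOKEN environment variable and skip this call entirely.
login(token="hf_...")  # replace with your own token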
Step 3: Run the Alpamayo reasoning VLA
The model repository includes a notebook that will download the Alpamayo model weights, load some example data from the NVIDIA PhysicalAI-AV Dataset, run the model on it, and visualize the output trajectories and their associated reasoning traces.
Specifically, the example data contains the ego-vehicle passing a construction zone, with 4 timesteps (columns) from 4 cameras (rows: front_left, front_wide, front_right, front_tele) visualized below.


After running this through the Alpamayo model, an example output you might see in the notebook is “Nudge to the left to increase clearance from the construction cones encroaching into the lane,” with the corresponding predicted trajectory and ground truth trajectory visualized below.


If you would like to generate more trajectories and reasoning traces, feel free to change the num_traj_samples=1 argument in the inference call to a higher number.
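For reference, such a call might look roughly like the sketch below. This is hypothetical: apart from num_traj_samples, every class, method, and variable name is an illustrative placeholder, so follow the notebook in the repository for the exact API.
# Hypothetical sketch: names other than num_traj_samples are placeholders.
prediction = model.infer(
    example_batch,            # multi-camera clip loaded by the notebook
    num_traj_samples=8,       # raise from the default of 1 to sample more trajectories
)
for trajectory, reasoning in zip(prediction.trajectories, prediction.reasoning_traces):
    print(reasoning)          # one reasoning trace per sampled trajectory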
Part 2: Physical AI AV dataset for large-scale, diverse AV data
The PhysicalAI-Autonomous-Vehicles dataset provides one of the largest, most geographically diverse collections of multi-sensor data for AV researchers to build the next generation of physical AI-based end-to-end driving systems.


It contains a total of 1,727 hours of driving recorded in 25 countries and over 2,500 cities (coverage shown below, with color indicating the number of clips per country). The dataset captures diverse traffic, weather conditions, obstacles, and pedestrians in the environment. Overall, it consists of 310,895 clips that are each 20 seconds long. The sensor data includes multi-camera and LiDAR coverage for all clips, and radar coverage for 163,850 clips.
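As a quick sanity check, the clip count and clip length quoted above line up with the reported total driving time:
# Verify that 310,895 clips of 20 seconds each add up to the stated 1,727 hours.
clips = 310_895
clip_length_s = 20
print(f"{clips * clip_length_s / 3600:,.0f} hours")  # prints "1,727 hours"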


To get started with the Physical AI AV Dataset, the physical_ai_av GitHub repository contains a Python developer kit and documentation (in the form of a wiki). In fact, this package was already used in Part 1 to load a sample of the dataset for Alpamayo 1.
Part 3: AlpaSim, a closed-loop simulation for AV evaluation
AlpaSim overview


AlpaSim is built on a microservice architecture centered around the Runtime (see Figure 6), which orchestrates all simulation activity. Individual services, such as the Driver, Renderer, TrafficSim, Controller, and Physics, run in separate processes and can be assigned to different GPUs. This design offers two major benefits:
- Clear, modular APIs via gRPC, making it easy to integrate new services without dependency conflicts.
- Arbitrary horizontal scaling, allowing researchers to allocate compute where it matters most. For example, if driver inference becomes the bottleneck, simply launch additional driver processes. If rendering is the bottleneck, dedicate more GPUs to rendering. And if a rendering process cannot handle multiple scenes concurrently, you can run multiple renderer instances on the same GPU to maximize utilization.
But horizontal scaling alone isn't the whole story. The real power of AlpaSim lies in how the Runtime enables pipeline parallelism (see Figure 7).
In traditional sequential rollouts, components must wait on each other; for instance, the driver must pause after each inference step until the renderer produces the next perception input. AlpaSim removes this bottleneck: while one scene is rendering, the driver can run inference for another scene. This overlap dramatically improves GPU utilization and throughput. Scaling even further, driver inference can be batched across many scenes, while multiple rendering processes generate perception inputs in parallel.
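The scheduling idea can be illustrated with a small, self-contained asyncio sketch (conceptual only, not AlpaSim code): while one scene waits on rendering, another scene makes progress, so the total wall-clock time for two scenes approaches that of a single scene.
# Conceptual sketch of pipeline parallelism across scenes (not AlpaSim code).
import asyncio
import time

async def render(scene: int, step: int) -> str:
    await asyncio.sleep(0.1)   # stand-in for the renderer producing a frame
    return f"scene{scene}-frame{step}"

async def drive(observation: str) -> None:
    await asyncio.sleep(0.1)   # stand-in for driver inference

async def rollout(scene: int, steps: int = 5) -> None:
    for step in range(steps):
        observation = await render(scene, step)
        await drive(observation)

async def main() -> None:
    start = time.perf_counter()
    await asyncio.gather(rollout(0), rollout(1))   # both scenes advance concurrently
    print(f"two scenes in {time.perf_counter() - start:.1f}s, "
          "vs roughly twice that when run strictly one after the other")

asyncio.run(main())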


A shared ecosystem
We provide initial implementations for all core services, including rendering via the NVIDIA Omniverse NuRec 3DGUT algorithm, a reference controller, and driver baselines. We will also be adding additional driver models, including Alpamayo 1 and CAT-K, in the coming weeks.
The platform also ships initially with roughly 900 reconstructed scenes, each 20 seconds long, and the Physical AI AV Dataset, giving researchers an immediate way to evaluate end-to-end models in realistic closed-loop scenarios. In addition, AlpaSim offers extensive configurability, from camera parameters and rendering frequency to artificial latencies and many other simulation settings.
Beyond these built-in components, we see AlpaSim evolving into a broader collaborative ecosystem. Eventually, labs can seamlessly plug in their own driving, rendering, or traffic models and compare approaches directly on shared benchmarks.
AlpaSim in action
AlpaSim is already powering several of our internal research efforts.
First, in our recently proposed Sim2Val framework, we demonstrated that AlpaSim rollouts are realistic enough to meaningfully improve real-world validation. By incorporating simulated trajectories into our evaluation pipeline, we were able to reduce variance in key real-world metrics by up to 83%, enabling faster and more confident model assessments.
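The Sim2Val paper describes the actual estimator; purely as a generic illustration of how a correlated simulated metric can shrink the variance of a real-world estimate, here is a self-contained control-variates sketch on synthetic numbers (not the Sim2Val method itself):
# Generic control-variates illustration with synthetic data (not Sim2Val itself):
# a simulated metric correlated with the real one reduces estimator variance.
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(1.0, 0.5, size=200)               # real-world metric samples
sim = 0.8 * real + rng.normal(0.0, 0.2, size=200)   # correlated simulated metric
sim_mean = 0.8                                      # assumed known from many cheap sim rollouts

beta = np.cov(real, sim)[0, 1] / np.var(sim)        # control-variate coefficient
adjusted = real - beta * (sim - sim_mean)           # variance-reduced estimate of the real metric

print(f"plain estimate:    mean {real.mean():.3f}, variance {real.var():.3f}")
print(f"adjusted estimate: mean {adjusted.mean():.3f}, variance {adjusted.var():.3f}")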
Second, we rely on AlpaSim for closed-loop evaluation of our Alpamayo 1 model. By replaying reconstructed scenes and allowing the policy to drive end-to-end, we compute a DrivingScore that reflects performance under realistic traffic conditions.
Beyond evaluation, we’re leveraging AlpaSim for closed-loop training using our concurrently released RoaD algorithm. RoaD effectively mitigates covariate shift between open-loop training and closed-loop deployment while being significantly more data-efficient than traditional reinforcement learning.


Getting started with AlpaSim
Start using AlpaSim for your own model evaluation in just three steps.
Step 1: Access AlpaSim
The open-source repository contains the necessary software, with scene reconstruction artifacts available from the NVIDIA Physical AI Open Dataset.
Step 2: Prepare your environment
First, make sure to follow the onboarding steps in ONBOARDING.md.
Then, perform initial setup and installation with the following command:
source setup_local_env.sh
This will compile protos, download an example driver model, download a sample scene from Hugging Face, and install the alpasim_wizard command-line tool.
Step 3: Run the simulation
Use the wizard to build, run, and evaluate a simulation rollout:
alpasim_wizard +deploy=local wizard.log_dir=$PWD/tutorial
The simulation logs and output can be found in the created tutorial directory. For a visualization of the results, an mp4 file is created at tutorial/eval/videos/clipgt-05bb8212..._0.mp4, which will look similar to the following.
For more details about the output, and much more information about using AlpaSim, see TUTORIAL.md.
Overall, this example demonstrates how real-world drives can be replayed with an end-to-end policy, including all static and dynamic objects from the original scene. From this starting point, and with the flexible plug-and-play architecture of AlpaSim, users can tweak contender behavior, modify camera parameters, and iterate on the policy.
Integrating your policy
Driving policies are easily swappable through generic APIs, allowing developers to test their state-of-the-art implementations.
Step 1: gRPC integration
AlpaSim uses gRPC as the interface between components: a sample implementation of the driver component can be used as inspiration for conforming to the driver interface.
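As a purely hypothetical sketch of what a driver service can look like, the skeleton below follows standard gRPC Python conventions; the real service, message, and method names are defined by the AlpaSim protos and the sample driver, so treat every name here as illustrative.
# Hypothetical driver service skeleton; driver_pb2/driver_pb2_grpc stand in for
# stubs generated with grpcio-tools from the AlpaSim driver proto, and all
# service, message, and method names are illustrative placeholders.
from concurrent import futures

import grpc
import driver_pb2        # assumed generated stub (placeholder name)
import driver_pb2_grpc   # assumed generated stub (placeholder name)

class MyDriver(driver_pb2_grpc.DriverServicer):
    def Infer(self, request, context):
        # request carries the rendered perception input for the current step;
        # run your own policy here and pack its output into the response.
        return driver_pb2.InferResponse()  # fill in your predicted trajectory

def serve(port: int = 50051) -> None:
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    driver_pb2_grpc.add_DriverServicer_to_server(MyDriver(), server)
    server.add_insecure_port(f"[::]:{port}")
    server.start()
    server.wait_for_termination()

if __name__ == "__main__":
    serve()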
Step 2: Reconfigure and run
AlpaSim is highly customizable through YAML file descriptions, including the specification of components used by the sim at runtime. Create a new configuration file for your model (an example skeleton is shown below):
# driver_configs/my_model.yaml
# @package _global_
services:
  driver:
    image:
    command:
      - ""
And run:
alpasim_wizard +deploy=local wizard.log_dir=$PWD/my_model +driver_configs=my_model.yaml
Examples of customization using the CLI:
You can change the configuration when running the wizard, for example:
# Different scene
alpasim_wizard +deploy=local wizard.log_dir=$PWD/custom_run \
  scenes.scene_ids=['clipgt-02eadd92-02f1-46d8-86fe-a9e338fed0b6']
# More rollouts
alpasim_wizard +deploy=local wizard.log_dir=$PWD/custom_run \
  runtime.default_scenario_parameters.n_rollouts=8
# Different simulation length
alpasim_wizard +deploy=local wizard.log_dir=$PWD/custom_run \
  runtime.default_scenario_parameters.n_sim_steps=200
Configuration is managed via Hydra – see src/wizard/configs/base_config.yaml for all available options.
To download the scene referenced above in Figure 9, you can run the following command:
hf download --repo-type=dataset \
  --local-dir=data/nre-artifacts/all-usdzs \
  nvidia/PhysicalAI-Autonomous-Vehicles-NuRec \
  sample_set/25.07_release/Batch0001/02eadd92-02f1-46d8-86fe-a9e338fed0b6/02eadd92-02f1-46d8-86fe-a9e338fed0b6.usdz
Scaling your runs
AlpaSim adapts to fit your hardware configuration through coordination and parallelization of services, efficiently facilitating large test suites, perturbation studies, and training.
alpasim_wizard +deploy=local wizard.log_dir=$PWD/test_suite +experiment=my_test_suite.yaml runtime.default_scenario_parameters.n_rollouts=16
Conclusion: Putting it all together
The future of autonomous driving relies on powerful end-to-end models, and AlpaSim provides the capability to quickly test and iterate on those models, accelerating research efforts. In this blog we introduced the Alpamayo 1 model, the Physical AI AV dataset, and the AlpaSim simulator. Together, they provide a complete framework for developing reasoning-based AV systems: a model, large amounts of data to train it, and a simulator for evaluation.
Putting it all together, below is an example of Alpamayo 1 driving closed-loop through a construction zone inside AlpaSim, demonstrating the model's reasoning and driving capabilities as well as AlpaSim's ability to evaluate AV models in a variety of realistic driving environments.
Happy coding!
