Generalist robot policies must operate across diverse tasks, embodiments, and environments, which requires scalable, repeatable simulation-based evaluation. Setting up large-scale policy evaluations, however, is tedious and manual: without a systematic approach, developers must build high-overhead custom infrastructure, and task libraries remain limited in complexity and variety.
This post introduces NVIDIA Isaac Lab-Arena, an open source framework for efficient and scalable robotic policy evaluation in simulation. Co-developed with Lightwheel as an extension to NVIDIA Isaac Lab, it provides streamlined APIs for task curation, diversification, and large-scale parallel evaluation. Developers can now prototype complex benchmarks without the overhead of building custom systems. The post also presents an end-to-end sample workflow covering environment setup, optional policy post-training, and closed-loop evaluation.
Overview and key advantages of Isaac Lab-Arena
We’re announcing the pre-alpha release of Isaac Lab-Arena and inviting the community to help shape its roadmap. We’re also partnering with benchmark authors to implement and open source their evaluations on Isaac Lab-Arena, enabling a growing ecosystem of ready-to-use benchmarks and shared evaluation methods on a unified core.
The key advantages of Isaac Lab-Arena include simplified task curation, automated diversification, large-scale benchmarking, seamless integration with data generation and training, and more, as detailed below.
- Simplified task curation (0 to 1):
- Modular: Replaces monolithic task descriptions with a Lego-like architecture, compiling Isaac Lab environments on the fly from independent Object, Scene, Embodiment, and Task blocks.
- Generalizable: Standardized interactions through an Affordance system (for instance, Openable, Pressable) enable tasks to scale across diverse objects; see the sketch after this list.
- Extensible: Metrics and recorded data are extensible, giving users fine-grained control over simulation and analytics when needed.
- Automated diversification (1 to many): Easily mix and match components, applying one task across different robots or objects (for example, switching from a domestic soda can task to an industrial pipe task) without rewriting code. In the long run, the team aims to leverage foundation models to automate the generation of diverse and realistic tasks.
- Large-scale parallel, policy-agnostic benchmarking: Evaluate any robotic policy across hundreds of parallel environments for high-throughput, GPU-accelerated evaluations. The current version supports homogeneous parallel environments (with parameter variations).
- Access to community benchmarks and shared evaluation methods on a unified core.
- Open source with a commercial-friendly license: Developers can freely use, distribute, and contribute to framework development.
- Seamless integration with data generation and training: While the core function of Isaac Lab-Arena is task setup and evaluation, it integrates tightly with data generation and training frameworks for a seamless closed-loop workflow. This includes Isaac Lab-Teleop, Isaac Lab-Mimic, and post-training and inference of NVIDIA Isaac GR00T N models.
- Flexible deployment: Deploy on local workstations or cloud-native environments (such as OSMO) for CI/CD, or integrate into leaderboards and distribution platforms such as the LeRobot Environment Hub.
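To make the affordance idea concrete, here is a minimal, self-contained Python sketch. Every name in it is an illustrative stand-in, not the actual Isaac Lab-Arena API: a task written against an Openable affordance runs unchanged on any object that implements it.

from typing import Protocol

class Openable(Protocol):
    """Illustrative affordance: anything exposing an openness value in [0, 1]."""
    def get_openness(self) -> float: ...

class Microwave:
    """Stand-in object that implements the Openable affordance."""
    def __init__(self) -> None:
        self.openness = 0.0

    def get_openness(self) -> float:
        return self.openness

class OpenTask:
    """One task definition that scales across any Openable object."""
    def __init__(self, target: Openable, openness_threshold: float = 0.8) -> None:
        self.target = target
        self.openness_threshold = openness_threshold

    def is_success(self) -> bool:
        return self.target.get_openness() >= self.openness_threshold

# The same OpenTask could wrap a drawer, valve, or toolbox without changes.
task = OpenTask(Microwave())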


Ecosystem development
NVIDIA is partnering with benchmark authors to build their evaluations on Isaac Lab-Arena and publish sim-to-real validated evaluation methods, tasks, and datasets that the community can reuse and extend on a unified core. Coverage will span both industrial and research benchmarks across mobility, manipulation, and loco-manipulation.
Lightwheel co-developed and has adopted the Isaac Lab-Arena framework to create and open-source 250+ tasks through the Lightwheel-RoboCasa-Tasks and Lightwheel-LIBERO-Tasks suites, with future efforts to establish them as benchmarks. Lightwheel is also developing RoboFinals, an industrial benchmark representative of complex real-world environments, using Isaac Lab-Arena.


Isaac Lab-Arena environments are now integrated with the Hugging Face LeRobot Environment Hub, where developers can seamlessly register custom environments built on Isaac Lab-Arena and use the growing library of environments to post-train and evaluate robotic policies, including Isaac GR00T N, pi0, and SmolVLA. For more details, visit the LeRobot documentation.
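As a rough illustration of what Hub integration enables, the sketch below instantiates a registered environment through the Gymnasium-style API that LeRobot environments follow. The environment ID here is hypothetical; the actual registration and naming flow is described in the LeRobot documentation.

import gymnasium as gym

# Hypothetical ID for an Isaac Lab-Arena environment registered on the Hub.
env = gym.make("IsaacLabArena/gr1_open_microwave-v0")
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()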
NVIDIA is enabling hundreds of thousands of developers with open robotics models and datasets on Hugging Face, contributing to robotics becoming the fastest growing category on the platform.
RoboTwin is using Isaac Lab-Arena to build extended versions of RoboTwin 2.0, a large-scale embodied simulation benchmark, and other complex long-horizon benchmarks. An open source release is planned, with active development underway on research submissions and code updates.
In addition, NVIDIA Research labs such as the Generalist Embodied Agent Research (GEAR) Lab are leveraging Isaac Lab-Arena to benchmark the Isaac GR00T N family of vision language action models for generalized humanoid reasoning and skills at scale.
NVIDIA Seattle Robotics Lab (SRL) is integrating its research on language-conditioned task suites and evaluation methods for benchmarking generalist robot policies into Isaac Lab-Arena.
Future Isaac Lab-Arena enhancements
The current pre-alpha release is intentionally an early framework skeleton with limited features, giving contributors a practical starting point to experiment, share feedback, and influence future design and direction.
In the near future, core capabilities essential to building complex task libraries will be added, including object placement through natural language, composite tasks built by chaining atomic skills, reinforcement learning task setup, and parallel heterogeneous evaluations (for example, different objects per parallel environment).
Further out, the team aims to explore more agentic and neural approaches to scaling evaluation. Examples include leveraging NVIDIA Cosmos for world-model-driven neural simulation and scenario generation, as well as NVIDIA Omniverse NuRec for real-to-sim construction of simulation environments that mirror the real world. Community participation and feedback will be vital in shaping these developments.
How to set up tasks and evaluate policies at scale using Isaac Lab-Arena
This section presents an end-to-end sample workflow to evaluate an Isaac GR00T N model on a manipulation skill (opening a microwave door) with the GR1 robot in Isaac Lab-Arena. It covers environment setup, optional policy post-training, and closed-loop evaluation.


Step 1: Environment creation and diversification
Follow the GR1 open microwave door task prerequisites to clone the repo and run the Docker container. Then, create an environment in Isaac Lab-Arena by stitching together Objects (Microwave) with Affordances (Openable, Pressable), within a Scene (Kitchen), with an Embodiment (GR-1 Robot), to perform a Task (OpenDoor). Users can optionally include a configuration for teleoperation-based data collection.
Procure assets:
# Look up each asset and the embodiment by name; registry lookups return a
# class that is instantiated with ().
background = self.asset_registry.get_asset_by_name("kitchen")()
microwave = self.asset_registry.get_asset_by_name("microwave")()
assets = [background, microwave]
embodiment = self.asset_registry.get_asset_by_name("gr1_pink")(enable_cameras=args_cli.enable_cameras)
# Optional teleoperation device for demonstration collection.
teleop_device = self.device_registry.get_device_by_name("avp")()
For more details, see Assets Design and Affordances Design.
Position objects:
# Place the microwave in the scene. The rotation is a quaternion in
# (w, x, y, z) order, here a -90 degree rotation about the z-axis.
microwave_pose = Pose(
    position_xyz=(0.4, -0.00586, 0.22773),
    rotation_wxyz=(0.7071068, 0, 0, -0.7071068),
)
microwave.set_initial_pose(microwave_pose)
Compose the scene:
scene = Scene(assets=assets)
Create the task:
task = OpenDoorTask(microwave, openness_threshold=0.8, reset_openness=0.2)
Tasks encapsulate objectives, success criteria, termination logic, events, and metrics. To learn more, see Task Design.
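As a rough, self-contained illustration of those responsibilities (the names are hypothetical, not the Isaac Lab-Arena Task API), a task bundles a success check, termination logic, and metrics behind one interface:

from typing import Callable

class LiftObjectTask:
    """Hypothetical task bundling objective, termination, and metrics."""
    def __init__(self, get_height: Callable[[], float],
                 lift_height: float = 0.10, max_steps: int = 500) -> None:
        self.get_height = get_height    # hook reading the object height from sim
        self.lift_height = lift_height  # success criterion, in meters
        self.max_steps = max_steps      # termination: episode step budget
        self.steps = 0

    def step(self) -> None:
        self.steps += 1

    def is_success(self) -> bool:
        return self.get_height() >= self.lift_height

    def is_terminated(self) -> bool:
        return self.is_success() or self.steps >= self.max_steps

    def metrics(self) -> dict:
        return {"success": self.is_success(), "steps": self.steps}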
Finally, assemble all the pieces into a complete, runnable environment:
isaaclab_arena_environment = IsaacLabArenaEnvironment(
    name=self.name,
    embodiment=embodiment,
    scene=scene,
    task=task,
    teleop_device=teleop_device,
)
Next, run the environment using a test dataset.
Download a test dataset:
hf download \
  nvidia/Arena-GR1-Manipulation-Task \
  arena_gr1_manipulation_dataset_generated.hdf5 \
  --repo-type dataset \
  --local-dir $DATASET_DIR
Run the environment:
python isaaclab_arena/scripts/replay_demos.py \
  --device cpu \
  --enable_cameras \
  --dataset_file "${DATASET_DIR}/arena_gr1_manipulation_dataset_generated.hdf5" \
  gr1_open_microwave \
  --embodiment gr1_pink
The robot will replay NVIDIA-collected teleoperation data to open the microwave.
For comprehensive technical details and design principles for creating new environments, consult the tutorial documentation.
Scale a task efficiently across robots, objects, and scenes
This section provides several examples that show how to easily swap objects or robots in a task without rebuilding the environment or pipeline.
Example 1 – Change the object from microwave to power_drill:
background = asset_registry.get_asset_by_name("kitchen")()
embodiment = asset_registry.get_asset_by_name("gr1_pink")()
power_drill = asset_registry.get_asset_by_name("power_drill")()
assets = [background, power_drill]


Example 2 – Change the embodiment from GR1 to a Franka arm and the object to cracker_box:
background = asset_registry.get_asset_by_name("kitchen")()
embodiment = asset_registry.get_asset_by_name("franka")()
cracker_box = asset_registry.get_asset_by_name("cracker_box")()
assets = [background, cracker_box]


Example 3 – Change the background from a kitchen to an industrial packing table:
background = asset_registry.get_asset_by_name("packing_table")()
embodiment = asset_registry.get_asset_by_name("gr1_pink")()
cracker_box = asset_registry.get_asset_by_name("cracker_box")()
assets = [background, cracker_box]
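Because only the registry names change between these examples, sweeping variations is a small loop. The helper below is hypothetical (not part of the framework); it reuses only the asset_registry and get_asset_by_name pattern from the examples above.

import itertools

def make_assets(asset_registry, background_name, object_name, embodiment_name):
    """Hypothetical helper: build the asset list for one task variation."""
    background = asset_registry.get_asset_by_name(background_name)()
    embodiment = asset_registry.get_asset_by_name(embodiment_name)()
    obj = asset_registry.get_asset_by_name(object_name)()
    return [background, obj], embodiment

# Sweep every combination of scene, object, and robot; each result would be
# fed into a Scene and IsaacLabArenaEnvironment as shown in Step 1.
for bg, obj, robot in itertools.product(
        ["kitchen", "packing_table"],
        ["power_drill", "cracker_box"],
        ["gr1_pink", "franka"]):
    assets, embodiment = make_assets(asset_registry, bg, obj, robot)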


Step 2: Optional policy post-training
While Isaac Lab-Arena at its core focuses on task setup and policy evaluation, the Isaac Lab-Arena environment can seamlessly interoperate with data collection, data generation, and post-training if your policy needs to be post-trained prior to evaluation. For example, you can collect demonstrations with Isaac Lab-Teleop, generate additional trajectories with Isaac Lab-Mimic, and post-train an Isaac GR00T N model on the resulting dataset.
Step 3: Execute evaluations on parallel environments
The next step is to evaluate the trained policy. Note that you can evaluate any trained robotic policy with the framework.
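Policy-agnostic here means the runner only needs a mapping from observations to actions. A minimal stand-in might look like the sketch below; the interface is illustrative, not the framework's actual policy API.

import numpy as np

class RandomPolicy:
    """Hypothetical baseline policy: samples uniformly random actions."""
    def __init__(self, action_dim: int, seed: int = 0) -> None:
        self.action_dim = action_dim
        self.rng = np.random.default_rng(seed)

    def act(self, observation) -> np.ndarray:
        # A real policy (GR00T N, pi0, etc.) would condition on the observation.
        return self.rng.uniform(-1.0, 1.0, size=self.action_dim)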
Option 1 – Test the policy in a single environment:
python isaaclab_arena/examples/policy_runner.py \
  --policy_type gr00t_closedloop \
  --policy_config_yaml_path isaaclab_arena_gr00t/gr1_manip_gr00t_closedloop_config.yaml \
  --num_steps 2000 \
  --enable_cameras \
  gr1_open_microwave \
  --embodiment gr1_joint
Option 2 – Test the policy in multiple parallel homogeneous environments:
python isaaclab_arena/examples/policy_runner.py \
  --policy_type gr00t_closedloop \
  --policy_config_yaml_path isaaclab_arena_gr00t/gr1_manip_gr00t_closedloop_config.yaml \
  --num_steps 2000 \
  --num_envs 10 \
  --enable_cameras \
  gr1_open_microwave \
  --embodiment gr1_joint
Get started with NVIDIA Isaac Lab-Arena
Isaac Lab-Arena pre-alpha is open source, and we invite you to help guide its future design and development. To get started with Isaac Lab-Arena pre-alpha, visit the GitHub repo and documentation.
- Share feedback by opening GitHub issues to report bugs or suggest feature and design improvements, and contribute by opening pull requests to propose changes.
- Create tasks or sim-to-real validated benchmarks on Isaac Lab-Arena and open source them to help build a shared ecosystem of ready-to-use robot learning tasks.
- Publish tasks to a leaderboard or evaluation hub like the LeRobot Environment Hub to make them discoverable and easy to run across shared pipelines and registries.
Stay up to date by subscribing to our newsletter and following NVIDIA Robotics on LinkedIn, Instagram, X, and Facebook. Explore NVIDIA documentation and YouTube channels, and join the NVIDIA Developer Robotics forum. To begin your robotics journey, enroll in our free NVIDIA Robotics Fundamentals courses today.
Get started with NVIDIA Isaac libraries and AI models for developing physical AI systems.
Watch NVIDIA Live at CES to learn more.
