R²D²: Improving Robot Manipulation with Simulation and Language Models

Robot manipulation systems struggle with changing objects, lighting, and contact dynamics when they move into dynamic real-world environments. On top of this, gaps between simulation and reality, along with non-optimized grippers and tools, often limit how reliably robots can generalize, execute long-horizon tasks, and achieve human-level dexterity across diverse tasks.

This edition of NVIDIA Robotics Research and Development Digest (R²D²) explores novel approaches to improving robot manipulation skills. In this blog, we'll discuss three research efforts that use reasoning LLMs, sim-and-real co-training, and VLMs for designing manipulation tools: ThinkAct, sim-and-real policy co-training, and RobotSmith.

We'll also cover how robot manipulation can be improved using data augmentation and other recipes from the Cosmos Cookbook. This cookbook is an open-source resource that features examples of real-world applications of NVIDIA Cosmos for robotics and autonomous driving.

Improving robot reasoning and motion execution with ThinkAct

In robotics, vision-language-action (VLA) models generate robot actions from multimodal instructions, like vision and natural language. A strong VLA should be able to understand and output complex, multi-step actions in dynamic environments. Current approaches to robot manipulation train end-to-end VLAs without an explicit reasoning step. This makes it difficult for VLAs to plan long-horizon tasks and to adapt to varied tasks and environments.

ThinkAct reduces this gap by integrating high-level reasoning with low-level action execution in a dual-system framework. This "thinking before acting" framework is implemented via reinforced visual latent planning.

First, a multimodal large language model (MLLM) is trained to generate reasoning plans for a robot to follow. These plans are created using reinforcement learning, where visual rewards encourage the MLLM to make plans that result in goal completion by following physically realistic trajectories. To do this, ThinkAct uses human and robot videos to ground its reasoning in visual observations. Training in this manner ensures that the robot's planning is not only theoretically correct but also physically feasible based on visual feedback. This is the "Think" part.
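
As a rough illustration, the snippet below sketches how an action-aligned visual reward might combine goal completion with trajectory plausibility. This is a minimal sketch under assumed tensor shapes; `visual_plan_reward` and its weights are hypothetical names, not ThinkAct's actual reward code.

```python
import torch

def visual_plan_reward(pred_traj, demo_traj, goal_reached,
                       w_goal=1.0, w_traj=0.5):
    """Hypothetical action-aligned visual reward (illustrative only).

    pred_traj:    (T, 2) end-effector keypoints projected from the plan
    demo_traj:    (T, 2) keypoints extracted from human/robot videos
    goal_reached: 1.0 if the rollout completes the goal, else 0.0
    """
    # Trajectory term: planned motion should stay close to physically
    # realistic trajectories observed in video demonstrations.
    traj_reward = -torch.norm(pred_traj - demo_traj, dim=-1).mean()
    # Goal term: did the plan actually finish the task?
    return w_goal * goal_reached + w_traj * traj_reward
```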

Now onto the "Act" part. Intermediate steps in a reasoning plan are compressed into a compact latent trajectory. This representation captures the essential intent and context of the plan. The latent trajectory then guides a separate motion model, enabling the robot to execute actions in diverse environments. In this way, high-level reasoning informs and improves low-level robot actions in real-world scenarios.
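
The dual-system split can be pictured with a toy module like the one below, where pooled reasoning tokens become a compact latent that conditions a separate low-level policy. All dimensions and module names here are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LatentPlanActor(nn.Module):
    """Minimal sketch of latent-plan-conditioned acting (hypothetical)."""

    def __init__(self, plan_dim=4096, latent_dim=64, obs_dim=512, act_dim=7):
        super().__init__()
        # Compress reasoning-plan tokens into a compact latent trajectory.
        self.compress = nn.Linear(plan_dim, latent_dim)
        # Separate low-level motion model guided by the latent plan.
        self.policy = nn.Sequential(
            nn.Linear(latent_dim + obs_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, plan_tokens, obs_feat):
        # plan_tokens: (B, T, plan_dim) from the reasoning MLLM;
        # pooling keeps the essential intent and context of the plan.
        z = self.compress(plan_tokens.mean(dim=1))
        # obs_feat: (B, obs_dim) features of the current visual observation.
        return self.policy(torch.cat([z, obs_feat], dim=-1))
```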

Figure 1. Overview of ThinkAct's "thinking before acting" framework: action-aligned visual feedback and an LLM-based thinking module (few-shot adaptation, long-horizon planning, self-correction) guide robot acting.

ThinkAct has been tested on robot manipulation and embodied reasoning benchmarks. It successfully performs few-shot deployment, long-horizon manipulation, and self-correction in embodied AI tasks.

Figure 2. Visualization of a long-horizon manipulation task (Simpler-Google): the robot reasons about moving a soda can near an apple, flowing from input to reasoning, visual trajectory, and action execution.

Co-training policies with sim-and-real data

Training robots to perform manipulation tasks requires collecting data across diverse tasks, environments, and object configurations. A common way of doing that is behavior cloning, where expert demonstrations are captured in the real world. This sounds good in theory, but it's expensive and doesn't scale in practice. Real-world data collection requires human operators to manually generate demonstrations or monitor robots, which is slow and limited by the availability of robot hardware.

One answer is to gather demonstrations in simulation, which can be automated and parallelized to make data collection fast and simple. However, policies trained on simulation data don't always transfer well to the real world. This is the sim-to-real gap, which arises because simulations cannot perfectly replicate the complexities of real-world physics, dynamics, noise, and feedback.

The sim-and-real policy co-training work bridges this gap by using both simulation data and a few real-world demonstrations to learn generalizable manipulation policies. It is a unified sim-and-real co-training framework that learns a shared latent space where observations from simulation and the real world are aligned. It builds on the work presented in sim-and-real co-training and uses a better representation space for alignment, one that also captures action-related information. The main idea is to align observations and their corresponding actions, so that the policy learns behaviors that work in both simulated and real settings.

These representations are learned via a technique called optimal transport (OT). OT helps policies detect similar patterns in simulation and real-world data so that the information needed for selecting actions stays the same, regardless of whether the input is simulated or real. There is usually far more simulated data than real data, so this imbalance is handled by extending to an unbalanced OT (UOT) framework. UOT uses a sampling method that keeps training effective even when the datasets are different sizes.
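
To make the alignment idea concrete, here is a minimal sketch of an entropic unbalanced Sinkhorn loss between batches of sim and real policy features. The function name and hyperparameters are our illustrative assumptions, not the paper's implementation, and it assumes reasonably scaled (e.g., normalized) embeddings.

```python
import torch

def uot_alignment_loss(sim_feats, real_feats, eps=0.05, tau=1.0, n_iters=50):
    """Illustrative unbalanced OT alignment between sim and real batches."""
    # Pairwise squared Euclidean cost between sim and real embeddings.
    cost = torch.cdist(sim_feats, real_feats, p=2) ** 2
    n, m = cost.shape
    # Uniform marginals; the real batch is typically much smaller.
    a = torch.full((n,), 1.0 / n, device=cost.device)
    b = torch.full((m,), 1.0 / m, device=cost.device)
    # Entropic kernel and damped Sinkhorn updates. The KL relaxation
    # (weight tau) softens the marginal constraints, tolerating the
    # sim/real data imbalance instead of forcing exact mass matching.
    K = torch.exp(-cost / eps)
    damp = tau / (tau + eps)
    u, v = torch.ones_like(a), torch.ones_like(b)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** damp
        v = (b / (K.T @ u)) ** damp
    plan = u[:, None] * K * v[None, :]
    # Transport cost under the (approximate) optimal plan; minimizing it
    # pulls action-relevant sim and real features toward each other.
    return (plan * cost).sum()
```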

Figure 3. Overview of sim-and-real policy co-training using OT: large, diverse simulation data is aligned with sparse real-world data to learn a robust, shared policy for deployment and generalization.

Policies trained using this framework successfully generalize to real-world scenarios, even when those scenarios appeared only in the simulated part of the training data. Both sim-to-sim and sim-to-real transfer were evaluated across robot manipulation tasks like lifting, cube stacking, and placing a box in a bin.

Figure 4. Using sim-and-real co-training, the policy learns long-horizon tasks, like sorting objects into a closed drawer, from as few as 25 demonstrations.

Designing tools for manipulation with RobotSmith

Most robot manipulation tasks involve using different tools and objects. Tool use is a necessary capability for robots to interact with their environments and perform complex actions. The problem is that tools designed for humans are difficult for robots to handle due to their varied and complex form factors. Current approaches to robot tool design use predefined templates that aren't customizable, or 3D generation methods that aren't optimized for this purpose.

RobotSmith solves this challenge by providing an automated tool design framework that uses vision-language models (VLMs). VLMs are good at reasoning about 3D space and physical interactions, and at understanding what actions a robot can perform with different objects. These key capabilities make VLMs very useful for effective tool design.

RobotSmith integrates this prior knowledge from VLMs with a joint optimization process in simulation to generate task-specific tools. The three core components are:

  1. Critic Tool Designer: Two VLM agents collaborate to generate candidate tool geometries.
  2. Tool Use Planner: Generates a manipulation trajectory based on the designed tool and scene. Candidate trajectories and grasps are executed and evaluated in simulation.
  3. Joint Optimizer: Tool geometry and trajectory parameters are jointly fine-tuned in simulation to maximize performance. This is important for eliminating suboptimal tool-and-trajectory pairs that would result in failed tasks.

In this way, RobotSmith generates diverse tool designs for tasks like pushing, scooping, or enclosing; the skeleton below sketches how the three stages fit together.
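
The following is a deliberately skeletal sketch of that loop. The helper functions are stubs standing in for VLM calls and physics simulation; none of these names come from the released system.

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    tool_geometry: dict                             # parametric shape from the VLM agents
    trajectory: list = field(default_factory=list)  # grasp + tool-use waypoints
    score: float = 0.0                              # task success metric from simulation

def design_tools(task, scene, n):              # stage 1: Critic Tool Designer
    raise NotImplementedError("two VLM agents propose and critique geometries")

def plan_tool_use(candidate, scene):           # stage 2: Tool Use Planner
    raise NotImplementedError("sample grasps/trajectories, evaluate in sim")

def joint_optimize(candidate, scene, steps):   # stage 3: Joint Optimizer
    raise NotImplementedError("co-tune geometry and trajectory parameters")

def robotsmith(task, scene, n_candidates=4):
    candidates = design_tools(task, scene, n_candidates)
    for c in candidates:
        plan_tool_use(c, scene)                # fills c.trajectory and c.score
        joint_optimize(c, scene, steps=20)     # refine geometry + trajectory
    # Keep the tool-and-trajectory pair that performs best in simulation.
    return max(candidates, key=lambda c: c.score)
```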

Figure 5. RobotSmith iterates through tool designs in simulation, identifies an effective design, and generates a trajectory using the designed tool to achieve the user's task.

RobotSmith was evaluated in simulation and on real-world tasks. Find the complete list of experiments and results in the paper. One real-world test was making a pancake, for which the framework designed and used distinct tools for each step, like flattening, scooping, and spreading dough. This demonstrated the framework's ability to successfully perform long-horizon tasks.

Figure 6. RobotSmith designs and uses tools optimized for each subtask in a long-horizon manipulation scenario, shown side by side in real and sim: flatten dough, scoop sauce, spread sauce, and sprinkle sesame, with the before/after baking result.

Bridging the sim-to-real gap via the NVIDIA Cosmos Cookbook

We talked about the sim-to-real gap earlier in this blog and discussed how synthetic data can be used for training robot policies. Realistic-looking and diverse synthetic datasets result in robust policies that transfer well to the real world. NVIDIA Cosmos open world foundation models (WFMs), specifically Cosmos Transfer, can be used to scale up synthetic datasets by generating photorealistic, diverse data from a single simulation. Find the complete workflow in the Robotics Domain Adaptation Gallery in the cookbook.

In addition to this workflow, the NVIDIA Cosmos Cookbook offers step-by-step recipes and post-training scripts to quickly build, customize, and deploy Cosmos WFMs for robotics, autonomous, and agentic systems. It covers the following examples and concepts in depth:

  • Quick-start inference examples to get up and running.
  • Advanced post-training workflows for domain-specific fine-tuning.
  • Proven recipes for scalable, production-ready deployments.
  • Core concepts covering fundamental topics, techniques, architectural patterns, and tool documentation.

The Cosmos Cookbook is a resource from the physical AI community for sharing practical knowledge about Cosmos WFMs. We welcome contributions including workflows, recipes, best practices, and domain-specific adaptations on GitHub.

Getting started

In this blog, we discussed new workflows for improving robot manipulation skills. We showed how ThinkAct uses a "thinking before acting" framework to reason about and execute robot actions. Next, we talked about how co-training on simulation and real data leads to generalizable manipulation policies. We shared how RobotSmith generates robotic tool designs for the optimized tool use that complex tasks require. Finally, we saw how the Cosmos Cookbook provides examples and a shared place for physical AI projects using Cosmos models.

Check out the following resources to learn more about the work discussed in this blog:

ThinkAct, Generalizable Domain Adaptation, and RobotSmith, among many more papers from NVIDIA research teams, were accepted at NeurIPS 2025.

This post is part of our NVIDIA Robotics Research and Development Digest (R²D²), which gives developers deeper insight into the latest breakthroughs from NVIDIA Research across physical AI and robotics applications.

Stay up-to-date by subscribing to the newsletter and following NVIDIA Robotics on YouTube, Discord, and the developer forums. To begin your robotics journey, enroll in the free NVIDIA Robotics Fundamentals courses.

Acknowledgements

For their contributions to the research mentioned in this post, thanks to Ajay Mandlekar, Bohan Wang, Caelan Garrett, Chi-Pin Huang, Chuang Gan, Chunru Lin, Danfei Xu, Dieter Fox, Fu-En Yang, Haotian Yuan, Liqian Ma, Min-Hung Chen, Minghao Guo, Shuo Cheng, Tsun-Hsuan Wang, Xiaowen Qiu, Yashraj Narang, Yian Wang, Yu-Chiang Frank Wang, Yueh-Hua Wu, and Zhenyang Chen.


