Advancing Robotics Development with Neural Dynamics in Newton

Modern robotics requires more than classical analytic dynamics can provide, because of simplified contacts, omitted kinematic loops, and non-differentiable models. Neural Robot Dynamics (NeRD) tackles these hurdles by:

  1. Using expressive, differentiable models that predict stable states over long horizons.
  2. Capturing complex contact-rich physics.
  3. Generalizing across tasks, environments, and controllers, narrowing the sim-to-real gap. 
  4. Fine-tuning on real data.

Unlike task-specific neural simulators, NeRD serves as a drop-in backend inside physics engines like Newton, enabling teams to reuse existing policy-learning environments by simply switching the physics solver. This hybrid of analytical modules and robot-centric neural modeling paves the way for robots whose dynamics continually improve through both simulation and real-world experience.

In this post, we explore how NeRD overcomes longstanding simulation challenges, providing the foundation for modern robotics in physics engines like Newton.

What is NeRD?

NeRD is a neural simulation framework. NeRD models are learned, embodiment-specific dynamics models that can predict future states of articulated rigid bodies (e.g., robots with multiple joints) in contact with the environment.

Once trained, NeRD models can:

  1. Provide stable and accurate predictions over hundreds to thousands of simulation steps.
  2. Generalize to different tasks, environments, and low-level controllers for a specific robot. 
  3. Be fine-tuned from real-world data to bridge sim-to-real gaps. 

NeRD models can be trained on data from any simulator. Once trained, they can be deployed as drop-in replacements for analytic solvers such as those in modular frameworks like Newton. This lets users reuse existing policy-learning environments and activate NeRD as a new physics backend through a single-line switch.
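To make the "single-line switch" idea concrete, here is a minimal sketch of a solver registry. Every name in it (`Environment`, `SOLVERS`, `analytic_step`, `neural_step`) is hypothetical and is not Newton's API. The "neural" backend is only a stand-in that wraps the analytic step so the example stays runnable.

```python
from dataclasses import dataclass
from typing import Callable, Dict

import numpy as np

# Hypothetical names throughout: this is NOT Newton's API.
State = np.ndarray  # flat vector: [position, velocity]


def analytic_step(state: State, force: float, dt: float) -> State:
    """Semi-implicit Euler step for a unit point mass driven by a force."""
    pos, vel = state
    vel = vel + force * dt
    pos = pos + vel * dt
    return np.array([pos, vel])


def neural_step(state: State, force: float, dt: float) -> State:
    """Stand-in for a learned model; a real one would run network inference."""
    return analytic_step(state, force, dt)


# Both backends share one call signature, which is what makes them swappable.
SOLVERS: Dict[str, Callable[[State, float, float], State]] = {
    "analytic": analytic_step,
    "nerd": neural_step,
}


@dataclass
class Environment:
    solver: str = "analytic"  # the single line a user would change
    dt: float = 0.01

    def step(self, state: State, force: float) -> State:
        return SOLVERS[self.solver](state, force, self.dt)


env = Environment(solver="nerd")
next_state = env.step(np.array([0.0, 0.0]), force=1.0)
```

Because the environment only depends on the shared step signature, the policy-learning code built on top of `Environment.step` does not need to change when the backend does.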

Start using NeRD in Newton. View our research on arXiv or explore our project page.  

Vision for the future of robotic simulation

As robotic technologies advance, we envision a lifecycle where each robot is equipped with a neural dynamics model pretrained from analytical simulations. Such a neural dynamics model can be continually fine-tuned as the robot interacts with the real world, enabling it to account for wear and tear of the robot and for environmental changes.

The neural dynamics model of the robot can be embedded into a hybrid simulation system, where neural dynamics simulate the robot while analytical dynamics are used for other parts of the scene (e.g., obstacles). These continuously improved neural robot dynamics provide a better replica of real-world dynamics, facilitating the training of versatile robotic skills in a digital twin powered by this continually updated simulator.

Flowchart showing how NeRD models are trained, integrated into Newton, fine-tuned with real-world interaction, and used to improve robot skills.
Figure 1. An envisioned lifecycle of a robot in the future

How does neural robot dynamics work?

NeRD is characterized by two key innovations that achieve generalizability and long-horizon prediction accuracy: a hybrid prediction framework and a robot-centric input parameterization.

First, NeRD models replace the time integration (solver) portion of a standard simulator. In frameworks like Newton, where collision detection is decoupled from the solver, we can combine analytic collision detection with our learned model.

This hybrid framework enables NeRD to leverage intermediate simulation quantities (i.e., robot state, contact information, and joint-space torques) to describe the complete simulation state, providing the information needed to evolve the robot dynamics regardless of the application (e.g., tasks, scenes, and controllers). This is in contrast with previous approaches that take only robot state and task-specific actions as inputs, and thus overfit to the tasks used for training.

Second, NeRD uses a robot-centric parameterization of inputs to enable the learned dynamics model to spatially generalize. Specifically, the robot state and contact-related quantities are transformed into the robot’s base frame before they’re passed as input to the NeRD model, as shown in Figure 2(c). 

Such a robot-centric state representation enables NeRD to perform reliable predictions at unseen spatial robot locations encountered during robot motion, enhancing the long-horizon accuracy of the model.
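The effect of the robot-centric parameterization can be illustrated with a small sketch. It assumes a base pose given as a position vector and a rotation matrix, and uses hypothetical helper names (`to_base_frame`, `yaw_rotation`); the actual NeRD input layout may differ. The point it shows is that the same local contact configuration yields identical model inputs wherever the robot stands in the world.

```python
import numpy as np


def yaw_rotation(theta: float) -> np.ndarray:
    """Rotation matrix for a rotation of `theta` radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def to_base_frame(points_world, base_pos, base_rot):
    """Express world-frame points (one per row) in the robot base frame.

    For a column vector p this is R^T (p - t); with points stored as rows
    the same operation is (p - t) @ R.
    """
    return (points_world - base_pos) @ base_rot


# Scene A: base at the origin, one contact point 1 m ahead of the base.
base_pos_a, base_rot_a = np.zeros(3), np.eye(3)
contact_a = np.array([[1.0, 0.0, 0.0]])

# Scene B: the same robot-plus-contact arrangement, moved rigidly elsewhere.
base_pos_b = np.array([5.0, -2.0, 0.0])
base_rot_b = yaw_rotation(np.pi / 2)
contact_b = contact_a @ base_rot_b.T + base_pos_b

local_a = to_base_frame(contact_a, base_pos_a, base_rot_a)
local_b = to_base_frame(contact_b, base_pos_b, base_rot_b)
```

In world coordinates the two contact points differ, but `local_a` and `local_b` agree, so a model trained on one location sees nothing new at the other.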

Diagram showing NeRD’s hybrid prediction framework, where NeRD replaces the physics solver in a simulator and uses inputs like robot state, contact data, and joint torques in the robot base frame.
Figure 2. Overview of the workflow for a classical robotics simulator and the NeRD framework

Training dataset and network architecture

The training datasets for NeRD are generated in a task-agnostic manner using data from a simulator. For each robot instance, we collect 100K random trajectories, each consisting of 100 timesteps. These trajectories are generated using randomized initial states of the robot, random joint-torque sequences within the robot’s motor torque limits, and, optionally, randomized environment configurations (shown in Figure 3). We model NeRD using a causal transformer architecture, specifically a lightweight implementation of the GPT-2 transformer, where the model takes the simulation states from the 10 most recent steps as input.

If you’d like to use NeRD, check out our open source code available on GitHub.

Once a model is trained, we integrate it into a modular physics engine such as Newton. It serves as an interchangeable solver for the simulator, replacing the existing analytical dynamics and contact solvers. Developers can then use this NeRD-integrated simulator the same way as before and reuse existing policy-learning environments.
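The data recipe described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two-joint toy integrator, the torque limits, the timestep, and the helper names (`simulate`, `make_windows`) are all made up for the example, and the trajectory count is shrunk from 100K to 4. What it does take from the post is the 100-step horizon, uniform random torques inside the limits, and the 10-step input history.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_TRAJ = 4       # the post collects 100K; kept tiny so the sketch runs instantly
HORIZON = 100      # timesteps per trajectory, as in the post
HISTORY = 10       # most recent steps given to the model, as in the post
TORQUE_LIMIT = np.array([5.0, 2.0])  # assumed per-joint limits
DT = 0.01          # assumed timestep


def simulate(initial_state, torque_seq):
    """Toy stand-in simulator: each joint is an independent unit-inertia integrator."""
    states = [initial_state]
    for tau in torque_seq:
        q, qd = np.split(states[-1], 2)
        qd = qd + tau * DT
        q = q + qd * DT
        states.append(np.concatenate([q, qd]))
    return np.stack(states)  # shape (HORIZON + 1, 4)


def make_windows(states, torque_seq, history):
    """Pair each window of `history` (state, torque) steps with the next state."""
    inputs, targets = [], []
    for t in range(history - 1, len(torque_seq)):
        lo = t - history + 1
        window = np.concatenate([states[lo:t + 1], torque_seq[lo:t + 1]], axis=1)
        inputs.append(window)
        targets.append(states[t + 1])
    return np.stack(inputs), np.stack(targets)


# Random torque sequences inside the limits and random initial states.
torques = rng.uniform(-TORQUE_LIMIT, TORQUE_LIMIT, size=(NUM_TRAJ, HORIZON, 2))
initial_states = rng.uniform(-1.0, 1.0, size=(NUM_TRAJ, 4))

pairs = [make_windows(simulate(s0, tau), tau, HISTORY)
         for s0, tau in zip(initial_states, torques)]
inputs = np.concatenate([p[0] for p in pairs])    # (NUM_TRAJ * 91, HISTORY, 6)
targets = np.concatenate([p[1] for p in pairs])   # (NUM_TRAJ * 91, 4)
```

Each 100-step trajectory yields 91 training windows here, because the first window needs 10 steps of history before it can predict a next state.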

Figure 3. Dataset used for training a NeRD model for an ANYmal quadruped robot 

Alt text: We use randomly generated trajectories to train a NeRD model.

What are the advantages of training robots with NeRD?

Training robots with NeRD enables highly stable, accurate, and generalizable simulation, accelerating policy learning and bridging the sim-to-real gap for reliable real-world deployment.

Stability and accuracy

The trained NeRD model can accurately predict the dynamics of a chaotic system, such as a double pendulum, over 100 time steps. A single NeRD model is also capable of simulating different contact configurations (e.g., different heights and orientations of the ground plane). Figure 4 shows a side-by-side comparison of the NeRD-integrated simulator and a ground-truth analytical simulator with a Featherstone solver.

A single trained NeRD model can predict 100-step dynamics of a chaotic double pendulum system with various configurations of the ground plane, with high accuracy.
Figure 4. Comparison of the analytical simulator and NeRD on a double pendulum with various configurations of the ground plane
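Long-horizon stability matters because a learned solver is rolled out autoregressively: each prediction becomes part of the input for the next step, so errors can compound. The sketch below shows that loop with a 10-step history buffer. The `stand_in_model` is a hypothetical placeholder (a single-pendulum update, not a trained network), and padding the buffer by repeating the oldest entry is an assumption made for the example.

```python
from collections import deque

import numpy as np

HISTORY = 10  # number of most recent steps the model sees
DT = 0.01     # assumed timestep


def stand_in_model(history: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for a trained model.

    `history` has shape (HISTORY, 3) with rows [theta, omega, torque]; this
    stand-in only uses the latest row and applies a pendulum update.
    """
    theta, omega, torque = history[-1]
    omega = omega + (-9.81 * np.sin(theta) + torque) * DT
    theta = theta + omega * DT
    return np.array([theta, omega])


def rollout(model, initial_state, torques, history_len=HISTORY):
    """Feed the model its own predictions, keeping a sliding history window."""
    buffer = deque(maxlen=history_len)
    state = np.asarray(initial_state, dtype=float)
    trajectory = [state]
    for tau in torques:
        buffer.append(np.append(state, tau))
        # Until enough steps exist, pad by repeating the oldest entry.
        padded = [buffer[0]] * (history_len - len(buffer)) + list(buffer)
        state = model(np.stack(padded))
        trajectory.append(state)
    return np.stack(trajectory)


trajectory = rollout(stand_in_model, [0.1, 0.0], np.zeros(1000))
```

Comparing such a rollout against the reference simulator step by step is one way to produce the kind of side-by-side view shown in Figure 4.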

Learning robotic policies exclusively in a NeRD-integrated simulator

NeRD’s efficiency and generalizability across tasks, controllers, and space enable large-scale robotic policy learning for diverse downstream tasks. We pre-train a NeRD model for an ANYmal robot and then train a forward-walking policy and a sideways-walking policy using the PPO reinforcement-learning algorithm inside the NeRD-integrated simulator, without access to the ground-truth analytical simulator.

The learned policies can then be transferred zero-shot to the ground-truth analytical simulator with minimal performance loss (<0.1% error in accumulated reward for 1000-step trajectories). Figures 5 and 6 show a side-by-side visualization of NeRD-trained policies executed in both the NeRD-integrated simulator and the ground-truth analytical simulator.
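One plausible way to read the accumulated-reward error is as a relative difference between the episode returns obtained in the two simulators. The sketch below assumes that definition; the function name and the reward values are invented purely for illustration and are not results from the paper.

```python
import numpy as np


def relative_return_error(rewards_reference, rewards_test) -> float:
    """Relative difference between two accumulated (summed) rewards."""
    ref = float(np.sum(rewards_reference))
    test = float(np.sum(rewards_test))
    return abs(test - ref) / abs(ref)


# Made-up per-step rewards for two 1000-step episodes (illustrative only).
rewards_analytic = np.full(1000, 1.0)
rewards_nerd = np.full(1000, 0.9995)

error = relative_return_error(rewards_analytic, rewards_nerd)
print(f"relative error in accumulated reward: {error:.4%}")
```

With these invented numbers the error is about 0.05%, which would sit under a 0.1% threshold.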

ANYmal robot walking with policies trained in a NeRD-integrated simulator using PPO, showing behavior that matches the analytical simulator.
Figure 5. Comparison of an analytical simulator and a NeRD model for an ANYmal robot with an RL policy for forward walking at 1 m/s
ANYmal robot performing sideways walking with PPO-trained policies in a NeRD-integrated simulator, showing behavior consistent with the analytical simulator.
Figure 6. Comparison of an analytical simulator and a NeRD model for an ANYmal robot with an RL policy for sideways walking at 1 m/s

Zero-shot sim-to-real transfer

The accuracy of the NeRD model was also validated on a 7-DoF Franka robot arm, where we performed zero-shot sim-to-real transfer for a go-to-pose (reach) policy trained exclusively within the NeRD-integrated simulator (Figure 7).

Franka arm trained in a NeRD-integrated simulator on a go-to-pose policy, then deployed in the real world with high accuracy.
Figure 7. Zero-shot sim-to-real transfer of a go-to-pose policy trained exclusively in a NeRD-integrated simulator

Fine-tuning NeRD models from real-world data

The inherent differentiability of NeRD models enables them to be fine-tuned rapidly from real-world data. We fine-tune a pre-trained NeRD model for a cube-tossing task using a real-world cube-tossing dataset. The fine-tuned NeRD model significantly improves dynamics accuracy compared with the analytical simulator (shown in Figure 8).

Cube tossing experiment where a NeRD model, fine-tuned with real-world data, improves simulator accuracy to better match real-world dynamics.
Figure 8. Fine-tuning a NeRD model on real-world data better matches real-world cube-tossing dynamics. The light-green cubic frames illustrate the real-world cube trajectory
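The reason differentiability helps can be shown with a deliberately tiny stand-in. Instead of a transformer, the sketch below uses a 2x2 linear dynamics model, "pretrained" to match a frictionless simulator, and fine-tunes it by plain gradient descent on data from a damped "real" system. All matrices, the learning rate, and the step count are made up for the example; only the idea of gradient-based fine-tuning on real transitions comes from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear next-state models x_next = A @ x with x = [position, velocity].
A_sim = np.array([[1.0, 0.01], [0.0, 1.00]])   # "pretrained": no damping
A_real = np.array([[1.0, 0.01], [0.0, 0.98]])  # "real" system: damped (made up)

# Synthetic "real-world" transitions (state, next state).
X = rng.normal(size=(256, 2))
Y = X @ A_real.T


def mse(A: np.ndarray) -> float:
    """Mean squared one-step prediction error over all entries."""
    return float(np.mean((X @ A.T - Y) ** 2))


A = A_sim.copy()
loss_before = mse(A)

learning_rate = 0.1
for _ in range(200):
    residual = X @ A.T - Y             # shape (N, 2)
    grad = residual.T @ X / len(X)     # exact gradient of mse(A) with respect to A
    A -= learning_rate * grad

loss_after = mse(A)
```

Because the prediction error is differentiable with respect to the model parameters, a few hundred cheap gradient steps are enough here to pull the pretrained model onto the damped dynamics; a non-differentiable analytic solver offers no such direct update.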

Summary

Neural Robot Dynamics (NeRD) is a neural-network-based robotic simulation framework designed to accurately predict the dynamics of complex, articulated robots over long periods. Unlike traditional robotic simulators that use simplified models and struggle with modern robot complexities, NeRD learns robot-specific dynamics directly from data, enabling stable, generalizable, and precise simulations. 

A single trained NeRD model generalizes to diverse tasks, environments, and controllers for a given robot and can be fine-tuned with real-world data to reduce the simulation-to-reality gap, making it a highly adaptable and advanced solution for robotic simulation.

Future directions

Developing effective neural simulators for modeling complex real-world robot dynamics is an active area of research. To achieve generalizable and fine-tunable neural dynamics models for robotics, this research can be extended in several exciting directions:

Robots with more complex structures and higher degrees of freedom

Learning a neural simulator for more complicated robots (e.g., humanoid robots) can significantly improve simulation efficiency and speed up downstream applications (e.g., learning whole-body controllers for humanoids).

Fine-tuning from partially observable real-world data

Real-world robot data is often only partially observable because of sensor limitations. For instance, contact points may not be precisely known. Investigating methods to fine-tune pre-trained NeRD models from partially observable real-world data can improve the accuracy of predicting real-world dynamics, thereby better bridging sim-to-real gaps.

Simulating robotic manipulation

Our development of the NeRD framework has so far focused primarily on locomotion tasks. Supporting the simulation of manipulation tasks is a natural extension of this work that could further broaden its applicability.

Start using NeRD

The NeRD models are trained using the simulation module available in Newton. View the GitHub README.md for instructions on how to use NeRD.

  • Start by downloading Newton, an open-source, extensible physics engine for writing GPU-accelerated, kernel-based programs for simulation, AI, robotics, and machine learning.
  • Download NeRD’s open source code and look at the README for instructions.
  • Learn more about the details of Neural Robot Dynamics from the NeRD paper on arXiv.

Stay tuned for the release of the training and inference code for NeRD, enabling you to simulate a dynamic robot using a neural physics solver.

Learn more about the research being showcased at CoRL and Humanoids, happening September 27-October 2 in Seoul, Korea.
