
A simpler method for learning to control a robot

Researchers from MIT and Stanford University have devised a new machine-learning approach that could be used to control a robot, such as a drone or autonomous vehicle, more effectively and efficiently in dynamic environments where conditions can change rapidly.

This technique could help an autonomous vehicle learn to compensate for slippery road conditions to avoid going into a skid, allow a robotic free-flyer to tow different objects in space, or enable a drone to closely follow a downhill skier despite being buffeted by strong winds.

The researchers’ approach incorporates a certain structure from control theory into the process for learning a model, in a way that leads to an effective method for controlling complex dynamics, such as those caused by the impact of wind on the trajectory of a flying vehicle. One way to think about this structure is as a hint that can help guide how to control a system.
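In symbols, one common control-oriented form (used here purely as an illustration; the paper's exact parametrization may differ) factors the dynamics so that the control input appears explicitly:

```latex
% State x (e.g., position and velocity), control input u (e.g., rotor forces).
% A state-dependent coefficient (SDC) factorization of the dynamics
% \dot{x} = f(x, u) pulls out matrices A(x) and B(x):
\dot{x} = A(x)\,x + B(x)\,u
% The factorization hints at a feedback controller of the form u = -K(x)\,x,
% where the gain K(x) can be computed pointwise from A(x) and B(x).
```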

“The focus of our work is to learn intrinsic structure in the dynamics of the system that can be leveraged to design simpler, stabilizing controllers,” says Navid Azizan, the Esther and Harold E. Edgerton Assistant Professor in the MIT Department of Mechanical Engineering and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS). “By jointly learning the system’s dynamics and these unique control-oriented structures from data, we’re able to naturally create controllers that function much more effectively in the real world.”

Using this structure in a learned model, the researchers’ technique immediately extracts an effective controller from the model, as opposed to other machine-learning methods that require a controller to be derived or learned separately with additional steps. With this structure, their approach is also able to learn an effective controller using less data than other approaches. This could help their learning-based control system achieve better performance faster in rapidly changing environments.

“This work tries to strike a balance between identifying structure in your system and just learning a model from data,” says lead author Spencer M. Richards, a graduate student at Stanford University. “Our approach is inspired by how roboticists use physics to derive simpler models for robots. Physical analysis of these models often yields a useful structure for the purposes of control — one that you might miss if you just tried to naively fit a model to data. Instead, we try to identify similarly useful structure from data that indicates how to implement your control logic.”

Additional authors of the paper are Jean-Jacques Slotine, professor of mechanical engineering and of brain and cognitive sciences at MIT, and Marco Pavone, associate professor of aeronautics and astronautics at Stanford. The research will be presented at the International Conference on Machine Learning (ICML).

Learning a controller

Determining the best way to control a robot to perform a given task can be a difficult problem, even when researchers know how to model everything about the system.

A controller is the logic that enables a drone to follow a desired trajectory, for instance. This controller would tell the drone how to adjust its rotor forces to compensate for the effect of winds that can knock it off a stable path toward its goal.
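As a toy illustration of such control logic, here is a generic proportional-derivative (PD) tracking controller. This is a textbook construction, not the paper's method, and the gains and numbers are made up:

```python
import numpy as np

# Illustrative PD gains (hypothetical values, not from the paper).
KP, KD = 4.0, 2.5

def pd_control(pos, vel, pos_des, vel_des):
    """Return a corrective acceleration command from tracking errors."""
    pos_err = pos_des - pos  # where the drone should be vs. where it is
    vel_err = vel_des - vel  # desired velocity vs. actual velocity
    return KP * pos_err + KD * vel_err

# A wind gust has pushed the drone off its hover point; the controller
# outputs a correction that would then be mapped to rotor forces.
cmd = pd_control(pos=np.array([0.2, 0.0, 9.5]),
                 vel=np.array([0.1, 0.0, -0.3]),
                 pos_des=np.array([0.0, 0.0, 10.0]),
                 vel_des=np.zeros(3))
print(cmd)
```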

This drone is a dynamical system — a physical system that evolves over time. In this case, its position and velocity change as it flies through the environment. If such a system is simple enough, engineers can derive a controller by hand.

Modeling a system by hand intrinsically captures a certain structure based on the physics of the system. For instance, if a robot were modeled manually using differential equations, these would capture the relationship between velocity, acceleration, and force. Acceleration is the rate of change of velocity over time, which is determined by the mass of and forces applied to the robot.
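In equation form, that hand-derived structure is just Newton's second law written as a differential equation (a textbook example, not drawn from the paper):

```latex
% Position p, velocity v = \dot{p}, mass m, applied force F:
m\,\ddot{p} = F
% Equivalently, as a first-order dynamical system in the state x = (p, v):
\dot{x} = \frac{d}{dt}\begin{pmatrix} p \\ v \end{pmatrix}
        = \begin{pmatrix} v \\ F/m \end{pmatrix}
```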

But often the system is too complex to be exactly modeled by hand. Aerodynamic effects, like the way swirling wind pushes a flying vehicle, are notoriously difficult to derive manually, Richards explains. Researchers would instead take measurements of the drone’s position, velocity, and rotor speeds over time, and use machine learning to fit a model of this dynamical system to the data. But these approaches typically don’t learn a control-based structure. This structure is useful in determining how to best set the rotor speeds to direct the motion of the drone over time.
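A minimal sketch of that data-driven route, assuming logged flight data and a plain least-squares fit (the data here is synthetic, and the unstructured linear model is only a stand-in for the richer models used in practice):

```python
import numpy as np

# Synthetic stand-in for logged flight data: states x_t (position, velocity,
# etc.) and rotor commands u_t, sampled every dt seconds.
rng = np.random.default_rng(0)
T, n, m = 200, 6, 4
X = rng.normal(size=(T, n))   # state measurements over time
U = rng.normal(size=(T, m))   # rotor commands over time
dt = 0.02

# Finite-difference estimate of the state derivative.
Xdot = (X[1:] - X[:-1]) / dt

# Fit xdot ~ [A B] @ [x; u] by least squares. This captures the dynamics,
# but by itself carries no control-oriented structure saying how to pick u.
Z = np.hstack([X[:-1], U[:-1]])
theta, *_ = np.linalg.lstsq(Z, Xdot, rcond=None)
A_hat, B_hat = theta[:n].T, theta[n:].T
```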

Once they’ve modeled the dynamical system, many existing approaches also use data to learn a separate controller for the system.

“Other approaches that try to learn dynamics and a controller from data as separate entities are a bit detached philosophically from the way we normally do it for simpler systems. Our approach is more reminiscent of deriving models by hand from physics and linking that to control,” Richards says.

Identifying structure

The team from MIT and Stanford developed a method that uses machine learning to learn the dynamics model, but in such a way that the model has some prescribed structure that is helpful for controlling the system.

With this structure, they can extract a controller directly from the dynamics model, rather than using data to learn an entirely separate model for the controller.

“We found that beyond learning the dynamics, it’s also essential to learn the control-oriented structure that supports effective controller design. Our approach of learning state-dependent coefficient factorizations of the dynamics has outperformed the baselines in terms of data efficiency and tracking capability, proving to be successful in efficiently and effectively controlling the system’s trajectory,” Azizan says.
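A minimal sketch of how a state-dependent coefficient factorization yields a controller directly: once the dynamics are written as xdot = A(x)x + B(x)u, a stabilizing gain can be computed pointwise, for example via a Riccati solve at each state. This is one standard way to exploit SDC structure, not necessarily the paper's exact construction, and the A(x) and B(x) below are hand-written stand-ins for learned models:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical SDC factorization of 2D dynamics (position, velocity):
#   xdot = A(x) @ x + B(x) @ u
def A(x):
    return np.array([[0.0, 1.0],
                     [-np.sin(x[0]), -0.5]])  # state-dependent coefficients

def B(x):
    return np.array([[0.0],
                     [1.0]])

Q, R = np.eye(2), np.eye(1)  # state and control costs

def sdc_controller(x):
    """Extract a stabilizing feedback u = -K(x) x pointwise from A(x), B(x)."""
    P = solve_continuous_are(A(x), B(x), Q, R)  # Riccati solve at this state
    K = np.linalg.solve(R, B(x).T @ P)
    return -K @ x

print(sdc_controller(np.array([0.3, -0.1])))
```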

When they tested this approach, their controller closely followed desired trajectories, outperforming all of the baseline methods. The controller extracted from their learned model nearly matched the performance of a ground-truth controller, which is built using the exact dynamics of the system.

“By making simpler assumptions, we got something that actually worked better than other complicated baseline approaches,” Richards adds.

The researchers also found that their method was data-efficient, meaning it achieved high performance even with little data. For instance, it could effectively model a highly dynamic rotor-driven vehicle using only 100 data points. Methods that used multiple learned components saw their performance drop much faster with smaller datasets.

This efficiency could make their technique especially useful in situations where a drone or robot must learn quickly in rapidly changing conditions.

Plus, their approach is general and could be applied to many types of dynamical systems, from robotic arms to free-flying spacecraft operating in low-gravity environments.

In the future, the researchers are interested in developing models that are more physically interpretable, and that would be able to identify very specific information about a dynamical system, Richards says. This could lead to better-performing controllers.

“Despite its ubiquity and importance, nonlinear feedback control remains an art, making it especially suitable for data-driven and learning-based methods. This paper makes a significant contribution to this area by proposing a method that jointly learns system dynamics, a controller, and control-oriented structure,” says Nikolai Matni, an assistant professor in the Department of Electrical and Systems Engineering at the University of Pennsylvania, who was not involved with this work. “What I found particularly exciting and compelling was the integration of these components into a joint learning algorithm, such that control-oriented structure acts as an inductive bias in the learning process. The result is a data-efficient learning process that outputs dynamic models that enjoy intrinsic structure that enables effective, stable, and robust control. While the technical contributions of the paper are excellent in themselves, it is this conceptual contribution that I view as most exciting and significant.”

This research is supported, in part, by the NASA University Leadership Initiative and the Natural Sciences and Engineering Research Council of Canada.
