Researchers create a tool for accurately simulating complex systems

Researchers often use simulations when designing new algorithms, since testing ideas in the real world can be both costly and risky. But because it is impossible to capture every detail of a complex system in a simulation, they typically collect a small amount of real data that they replay while simulating the components they want to study.

Known as trace-driven simulation (the small pieces of real data are called traces), this method can lead to biased results. This means researchers might unknowingly choose an algorithm that is not the best one they evaluated, and which will perform worse on real data than the simulation predicted.

MIT researchers have developed a new method that eliminates this source of bias in trace-driven simulation. By enabling unbiased trace-driven simulations, the technique could help researchers design better algorithms for a variety of applications, including improving video quality on the internet and increasing the performance of data processing systems.

The researchers’ machine-learning algorithm draws on the principles of causality to learn how the data traces were affected by the behavior of the system. In this way, they can replay the correct, unbiased version of the trace during the simulation.

Compared to a previously developed trace-driven simulator, the researchers’ simulation method accurately predicted which newly designed algorithm would be best for video streaming, meaning the one that led to less rebuffering and higher visual quality. Existing simulators that do not account for bias would have pointed researchers to a worse-performing algorithm.

“Data are not the only thing that matter. The story behind how the data are generated and collected is also important. If you want to answer a counterfactual question, you need to know the underlying data generation story so you only intervene on those things that you actually want to simulate,” says Arash Nasr-Esfahany, an electrical engineering and computer science (EECS) graduate student and co-lead author of a paper on this new technique.

He is joined on the paper by co-lead authors and fellow EECS graduate students Abdullah Alomar and Pouya Hamadanian; recent graduate Anish Agarwal PhD ’21; and senior authors Mohammad Alizadeh, an associate professor of electrical engineering and computer science, and Devavrat Shah, the Andrew and Erna Viterbi Professor in EECS and a member of the Institute for Data, Systems, and Society and of the Laboratory for Information and Decision Systems. The research was recently presented at the USENIX Symposium on Networked Systems Design and Implementation.

Specious simulations

The MIT researchers studied trace-driven simulation in the context of video streaming applications.

In video streaming, an adaptive bitrate algorithm continually decides the video quality, or bitrate, to transfer to a device based on real-time data on the user’s bandwidth. To test how different adaptive bitrate algorithms affect network performance, researchers can collect real data from users during a video stream for a trace-driven simulation.
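To make this concrete, here is a minimal sketch of what a throughput-based adaptive bitrate rule might look like. The bitrate ladder, safety factor, and function names are invented for illustration; this is not one of the algorithms evaluated in the paper.

```python
# Minimal sketch of a hypothetical throughput-based adaptive bitrate (ABR)
# rule. Names and constants are illustrative, not from the paper.

BITRATES_KBPS = [300, 750, 1200, 2850, 4300]  # available encodings

def choose_bitrate(recent_throughputs_kbps, safety_factor=0.8):
    """Pick the highest bitrate below a conservative throughput estimate."""
    estimate = safety_factor * (sum(recent_throughputs_kbps) / len(recent_throughputs_kbps))
    feasible = [b for b in BITRATES_KBPS if b <= estimate]
    return feasible[-1] if feasible else BITRATES_KBPS[0]

# Example: recent chunks downloaded at roughly 1,500-1,600 kbps.
print(choose_bitrate([1500, 1550, 1600]))  # -> 1200
```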

They use these traces to simulate what would have happened to network performance had the platform used a different adaptive bitrate algorithm under the same underlying conditions.

Researchers have traditionally assumed that trace data are exogenous, meaning they are not affected by factors that are changed during the simulation. They would assume that, during the period when they collected the network performance data, the choices the bitrate adaptation algorithm made did not affect those data.
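A naive trace-driven simulator bakes this assumption in directly, as in the sketch below. The chunk length, stall accounting, and function names are hypothetical simplifications for illustration.

```python
# Sketch of a naive trace-driven simulator that treats the recorded
# throughput trace as exogenous (a hypothetical simplification).

def replay_trace(throughput_trace_kbps, abr_policy, start_kbps=750):
    """Replay a recorded throughput trace against a new ABR policy.

    The exogeneity assumption is baked in here: each recorded throughput
    sample is reused as-is, even though in reality the bitrate chosen
    during collection influenced the throughput that was observed.
    """
    history, stall_s = [start_kbps], 0.0
    for observed_kbps in throughput_trace_kbps:
        bitrate = abr_policy(history)               # e.g., choose_bitrate above
        download_s = 4.0 * bitrate / observed_kbps  # 4-second video chunks
        stall_s += max(0.0, download_s - 4.0)       # time spent rebuffering
        history.append(observed_kbps)
    return stall_s
```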

But this is often a false assumption that leads to biased conclusions about the behavior of new algorithms, making the simulation invalid, Alizadeh explains.

“We recognized, and others have recognized, that this way of doing simulation can induce errors. But I don’t think people necessarily knew how significant those errors could be,” he says.

To develop a solution, Alizadeh and his collaborators framed the issue as a causal inference problem. To collect an unbiased trace, one must understand the different causes that affect the observed data. Some causes are intrinsic to a system, while others are affected by the actions being taken.

In the video streaming example, network performance is affected by the choices the bitrate adaptation algorithm made, but it is also affected by intrinsic elements, like network capacity.

“Our task is to disentangle these two effects, to try to understand what aspects of the behavior we are seeing are intrinsic to the system, and how much of what we are observing is based on the actions that were taken. If we can disentangle these two effects, then we can do unbiased simulations,” he says.
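One way to picture this disentanglement is a toy causal model in which each observation is produced by an unobserved intrinsic factor and the action taken. The mechanism below, including the congestion coefficient and all names, is invented for illustration and is not the paper’s actual model.

```python
# Toy causal model (hypothetical): observed throughput is a function of an
# unobserved intrinsic network state and the action (bitrate) taken.

def f(latent_capacity_kbps, action_bitrate_kbps):
    # Invented mechanism: requesting a higher bitrate congests the link.
    congestion = 0.1 * action_bitrate_kbps
    return max(0.0, latent_capacity_kbps - congestion)

def counterfactual(observed_kbps, logged_action_kbps, new_action_kbps):
    """Recover the latent state by inverting f, then replay a new action."""
    latent = observed_kbps + 0.1 * logged_action_kbps  # disentangle
    return f(latent, new_action_kbps)

# Replaying the raw trace would reuse 2000 kbps unchanged; the
# counterfactual replay corrects for the effect of the logged action.
print(counterfactual(observed_kbps=2000, logged_action_kbps=1200,
                     new_action_kbps=4300))  # -> 1690.0
```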

Learning from data

But researchers often cannot directly observe intrinsic properties. This is where the new tool, called CausalSim, comes in. The algorithm can learn the underlying characteristics of a system using only the trace data.

CausalSim takes trace data that were collected through a randomized control trial, and estimates the underlying functions that produced those data. The model tells the researchers, under the exact same underlying conditions that a user experienced, how a new algorithm would change the outcome.
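The general flavor of this estimation can be sketched as a matrix completion problem: each randomized session reveals the outcomes of only a few actions, and a low-rank latent factor model fills in the counterfactual outcomes. The code below is a toy illustration of that idea on synthetic data, not the actual CausalSim algorithm; every name and constant is invented.

```python
# Toy latent-factor sketch (hypothetical; not the actual CausalSim model).
import numpy as np

rng = np.random.default_rng(0)

# Rows: user sessions, each with an unobserved intrinsic quality.
# Columns: candidate actions (bitrates), each with its own effect.
# A randomized trial tries a few random actions per session, revealing a
# few entries of the (session x action) outcome matrix; the remaining
# entries are the counterfactuals we would like to estimate.
n_sessions, n_actions, per_session = 200, 5, 3
truth = rng.uniform(500, 5000, (n_sessions, 1)) * rng.uniform(0.5, 1.0, (1, n_actions))
mask = np.zeros((n_sessions, n_actions), dtype=bool)
for i in range(n_sessions):
    mask[i, rng.choice(n_actions, size=per_session, replace=False)] = True

# Alternating least squares fits outcome[i, j] ~ u[i] * v[j]: u captures
# the latent intrinsic state, v the per-action effect.
u = np.ones(n_sessions)
v = np.ones(n_actions)
for _ in range(100):
    for j in range(n_actions):
        rows = mask[:, j]
        v[j] = u[rows] @ truth[rows, j] / (u[rows] @ u[rows])
    for i in range(n_sessions):
        cols = mask[i]
        u[i] = v[cols] @ truth[i, cols] / (v[cols] @ v[cols])

estimated = np.outer(u, v)  # estimated outcome for every (session, action)
print(np.abs(estimated - truth).max())  # small: truth is exactly rank-1
```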

With a typical trace-driven simulator, bias might lead a researcher to select a worse-performing algorithm, even though the simulation indicates it should be better. CausalSim helps researchers select the best algorithm that was tested.

The MIT researchers observed this in practice. When they used CausalSim to design an improved bitrate adaptation algorithm, it led them to select a new variant that had a stall rate nearly 1.4 times lower than a well-accepted competing algorithm, while achieving the same video quality. The stall rate is the amount of time a user spends rebuffering the video.

In contrast, an expert-designed trace-driven simulator predicted the opposite. It indicated that this new variant should cause a stall rate nearly 1.3 times higher. The researchers tested the algorithm on real-world video streaming and confirmed that CausalSim was correct.

“The gains we were getting in the new variant were very close to CausalSim’s prediction, while the expert simulator was way off. This is really exciting because this expert-designed simulator has been used in research for the past decade. If CausalSim can so clearly be better than this, who knows what we can do with it?” says Hamadanian.

Over a 10-month experiment, CausalSim consistently improved simulation accuracy, resulting in algorithms that made about half as many errors as those designed using baseline methods.

In the future, the researchers want to apply CausalSim to situations where randomized control trial data are not available or where it is especially difficult to recover the causal dynamics of the system. They also want to explore how to design and monitor systems to make them more amenable to causal analysis.
