MIT researchers have developed a new theoretical framework for studying the mechanisms of treatment interactions. Their approach allows scientists to efficiently estimate how combinations of treatments will affect a group of units, such as cells, enabling a researcher to perform fewer costly experiments while gathering more accurate data.
For example, to study how interconnected genes affect cancer cell growth, a biologist might need to use a combination of treatments to target multiple genes at once. But because there could be billions of potential combinations for each round of the experiment, choosing a subset of combinations to test can bias the data their experiment generates.
In contrast, the new framework considers the scenario in which a user can efficiently design an unbiased experiment by assigning all treatments in parallel, and can control the outcome by adjusting the rate of each treatment.
The MIT researchers theoretically derived a near-optimal strategy in this framework and performed a series of simulations to test it in multiround experiments. Their method minimized the error rate in each instance.
This technique could someday help scientists better understand disease mechanisms and develop new medicines to treat cancer or genetic disorders.
“We’ve introduced a concept people can think more about as they study the optimal way to select combinatorial treatments at each round of an experiment. Our hope is that this can someday be used to solve biologically relevant questions,” says graduate student Jiaqi Zhang, an Eric and Wendy Schmidt Center Fellow and co-lead author of a paper on this experimental design framework.
She is joined on the paper by co-lead author Divya Shyamal, an MIT undergraduate; and senior author Caroline Uhler, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS). The research was recently presented at the International Conference on Machine Learning.
Simultaneous treatments
Treatments can interact with each other in complex ways. For instance, a scientist trying to determine whether a certain gene contributes to a particular disease symptom may have to target several genes simultaneously to study the effects.
To do this, scientists use what are known as combinatorial perturbations, where they apply multiple treatments at once to the same group of cells.
“Combinatorial perturbations give you a high-level network of how different genes interact, which provides an understanding of how a cell functions,” Zhang explains.
Since genetic experiments are costly and time-consuming, the scientist aims to select the best subset of treatment combinations to test, which is a steep challenge due to the enormous number of possibilities.
Picking a suboptimal subset can generate biased results by focusing only on combinations the user selected in advance.
The MIT researchers approached this problem differently by looking at a probabilistic framework. Instead of focusing on a selected subset, each unit randomly takes up combinations of treatments based on user-specified dosage levels for each treatment.
The user sets dosage levels based on the goal of their experiment — perhaps this scientist wants to study the effects of four different drugs on cell growth. The probabilistic approach generates less biased data because it does not restrict the experiment to a predetermined subset of treatments.
The dosage levels are like probabilities, and each cell receives a random combination of treatments. If the user sets a high dosage, it is more likely that most of the cells will take up that treatment. A smaller subset of cells will take up that treatment if the dosage is low.
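As a rough illustration of this idea (a minimal sketch, not the researchers' code; the treatment names and dosage values are made up for the example), one can simulate each cell independently taking up each treatment with probability equal to its dosage level:

```python
import numpy as np

rng = np.random.default_rng(0)

n_cells = 1000
# Hypothetical dosage levels for four treatments, interpreted as uptake probabilities.
dosages = np.array([0.8, 0.5, 0.1, 0.3])

# Each row is a cell; each column records whether that cell took up that treatment.
uptake = rng.random((n_cells, len(dosages))) < dosages

# Empirical uptake rates roughly match the dosages: a high dosage (0.8) means most
# cells take up that treatment, a low dosage (0.1) means only a small subset does.
print(uptake.mean(axis=0))
```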
“From there, the question is how do we design the dosages so that we can estimate the outcomes as accurately as possible? This is where our theory comes in,” Shyamal adds.
Their theoretical framework shows the best way to design these dosages so the researcher can learn the most about the characteristic or trait they are studying.
After each round of the experiment, the user collects the results and feeds them back into the framework. It will output the ideal dosage strategy for the next round, and so on, actively adapting the strategy over multiple rounds.
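The adaptive loop can be pictured schematically as below. This is a simplified placeholder under assumed details, not the authors' actual algorithm: the outcome model, the least-squares estimator, and the toy dosage-update rule are all stand-ins chosen only to show the round-by-round structure.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_round(dosages, n_cells=1000):
    """Simulate one round: random treatment uptake plus a noisy synthetic outcome."""
    uptake = rng.random((n_cells, len(dosages))) < dosages
    true_effects = np.array([1.0, -0.5, 0.2, 0.8])  # unknown to the experimenter in practice
    outcomes = uptake @ true_effects + rng.normal(0, 0.1, n_cells)
    return uptake, outcomes

def estimate_effects(uptake, outcomes):
    """Estimate per-treatment effects from this round's data via least squares."""
    coef, *_ = np.linalg.lstsq(uptake.astype(float), outcomes, rcond=None)
    return coef

def update_dosages(effects):
    """Toy update rule: give larger dosages to treatments with larger estimated effects."""
    weights = np.abs(effects) + 1e-3
    return np.clip(weights / weights.max(), 0.05, 0.95)

dosages = np.full(4, 0.5)      # start with uniform dosage levels
for round_idx in range(3):     # a few rounds of the experiment
    uptake, outcomes = run_round(dosages)
    effects = estimate_effects(uptake, outcomes)
    dosages = update_dosages(effects)
    print(round_idx, np.round(dosages, 2))
```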
Optimizing dosages, minimizing error
The researchers proved their theoretical approach generates optimal dosages, even when the dosage levels are affected by a limited supply of treatments or when noise in the experimental outcomes varies at each round.
In simulations, this new approach had the lowest error rate when comparing estimated and actual outcomes of multiround experiments, outperforming two baseline methods.
In the future, the researchers want to enhance their experimental framework to consider interference between units and the fact that certain treatments can lead to selection bias. They would also like to apply this technique in a real experimental setting.
“This is a new approach to a very interesting problem that is hard to solve. Now, with this new framework in hand, we can think more about the best way to design experiments for many different applications,” Zhang says.
This research is funded, in part, by the Advanced Undergraduate Research Opportunities Program at MIT, Apple, the National Institutes of Health, the Office of Naval Research, the Department of Energy, the Eric and Wendy Schmidt Center at the Broad Institute, and a Simons Investigator Award.