
Efficient technique improves machine-learning models’ reliability

Powerful machine-learning models are being used to help people tackle tough problems such as identifying disease in medical images or detecting road obstacles for autonomous vehicles. But machine-learning models can make mistakes, so in high-stakes settings it’s critical that humans know when to trust a model’s predictions.

Uncertainty quantification is one tool that improves a model’s reliability; the model produces a score along with the prediction that expresses a confidence level that the prediction is correct. While uncertainty quantification can be useful, existing methods typically require retraining the entire model to give it that ability. Training involves showing a model hundreds of thousands of examples so it can learn a task. Retraining then requires hundreds of thousands of new data inputs, which can be expensive and difficult to obtain, and also uses huge amounts of computing resources.

Researchers at MIT and the MIT-IBM Watson AI Lab have now developed a technique that enables a model to perform more effective uncertainty quantification, while using far fewer computing resources than other methods, and no additional data. Their technique, which doesn’t require a user to retrain or modify a model, is flexible enough for many applications.

The technique involves creating a simpler companion model that assists the original machine-learning model in estimating uncertainty. This smaller model is designed to identify different types of uncertainty, which can help researchers drill down on the root cause of inaccurate predictions.

“Uncertainty quantification is important for both developers and users of machine-learning models. Developers can utilize uncertainty measurements to help develop more robust models, while for users, it can add another layer of trust and reliability when deploying models in the real world. Our work leads to a more flexible and practical solution for uncertainty quantification,” says Maohao Shen, an electrical engineering and computer science graduate student and lead author of a paper on this technique.

Shen wrote the paper with Yuheng Bu, a former postdoc in the Research Laboratory of Electronics (RLE) who is now an assistant professor at the University of Florida; Prasanna Sattigeri, Soumya Ghosh, and Subhro Das, research staff members at the MIT-IBM Watson AI Lab; and senior author Gregory Wornell, the Sumitomo Professor in Engineering who leads the Signals, Information, and Algorithms Laboratory in RLE and is a member of the MIT-IBM Watson AI Lab. The research will be presented at the AAAI Conference on Artificial Intelligence.

Quantifying uncertainty

In uncertainty quantification, a machine-learning model generates a numerical score with each output to reflect its confidence in that prediction’s accuracy. Incorporating uncertainty quantification by building a new model from scratch or retraining an existing model typically requires a large amount of data and expensive computation, which is often impractical. What’s more, existing methods sometimes have the unintended consequence of degrading the quality of the model’s predictions.

The MIT and MIT-IBM Watson AI Lab researchers have thus zeroed in on the following problem: Given a pretrained model, how can they enable it to perform effective uncertainty quantification?

They solve this by creating a smaller and simpler model, known as a metamodel, that attaches to the larger, pretrained model and uses the features that larger model has already learned to help it make uncertainty quantification assessments.
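The general idea of attaching a small model to a frozen one can be illustrated with a toy sketch. This is not the researchers' metamodel, whose architecture the article does not describe. It is a minimal illustration under stated assumptions: a hypothetical frozen base model (`base_logit`), a single feature read from it (`meta_feature`), and a two-parameter logistic model trained to predict whether the base prediction is correct. All names and the toy data are invented for illustration.

```python
import math

def base_logit(x):
    # Stand-in for a frozen, pretrained model (hypothetical): score for class 1.
    return 4.0 * x

def base_predict(x):
    return 1 if base_logit(x) > 0 else 0

def meta_feature(x):
    # What the metamodel reads from the base model: the magnitude of its logit.
    return abs(base_logit(x))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_metamodel(xs, ys, steps=500, lr=0.1):
    # Fit P(base prediction is correct) = sigmoid(w * feature + b).
    # Only w and b change; the base model is never retrained or modified.
    feats = [meta_feature(x) for x in xs]
    targets = [1.0 if base_predict(x) == y else 0.0 for x, y in zip(xs, ys)]
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for f, t in zip(feats, targets):
            err = sigmoid(w * f + b) - t
            grad_w += err * f / n
            grad_b += err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def confidence(x, w, b):
    return sigmoid(w * meta_feature(x) + b)

# Toy data (assumed): labels are clean far from the decision boundary at 0
# and flipped for some points close to it, so the base model errs there.
idx = [i for i in range(-20, 21) if i != 0]
xs = [i / 20 for i in idx]
ys = [(1 if i > 0 else 0) ^ (1 if abs(i) <= 5 and i % 2 else 0) for i in idx]

w, b = train_metamodel(xs, ys)
print(confidence(0.9, w, b), confidence(0.05, w, b))
```

In this toy setup the metamodel learns to report higher confidence for inputs far from the base model's decision boundary than for inputs close to it, without touching the base model's parameters.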

“The metamodel can be applied to any pretrained model. It is better to have access to the internals of the model, because we can get much more information about the base model, but it will also work if you just have a final output. It can still predict a confidence score,” Sattigeri says.

They design the metamodel to produce the uncertainty quantification output using a technique that includes both types of uncertainty: data uncertainty and model uncertainty. Data uncertainty is caused by corrupted data or inaccurate labels and can only be reduced by fixing the dataset or gathering new data. In model uncertainty, the model is not sure how to explain the newly observed data and might make incorrect predictions, most likely because it hasn’t seen enough similar training examples. This issue is an especially challenging but common problem when models are deployed. In real-world settings, they often encounter data that are different from the training dataset.
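The article does not give the formula the researchers use to separate the two types. One common convention, shown here only as an assumed illustration, starts from several plausible predictive distributions for the same input (the hypothetical `member_probs`). Total uncertainty is the entropy of their average, data uncertainty is their average entropy, and model uncertainty is the difference, which measures how much the distributions disagree.

```python
import math

def entropy(p):
    # Shannon entropy in nats; zero-probability classes contribute nothing.
    return -sum(q * math.log(q) for q in p if q > 0)

def decompose(member_probs):
    # member_probs: list of class-probability lists for one input (assumed).
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[c] for m in member_probs) / n for c in range(k)]
    total = entropy(mean)                            # total uncertainty
    data = sum(entropy(m) for m in member_probs) / n  # data uncertainty
    model = total - data                             # model uncertainty
    return total, data, model

# Ambiguous input: every distribution agrees the label is a coin flip.
ambiguous = decompose([[0.5, 0.5], [0.5, 0.5]])

# Unfamiliar input: each distribution is certain, but they disagree.
unfamiliar = decompose([[1.0, 0.0], [0.0, 1.0]])

print(ambiguous, unfamiliar)
```

Both inputs have the same total uncertainty, but it is attributed entirely to the data in the first case and entirely to the model in the second, which is the distinction the paragraph above draws.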

“Has the reliability of your decisions changed when you use the model in a new setting? You want some way to have confidence in whether it is working in this new regime or whether you need to collect training data for this particular new setting,” Wornell says.

Validating the quantification

Once a model produces an uncertainty quantification score, the user still needs some assurance that the score itself is accurate. Researchers often validate accuracy by creating a smaller dataset, held out from the original training data, and then testing the model on the held-out data. However, this technique doesn’t work well in measuring uncertainty quantification because the model can achieve good prediction accuracy while still being over-confident, Shen says.
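The gap between accuracy and confidence can be made concrete with expected calibration error, a widely used measure that the article itself does not mention. The sketch below is an assumed illustration with made-up numbers: a model that is right 80 percent of the time but reports 99 percent confidence on every prediction.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    # Group predictions into equal-width confidence bins, then average the
    # gap between mean confidence and accuracy, weighted by bin size.
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        k = min(int(c * n_bins), n_bins - 1)
        bins[k].append((c, ok))
    n = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        acc = sum(ok for _, ok in items) / len(items)
        ece += len(items) / n * abs(avg_conf - acc)
    return ece

# Made-up example: 8 of 10 predictions correct, all reported at 0.99.
confidences = [0.99] * 10
correct = [1] * 8 + [0] * 2

accuracy = sum(correct) / len(correct)
ece = expected_calibration_error(confidences, correct)
print(accuracy, ece)
```

Accuracy alone looks acceptable here, while the calibration error of roughly 0.19 exposes the over-confidence.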

They created a new validation technique by adding noise to the data in the validation set. This noisy data is more like out-of-distribution data that can cause model uncertainty. The researchers use this noisy dataset to evaluate uncertainty quantifications.
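A minimal sketch of building such a noisy validation set follows. The article does not say what kind of noise the researchers add or how much, so the Gaussian noise, the `sigma` value, and the function names here are all assumptions made for illustration.

```python
import random

def add_gaussian_noise(inputs, sigma, seed=0):
    # Perturb every feature with zero-mean Gaussian noise (assumed noise model).
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in x] for x in inputs]

def build_noisy_eval_set(clean_inputs, sigma, seed=0):
    # Pair each input with a flag: 0 for clean, 1 for noise-corrupted.
    # A good uncertainty score should tend to be higher on the flagged inputs.
    noisy = add_gaussian_noise(clean_inputs, sigma, seed)
    return [(x, 0) for x in clean_inputs] + [(x, 1) for x in noisy]

# Hypothetical held-out validation inputs with two features each.
clean = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
eval_set = build_noisy_eval_set(clean, sigma=0.3)
print(len(eval_set))
```

The fixed seed keeps the noisy set reproducible, so different uncertainty methods can be compared on exactly the same corrupted inputs.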

They tested their approach by seeing how well a metamodel could capture different types of uncertainty for various downstream tasks, including out-of-distribution detection and misclassification detection. Their method not only outperformed all the baselines in each downstream task but also required less training time to achieve those results.
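The article does not say which metric was used for these detection tasks. One common choice is the area under the ROC curve (AUROC), which equals the probability that a randomly chosen out-of-distribution input receives a higher uncertainty score than a randomly chosen in-distribution input. The sketch below computes it by direct pairwise comparison; the scores are made up for illustration.

```python
def auroc(scores_pos, scores_neg):
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Made-up uncertainty scores from a hypothetical metamodel.
ood_scores = [0.9, 0.8, 0.4]          # out-of-distribution inputs
in_dist_scores = [0.1, 0.3, 0.5]      # in-distribution inputs

result = auroc(ood_scores, in_dist_scores)
print(result)
```

A value of 1.0 means the uncertainty score separates the two groups perfectly, and 0.5 means it does no better than chance. The same function applies to misclassification detection if the positive scores come from wrongly classified inputs.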

This technique could help researchers enable more machine-learning models to effectively perform uncertainty quantification, ultimately aiding users in making better decisions about when to trust predictions.

Moving forward, the researchers want to adapt their technique for newer classes of models, such as large language models that have a different structure than a traditional neural network, Shen says.

The work was funded, in part, by the MIT-IBM Watson AI Lab and the U.S. National Science Foundation.
