Artificial intelligence systems like ChatGPT provide plausible-sounding answers to almost any question you might ask. But they don't always reveal the gaps in their knowledge or the areas where they're uncertain. That problem can have serious consequences as AI systems are increasingly used to do things like develop drugs, synthesize information, and drive autonomous cars.
Now, the MIT spinout Themis AI is helping to quantify model uncertainty and correct outputs before they cause bigger problems. The company's Capsa platform can work with any machine-learning model to detect and correct unreliable outputs in seconds. It works by modifying AI models so they can detect patterns in their data processing that indicate ambiguity, incompleteness, or bias.
"The idea is to take a model, wrap it in Capsa, identify the uncertainties and failure modes of the model, and then enhance the model," says Themis AI co-founder and MIT Professor Daniela Rus, who is also the director of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). "We're excited about offering a solution that can improve models and offer guarantees that the model is working correctly."
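To make the wrapping idea concrete, here is a minimal sketch of what "wrap a model so every prediction carries an uncertainty estimate" can look like. This is not Capsa's actual API; the class, names, and ensemble-disagreement method below are illustrative assumptions, though disagreement across an ensemble is a standard way to surface uncertainty.

```python
import numpy as np

class UncertaintyWrapper:
    """Illustrative wrapper (not Capsa's API): bundles several models and
    uses the spread of their predictions as an uncertainty score."""

    def __init__(self, models):
        self.models = models  # callables mapping inputs -> predictions

    def predict(self, x):
        preds = np.stack([m(x) for m in self.models])
        mean = preds.mean(axis=0)        # the wrapped prediction
        uncertainty = preds.std(axis=0)  # disagreement = uncertainty
        return mean, uncertainty

# Toy ensemble: three "models" that agree near zero and diverge for
# large inputs, mimicking a model leaving its training distribution.
models = [lambda x, k=k: x + k * x**2 for k in (-0.1, 0.0, 0.1)]
wrapped = UncertaintyWrapper(models)

mean_small, unc_small = wrapped.predict(np.array([0.1]))
mean_large, unc_large = wrapped.predict(np.array([10.0]))
# Uncertainty is small near 0.1 and large near 10.0, where the
# ensemble members disagree.
```

A caller can then act on the uncertainty score, for example by flagging or discarding outputs whose uncertainty exceeds a threshold.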
Rus founded Themis AI in 2021 with Alexander Amini ’17, SM ’18, PhD ’22 and Elaheh Ahmadi ’20, MEng ’21, two former research affiliates in her lab. Since then, they've helped telecom companies with network planning and automation, helped oil and gas companies use AI to understand seismic imagery, and published papers on developing more reliable and trustworthy chatbots.
"We want to enable AI in the highest-stakes applications of every industry," Amini says. "We've all seen examples of AI hallucinating or making mistakes. As AI is deployed more broadly, those mistakes could lead to devastating consequences. Themis makes it possible for any AI to forecast and predict its own failures before they happen."
Helping models know what they don’t know
Rus' lab has been researching model uncertainty for years. In 2018, she received funding from Toyota to study the reliability of a machine learning-based autonomous driving solution.
"That is a safety-critical context where understanding model reliability is very important," Rus says.
In separate work, Rus, Amini, and their collaborators built an algorithm that could detect racial and gender bias in facial recognition systems and automatically reweight the model's training data, showing it eliminated bias. The algorithm worked by identifying the underrepresented parts of the underlying training data and generating new, similar data samples to rebalance it.
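The reweighting step of that kind of debiasing approach can be sketched in a few lines. This is a simplified illustration, not the paper's actual algorithm (which also generated new samples): here, each training sample is simply weighted by the inverse frequency of its feature bin, so rare groups count more during training. The function name and binning scheme are assumptions for the example.

```python
import numpy as np

def rebalancing_weights(features, bins=10):
    """Weight each sample inversely to how common its feature value is,
    so underrepresented regions of the data get more influence."""
    counts, edges = np.histogram(features, bins=bins)
    # Map each sample to its histogram bin (clip keeps indices in range).
    idx = np.clip(np.digitize(features, edges[1:-1]), 0, bins - 1)
    weights = 1.0 / counts[idx]
    return weights / weights.sum()  # normalize to a distribution

# Toy data: 90 samples of a common group (feature ~0) and 10 samples
# of a rare group (feature ~5).
feats = np.concatenate([np.zeros(90), np.full(10, 5.0)])
w = rebalancing_weights(feats)
# Each rare sample receives a larger weight than each common one.
```

Sampling or weighting the loss by `w` during training then counteracts the imbalance without discarding any data.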
In 2021, the eventual co-founders showed a similar approach could be used to help pharmaceutical companies use AI models to predict the properties of drug candidates. They founded Themis AI later that year.
"Guiding drug discovery could potentially save a lot of money," Rus says. "That was the use case that made us realize how powerful this tool could be."
Today Themis AI is working with enterprises in a range of industries, and many of those companies are building large language models. By using Capsa, these models are able to quantify their own uncertainty for each output.
"Many companies are interested in using LLMs that are based on their data, but they're concerned about reliability," observes Stewart Jamieson SM ’20, PhD ’24, Themis AI's head of technology. "We help LLMs self-report their confidence and uncertainty, which enables more reliable question answering and flagging of unreliable outputs."
Themis AI is also in discussions with semiconductor companies building AI solutions on their chips that can work outside of cloud environments.
"Normally these smaller models that run on phones or embedded systems aren't very accurate compared with what you could run on a server, but we can get the best of both worlds: low latency, efficient edge computing without sacrificing quality," Jamieson explains. "We see a future where edge devices do most of the work, but whenever they're unsure of their output, they can forward those tasks to a central server."
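The edge-to-server fallback Jamieson describes amounts to a simple routing rule once each prediction carries an uncertainty score. The sketch below is a hypothetical illustration of that pattern; the function names and the 0.2 threshold are assumptions, not part of any Themis AI product.

```python
def route(x, edge_model, server_model, max_uncertainty=0.2):
    """Answer on the edge when the edge model is confident enough;
    otherwise forward the task to the server model.
    Returns (prediction, source)."""
    pred, uncertainty = edge_model(x)
    if uncertainty <= max_uncertainty:
        return pred, "edge"
    return server_model(x), "server"

# Toy models: the "edge" model rounds to the nearest integer and reports
# the rounding distance as its uncertainty; the "server" just rounds.
edge = lambda x: (round(x), abs(x - round(x)))
server = lambda x: round(x)

pred1, src1 = route(3.05, edge, server)  # low uncertainty: stays on the edge
pred2, src2 = route(3.45, edge, server)  # high uncertainty: goes to the server
```

The design keeps latency low for the common confident case while reserving server round-trips for the inputs the edge model cannot handle reliably.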
Pharmaceutical companies can also use Capsa to improve AI models being used to identify drug candidates and predict their performance in clinical trials.
"The predictions and outputs of these models are very complex and hard to interpret. Experts spend a lot of time and effort trying to make sense of them," Amini remarks. "Capsa can give insights right out of the gate to understand whether the predictions are backed by evidence in the training set or are just speculation without much grounding. That can accelerate the identification of the strongest predictions, and we think it has enormous potential for societal good."
Research for impact
Themis AI's team believes the company is well positioned to improve the cutting edge of constantly evolving AI technology. For instance, the company is exploring Capsa's ability to improve accuracy in an AI technique known as chain-of-thought reasoning, in which LLMs explain the steps they take to reach an answer.
"We've seen signs Capsa could help guide those reasoning processes to identify the highest-confidence chains of reasoning," Jamieson says. "We think that has huge implications in terms of improving the LLM experience, reducing latencies, and reducing computation requirements. It's an extremely high-impact opportunity for us."
For Rus, who has co-founded several companies since coming to MIT, Themis AI is an opportunity to ensure her MIT research has impact.
"My students and I have become increasingly passionate about going the extra step to make our work relevant to the world," Rus says. "AI has tremendous potential to transform industries, but AI also raises concerns. What excites me is the opportunity to help develop technical solutions that address these challenges and also build trust and understanding between people and the technologies that are becoming part of their daily lives."