Artificial intelligence holds promise for helping doctors diagnose patients and personalize treatment options. However, an international group of scientists led by MIT cautions that AI systems, as currently designed, risk steering doctors in the wrong direction because they can overconfidently make incorrect decisions.
One way to prevent these mistakes is to program AI systems to be more “humble,” according to the researchers. Such systems would reveal when they are not confident in their diagnoses or recommendations and would encourage users to gather additional information when the diagnosis is uncertain.
“We’re now using AI as an oracle, but we can use AI as a coach. We could use AI as a true co-pilot. That would not only increase our ability to retrieve information but increase our agency to be able to connect the dots,” says Leo Anthony Celi, a senior research scientist at MIT’s Institute for Medical Engineering and Science, a physician at Beth Israel Deaconess Medical Center, and an associate professor at Harvard Medical School.
Celi and his colleagues have created a framework that they say can guide AI developers in designing systems that display curiosity and humility. This new approach could allow doctors and AI systems to work as partners, the researchers say, and help prevent AI from exerting too much influence over doctors’ decisions.
Celi is the senior author of the study, which appears today in . The paper’s lead author is Sebastián Andrés Cajas Ordoñez, a researcher at MIT Critical Data, a global consortium led by the Laboratory for Computational Physiology within the MIT Institute for Medical Engineering and Science.
Instilling human values
Overconfident AI systems can lead to errors in medical settings, according to the MIT team. Previous studies have found that ICU physicians defer to AI systems they perceive as reliable even when their own intuition goes against the AI’s suggestion. Physicians and patients alike are more likely to accept incorrect AI recommendations when they are perceived as authoritative.
Instead of systems that provide overconfident but potentially incorrect advice, health care facilities need access to AI systems that work more collaboratively with clinicians, the researchers say.
“We try to incorporate humans in these human-AI systems, so that we’re facilitating humans to collectively reflect and reimagine, instead of having isolated AI agents that do everything. We want humans to become more creative through the use of AI,” Cajas Ordoñez says.
To create such a system, the consortium designed a framework that includes several computational modules that can be incorporated into existing AI systems. The first of these modules requires an AI model to evaluate its own certainty when making diagnostic predictions. Developed by consortium members Janan Arslan and Kurt Benke of the University of Melbourne, the Epistemic Virtue Rating acts as a self-awareness check, ensuring the system’s confidence is appropriately tempered by the inherent uncertainty and complexity of each clinical scenario.
With that self-awareness in place, the model can tailor its response to the situation. If the system detects that its confidence exceeds what the available evidence supports, it can pause and flag the mismatch, requesting specific tests or history that might resolve the uncertainty, or recommending specialist consultation. The goal is an AI that not only provides answers but also signals when those answers should be treated with caution.
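The consortium’s actual implementation is not shown in the paper excerpted here, but the behavior described above can be illustrated with a minimal sketch. Everything in it, including the DiagnosticOutput fields, the epistemic_check function, and the 0.2 margin, is a hypothetical stand-in for the idea behind the Epistemic Virtue Rating, not the researchers’ code:

```python
# A minimal sketch of the behavior described above, not the consortium's code.
# DiagnosticOutput, epistemic_check, and the 0.2 margin are all hypothetical.
from dataclasses import dataclass


@dataclass
class DiagnosticOutput:
    diagnosis: str
    confidence: float         # the model's self-reported confidence, 0 to 1
    evidence_strength: float  # how well the available data supports it, 0 to 1


def epistemic_check(output: DiagnosticOutput, margin: float = 0.2) -> str:
    """Flag cases where stated confidence outruns the supporting evidence."""
    if output.confidence - output.evidence_strength > margin:
        # Confidence exceeds what the evidence supports: pause and escalate
        # instead of presenting the answer as settled.
        return (f"Uncertain: '{output.diagnosis}' is under-supported. "
                "Recommend additional tests, history, or a specialist consult.")
    return f"Proceed: '{output.diagnosis}' (confidence {output.confidence:.0%})."


# A confident call made on thin evidence gets flagged rather than passed along.
print(epistemic_check(DiagnosticOutput("sepsis", confidence=0.9,
                                       evidence_strength=0.4)))
```

In a deployed system, the evidence score would have to come from the model’s own calibration and the completeness of the patient record, and flagged cases would route back to the clinician rather than to an automated recommendation.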
“It’s like having a co-pilot that can tell you that you need to seek a fresh pair of eyes to be able to understand this complex patient better,” Celi says.
Celi and his colleagues have previously developed large-scale databases that can be used to train AI systems, including the Medical Information Mart for Intensive Care (MIMIC) database from Beth Israel Deaconess Medical Center. His team is now working on implementing the new framework in AI systems based on MIMIC and introducing it to clinicians in the Beth Israel Lahey Health system.
This approach could also be implemented in AI systems that are used to analyze X-ray images or to determine the best treatment options for patients in the emergency room, among other applications, the researchers say.
Toward more inclusive AI
This study is part of a larger effort by Celi and his colleagues to create AI systems that are designed by and for the people who are ultimately going to be most affected by these tools. Many AI models are trained on publicly available data from the United States, such as MIMIC, which can introduce biases toward a certain way of thinking about medical issues and exclude others.
Bringing in more viewpoints is critical to overcoming these potential biases, says Celi, emphasizing that each member of the global consortium brings a distinct perspective to a broader, collective understanding.
Another problem with existing AI systems used for diagnostics is that they are often trained on electronic health records, which were not originally intended for that purpose. This means the data lack much of the context that would be useful in making diagnoses and treatment recommendations. Moreover, many patients never get included in those datasets because of lack of access to care, such as people who live in rural areas.
At data workshops hosted by MIT Critical Data, groups of data scientists, health care professionals, social scientists, patients, and others work together on designing new AI systems. Before starting, everyone is prompted to think about whether the data they are using captures all the drivers of whatever they aim to predict, ensuring they don’t inadvertently encode existing structural inequities into their models.
“We make them query the dataset. Are they confident about their training data and validation data? Do they think that there are patients that were excluded, unintentionally or intentionally, and how will that affect the model itself?” he says. “Of course, we cannot stop or even delay the development of AI, not only in health care but in every sector. But we need to be more deliberate and thoughtful in how we do this.”
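As a concrete illustration of that kind of questioning, the hedged sketch below audits a training cohort for underrepresented subgroups before any model is built. The column name, the reference shares, and the 5 percent tolerance are hypothetical illustrations, not the workshops’ actual procedure:

```python
# A hedged sketch of the kind of dataset audit the workshop questions prompt.
# The column name, reference shares, and tolerance are all hypothetical.
import pandas as pd


def audit_representation(df: pd.DataFrame, column: str,
                         reference: dict[str, float],
                         tolerance: float = 0.05) -> list[str]:
    """Compare subgroup shares in a cohort against a reference population."""
    observed = df[column].value_counts(normalize=True)
    flags = []
    for group, expected in reference.items():
        share = float(observed.get(group, 0.0))
        if share < expected - tolerance:
            flags.append(f"{group}: {share:.1%} of the data vs. roughly "
                         f"{expected:.1%} of the population it will serve.")
    return flags


# Example: rural patients make up 5% of records, against an assumed 20% of
# the population the model is meant to serve.
cohort = pd.DataFrame({"residence": ["urban"] * 95 + ["rural"] * 5})
for warning in audit_representation(cohort, "residence",
                                    {"urban": 0.80, "rural": 0.20}):
    print("Possible exclusion:", warning)
```

A gap like the one flagged here is exactly the kind of unintentional exclusion, such as patients without access to care never appearing in the records, that the workshop questions are meant to surface before training begins.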
The research was funded by the Boston-Korea Innovative Research Project through the Korea Health Industry Development Institute.
