
Automated system teaches users when to collaborate with an AI assistant


Artificial intelligence models that pick out patterns in images can often do so better than human eyes — but not always. If a radiologist is using an AI model to help her determine whether a patient’s X-rays show signs of pneumonia, when should she trust the model’s advice and when should she ignore it?

A customized onboarding process could help this radiologist answer that question, according to researchers at MIT and the MIT-IBM Watson AI Lab. They designed a system that teaches a user when to collaborate with an AI assistant.

In this case, the training method might find situations where the radiologist trusts the model’s advice — except she shouldn’t, because the model is wrong. The system automatically learns rules for how she should collaborate with the AI, and describes them in natural language.

During onboarding, the radiologist practices collaborating with the AI using training exercises based on these rules, receiving feedback about her performance and the AI’s performance.

The researchers found that this onboarding procedure led to about a 5 percent improvement in accuracy when humans and AI collaborated on an image prediction task. Their results also show that just telling the user when to trust the AI, without training, led to worse performance.

Importantly, the researchers’ system is fully automated, so it learns to create the onboarding process based on data from the human and AI performing a specific task. It can also adapt to different tasks, so it could be scaled up and used in many situations where humans and AI models work together, such as in social media content moderation, writing, and programming.

“So often, people are given these AI tools to use without any training to help them figure out when they will be helpful. That’s not what we do with nearly every other tool that people use — there is almost always some kind of tutorial that comes with it. But for AI, this seems to be missing. We are trying to tackle this problem from a methodological and behavioral perspective,” says Hussein Mozannar, a graduate student in the Social and Engineering Systems doctoral program within the Institute for Data, Systems, and Society (IDSS) and lead author of a paper about this training process.

The researchers envision that such onboarding will be a crucial part of training for medical professionals.

“One could imagine, for example, that doctors making treatment decisions with the help of AI will first have to do training similar to what we propose. We may need to rethink everything from continuing medical education to the way clinical trials are designed,” says senior author David Sontag, a professor of EECS, a member of the MIT-IBM Watson AI Lab and the MIT Jameel Clinic, and the leader of the Clinical Machine Learning Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Mozannar, who is also a researcher with the Clinical Machine Learning Group, is joined on the paper by Jimin J. Lee, an undergraduate in electrical engineering and computer science; Dennis Wei, a senior research scientist at IBM Research; and Prasanna Sattigeri and Subhro Das, research staff members at the MIT-IBM Watson AI Lab. The paper will be presented at the Conference on Neural Information Processing Systems.

Training that evolves

Existing onboarding methods for human-AI collaboration are often composed of training materials produced by human experts for specific use cases, making them difficult to scale up. Some related techniques rely on explanations, where the AI tells the user its confidence in each decision, but research has shown that explanations are rarely helpful, Mozannar says.

“The AI model’s capabilities are constantly evolving, so the use cases where the human could potentially benefit from it are growing over time. At the same time, the user’s perception of the model keeps changing. So, we need a training procedure that also evolves over time,” he adds.

To accomplish this, their onboarding method is automatically learned from data. It is built from a dataset that contains many instances of a task, such as detecting the presence of a traffic light in a blurry image.

The system’s first step is to collect data on the human and AI performing this task. In this case, the human would try to predict, with the help of the AI, whether blurry images contain traffic lights.

The system embeds these data points into a latent space, which is a representation of the data in which similar data points are closer together. It uses an algorithm to discover regions of this space where the human collaborates incorrectly with the AI. These regions capture instances where the human trusted the AI’s prediction but the prediction was wrong, and vice versa.
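For readers who want a concrete picture, here is a minimal sketch of that region-discovery step, assuming precomputed latent embeddings and using off-the-shelf k-means clustering; the function name, inputs, and choice of clustering algorithm are illustrative assumptions, not the authors’ exact method.

```python
# Illustrative sketch only: group the instances where the user's reliance
# decision was wrong (trusted a wrong AI answer, or ignored a correct one)
# into candidate "regions" by clustering their latent embeddings.
import numpy as np
from sklearn.cluster import KMeans

def find_error_regions(embeddings, human_trusted_ai, ai_correct, n_regions=5):
    """embeddings: (N, d) array of latent vectors, one per task instance.
    human_trusted_ai, ai_correct: boolean arrays of length N.
    Returns the cluster centers and the instance indices in each region."""
    mis_reliance = human_trusted_ai != ai_correct  # reliance decision was wrong
    bad_idx = np.where(mis_reliance)[0]
    kmeans = KMeans(n_clusters=n_regions, n_init=10, random_state=0)
    kmeans.fit(embeddings[bad_idx])
    members = [bad_idx[kmeans.labels_ == k] for k in range(n_regions)]
    return kmeans.cluster_centers_, members
```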

Perhaps the human mistakenly trusts the AI when images show a highway at night.

After discovering the regions, a second algorithm uses a large language model to describe each region as a rule in natural language. The algorithm iteratively refines that rule by finding contrasting examples. It might describe this region as “ignore the AI when it is a highway during the night.”
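A rough sketch of that rule-writing loop might look like the following; the prompt wording and the generic `llm` callable are placeholders for illustration, not the paper’s actual prompts or model.

```python
# Hypothetical sketch of describing a region as a natural-language rule and
# refining it against contrasting examples from outside the region.
def describe_region(llm, inside_examples, outside_examples, rounds=2):
    # inside_examples / outside_examples: short text descriptions of instances.
    rule = llm(
        "These cases were handled incorrectly when collaborating with the AI:\n"
        + "\n".join(inside_examples)
        + "\nWrite one short rule for when the user should ignore the AI."
    )
    for _ in range(rounds):
        # Sharpen the rule so it matches the region but not its contrast set.
        rule = llm(
            f"Candidate rule: {rule}\n"
            "The rule should apply to these cases:\n" + "\n".join(inside_examples)
            + "\nbut NOT to these cases:\n" + "\n".join(outside_examples)
            + "\nRewrite the rule so it separates the two sets."
        )
    return rule
```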

These rules are used to construct training exercises. The onboarding system shows an example to the human, in this case a blurry highway scene at night, as well as the AI’s prediction, and asks the user if the image shows traffic lights. The user can answer yes, no, or use the AI’s prediction.

If the human is incorrect, they are shown the correct answer and performance statistics for the human and the AI on these instances of the task. The system does this for each region and, at the end of the training process, repeats the exercises the human got wrong.
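Put together, the exercise loop could be sketched roughly as below; the data fields (examples, AI predictions, labels, per-region statistics) and the `ask_user`/`show_feedback` callbacks are assumed interfaces for this illustration, not the authors’ implementation.

```python
# Illustrative onboarding loop: quiz the user on each region's examples,
# give feedback on mistakes, then replay the missed exercises at the end.
def run_onboarding(regions, ask_user, show_feedback):
    missed = []
    for region in regions:
        for example in region.examples:
            # The user answers "yes", "no", or defers with "use AI".
            answer = ask_user(example.image, example.ai_prediction)
            final = example.ai_prediction if answer == "use AI" else answer
            if final != example.label:
                # Show the correct answer plus human/AI accuracy on this region.
                show_feedback(example.label, region.human_stats, region.ai_stats)
                missed.append((region, example))
    # Repeat the exercises the user answered incorrectly.
    for region, example in missed:
        ask_user(example.image, example.ai_prediction)
```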

“After that, the human has learned something about these regions that we hope they will take away in the future to make more accurate predictions,” Mozannar says.

Onboarding boosts accuracy

The researchers tested this method with users on two tasks — detecting traffic lights in blurry images and answering multiple-choice questions from many domains (such as biology, philosophy, and computer science).

They first showed users a card with information about the AI model, how it was trained, and a breakdown of its performance on broad categories. Users were split into five groups: some were only shown the card, some went through the researchers’ onboarding procedure, some went through a baseline onboarding procedure, some went through the researchers’ onboarding procedure and were also given recommendations of when they should or should not trust the AI, and others were only given the recommendations.

Only the researchers’ onboarding procedure without recommendations improved users’ accuracy significantly, boosting their performance on the traffic light prediction task by about 5 percent without slowing them down. However, onboarding was not as effective for the question-answering task. The researchers believe this is because the AI model, ChatGPT, provided explanations with each answer that convey whether it should be trusted.

But providing recommendations without onboarding had the opposite effect — users not only performed worse, they took more time to make predictions.

“When you only give someone recommendations, it seems like they get confused and don’t know what to do. It derails their process. People also don’t like being told what to do, so that is a factor as well,” Mozannar says.

Providing recommendations alone could harm the user if those recommendations are wrong, he adds. With onboarding, on the other hand, the biggest limitation is the amount of available data. If there aren’t enough data, the onboarding stage won’t be as effective, he says.

In the future, he and his collaborators want to conduct larger studies to evaluate the short- and long-term effects of onboarding. They also want to leverage unlabeled data for the onboarding process, and find methods to effectively reduce the number of regions without omitting important examples.

“People are adopting AI systems willy-nilly, and indeed AI offers great potential, but these AI agents still sometimes make mistakes. Thus, it’s crucial for AI developers to design methods that help humans know when it’s safe to rely on the AI’s suggestions,” says Dan Weld, professor emeritus in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, who was not involved with this research. “Mozannar et al. have created an innovative method for identifying situations where the AI is trustworthy, and (importantly) to describe them to people in a way that leads to better human-AI team interactions.”

This work is funded, in part, by the MIT-IBM Watson AI Lab.
