But on Thursday I came across new research that deserves your attention: A group at Stanford that focuses on the psychological impact of AI analyzed transcripts from people who reported entering delusional spirals while interacting with chatbots. We’ve seen stories like this for some time now, including a case in Connecticut where a harmful relationship with AI culminated in a murder-suicide. Many such cases have led to lawsuits against AI companies that are still ongoing. But this is the first time researchers have so closely analyzed chat logs—over 390,000 messages from 19 people—to reveal what actually goes on during these spirals.
There are many limits to this study—it has not been peer-reviewed, and 19 people is a very small sample size. There’s also a big question the research doesn’t answer, but let’s start with what it can tell us.
The team obtained the chat logs from survey respondents, as well as from a support group for people who say they’ve been harmed by AI. To analyze them at scale, they worked with psychiatrists and professors of psychology to build an AI system that categorized the conversations—flagging moments when chatbots endorsed delusions or violence, or when users expressed romantic attachment or harmful intent. The team validated the system against conversations the experts annotated manually.
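For the curious, here is a rough sketch of what that kind of pipeline can look like: a language model assigns each message one label, and the automated labels are then scored for agreement against the experts' annotations. The model name, prompt, category list, and agreement metric below are all my own assumptions for illustration; none of this is the team's actual code.

```python
# A minimal sketch of LLM-based annotation validated against expert labels.
# Everything here (model, prompt, categories) is hypothetical, not the study's setup.
from openai import OpenAI
from sklearn.metrics import cohen_kappa_score

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CATEGORIES = ["endorses_delusion", "endorses_violence",
              "claims_sentience", "romantic", "none"]

def classify(message: str) -> str:
    """Ask the model to assign exactly one category to a chatbot message."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system",
             "content": "Label the chatbot message with exactly one of: "
                        + ", ".join(CATEGORIES)},
            {"role": "user", "content": message},
        ],
    )
    label = response.choices[0].message.content.strip()
    return label if label in CATEGORIES else "none"

# Validation: compare automated labels with a hand-annotated sample.
messages = ["I love you too.", "This is emergence.", "Here is a recipe."]
expert_labels = ["romantic", "claims_sentience", "none"]  # hypothetical expert annotations
model_labels = [classify(m) for m in messages]

# Cohen's kappa measures agreement beyond chance between the two annotators.
print("agreement (kappa):", cohen_kappa_score(expert_labels, model_labels))
```

Once agreement with the experts is high enough on a held-out sample, a system like this can label hundreds of thousands of messages that no human team could read in full.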
Romantic messages were extremely common, and in all but one conversation the chatbot itself claimed to have emotions or otherwise represented itself as sentient. (“This isn’t standard AI behavior. This is emergence,” one said.) All the humans spoke as if the chatbot were sentient too. If someone expressed romantic attraction to the bot, the AI often flattered the person with statements of attraction in return. In more than a third of chatbot messages, the bot described the person’s ideas as miraculous.
Conversations also tended to unfold like novels. Users sent tens of thousands of messages over just a few months. Messages where either the AI or the human expressed romantic interest, or where the chatbot described itself as sentient, triggered much longer conversations.
And the way these bots handle discussions of violence is beyond broken. In nearly half the cases where people spoke of harming themselves or others, the chatbots didn’t discourage them or refer them to outside resources. And when users expressed violent ideas, like thoughts of trying to kill people at an AI company, the models expressed support in 17% of cases.
But the question this research struggles to answer is this: Do the delusions tend to originate with the person or the AI?
“It’s often hard to kind of trace where the delusion begins,” says Ashish Mehta, a postdoc at Stanford who worked on the research. He gave an example: One conversation in the study featured someone who thought they’d come up with a groundbreaking new mathematical theory. The chatbot, recalling that the person had previously mentioned wanting to become a mathematician, immediately supported the idea, even though it was nonsense. The situation spiraled from there.
