A serious hallucination problem has reportedly been found in OpenAI’s speech-to-text transcription tool ‘Whisper’, which is widely used around the world.
AP reported on the 26th (local time) that Whisper, OpenAI’s speech-to-text artificial intelligence (AI) model, tends to hallucinate, making up chunks of text or even entire sentences.
According to the report, researchers at the University of Michigan said they found hallucinations, that is, fabricated content that does not exist in the audio, in 8 out of 10 audio transcriptions.
Moreover, a machine learning engineer studied more than 100 hours of Whisper transcripts and found hallucinations in more than half of them. There were also reports that hallucinations were present in many of 26,000 transcripts made with Whisper.
Although there have been many complaints about hallucinations in generative AI, it is somewhat surprising that such a problem occurs in a comparatively simple transcription task, which should faithfully follow the audio content.
Researchers are not sure why Whisper hallucinates, but they note that hallucinations tend to occur during pauses or when background noise or music is playing.
In particular, as the use of Whisper-based tools expands in the medical field, concerns are growing about hallucination errors that could lead to serious consequences.
For instance, the Whisper-based transcription tool developed by Nabla in the US is currently used by more than 30,000 clinicians and 40 health systems. “This tool has been used to transcribe roughly 7 million medical visits,” Nabla said. “We know that Whisper hallucinates, and we’re addressing this issue.”
In response, OpenAI said, “We’re continually working to improve the accuracy of our models, including reducing hallucinations,” and added, “Under our usage policy, we prohibit the use of Whisper in certain high-risk decision-making contexts.”
Whisper was first released as open source by OpenAI in September 2022. V2 followed in December of the same year, and V3 was released at Dev Day in November of last year.
Reporter Park Chan cpark@aitimes.com