aiOla Launches Speech Recognition Model 50% Faster than OpenAI’s ‘Whisper’

(Photo = aiOla)

Israeli artificial intelligence (AI) startup aiOla has released a speech recognition model that is 50% faster than OpenAI’s ‘Whisper’. This makes it possible to build AI systems that can understand and answer users’ questions in near real time.

VentureBeat reported on the 1st (local time) that aiOla has released ‘Whisper-Medusa’, an open-source speech recognition model that gains its speed by modifying the Whisper architecture.

In a voice assistant built on it, Whisper converts the user’s audio into text, the text is sent as a query to a large language model (LLM), and the LLM’s answer is converted from text back into audio.
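The flow described above can be sketched as three stages chained together. This is only an illustrative sketch: the function name `voice_assistant_turn` and the three stub stages are hypothetical placeholders, not part of Whisper or any aiOla API. In a real system the first stage would be a speech recognition model such as Whisper, and the other two would be an LLM and a text-to-speech engine.

```python
from typing import Callable


def voice_assistant_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],   # speech-to-text stage (where a Whisper-style model would sit)
    answer: Callable[[str], str],         # LLM stage
    synthesize: Callable[[str], bytes],   # text-to-speech stage
) -> bytes:
    """Run one question/answer turn: audio in, audio out."""
    text_query = transcribe(audio)
    text_reply = answer(text_query)
    return synthesize(text_reply)


# Hypothetical stand-ins so the sketch runs without any model.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_answer(query: str) -> str:
    return f"You asked: {query}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = voice_assistant_turn(
    b"what time is it", fake_transcribe, fake_answer, fake_synthesize
)
```

Because the stages run one after another, any time saved in the transcription stage shortens the whole turn, which is why a faster speech recognition model matters for near-real-time answers.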

It has become the standard in speech recognition thanks to its ability to process complex speech in multiple languages and accents in near real time. It is downloaded more than 5 million times a month and powers tens of thousands of apps.

Whisper-Medusa (left) and Whisper speed comparison (Photo = aiOla)

aiOla’s Whisper-Medusa modifies the Whisper architecture and adds a ‘multi-head attention’ mechanism.

Multi-head attention splits ‘self-attention’, the mechanism that relates each element of the input sequence to the other elements in the sequence, into multiple heads and runs them in parallel. This lets the model capture different kinds of dependencies between input tokens and combine information from several sources at once.

According to the explanation, this improves expressive power by handling more complex relationships between input tokens, and raises processing speed by applying attention to multiple parts of the input concurrently.
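As a rough illustration of the general technique, the sketch below implements textbook multi-head self-attention in plain Python. It is not aiOla’s code. The matrix sizes, the random weights, and every name in it (`multi_head_self_attention`, `rand_matrix`, and so on) are assumptions chosen for the example. The heads are computed in a loop here for readability, whereas real implementations batch them so they run in parallel on the hardware.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Turn a row of scores into weights that sum to 1."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    d_model = len(x[0])
    d_head = d_model // num_heads
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    head_outputs, head_weights = [], []
    for h in range(num_heads):
        # Each head only sees its own slice of the projected features.
        cols = slice(h * d_head, (h + 1) * d_head)
        q_h = [row[cols] for row in q]
        k_h = [row[cols] for row in k]
        v_h = [row[cols] for row in v]
        k_h_t = [list(col) for col in zip(*k_h)]
        # scores[i][j]: how strongly position i attends to position j.
        scores = matmul(q_h, k_h_t)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v_h))
        head_weights.append(weights)
    # Concatenate the heads per position, then mix them with the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    return matmul(concat, w_o), head_weights


def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, d_model, num_heads = 4, 8, 2
x = rand_matrix(seq_len, d_model)
w_q, w_k, w_v, w_o = (rand_matrix(d_model, d_model) for _ in range(4))
out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
```

Here `out` has one row per input position and `attn` holds one attention matrix per head, so each head can weight the same sequence differently.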

The architectural changes allow Whisper-Medusa to predict 10 tokens at a time instead of 1, resulting in 50% faster speech prediction and generation runtime without any performance degradation. aiOla plans to extend Whisper-Medusa to a 20-head version that can predict 20 tokens at a time.
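To give a feel for why predicting several tokens per pass reduces the number of decoder passes, here is a toy sketch. It assumes a Medusa-style scheme in which extra heads guess the following tokens and a guess is kept only while it agrees with what the base model would have produced. That acceptance rule, the toy sentence, the error pattern, and all function names are assumptions made for illustration, not details confirmed in the article.

```python
def decode_multi_token(base_next, extra_heads, length, k):
    """Decode `length` tokens, proposing up to k tokens per decoder pass."""
    out, passes = [], 0
    while len(out) < length:
        passes += 1
        out.append(base_next(out))          # the base head's token is always kept
        guesses = extra_heads(out, k - 1)   # speculative guesses for the next tokens
        for guess in guesses:
            # Verification is written as repeated base_next calls for clarity;
            # a real model would check all guesses in a single batched pass.
            if len(out) >= length or guess != base_next(out):
                break
            out.append(guess)
    return out, passes


# Toy "model": the base decoder always emits the next word of a fixed sentence.
target = "the quick brown fox jumps over the lazy dog today at noon".split()


def base_next(prefix):
    return target[len(prefix)]


def extra_heads(prefix, n):
    """Hypothetical extra heads: right most of the time, wrong at every 4th position."""
    start = len(prefix)
    stop = min(start + n, len(target))
    return [target[pos] if pos % 4 != 3 else "<miss>" for pos in range(start, stop)]


single_out, single_passes = decode_multi_token(base_next, extra_heads, len(target), k=1)
multi_out, multi_passes = decode_multi_token(base_next, extra_heads, len(target), k=10)
print(single_passes, multi_passes)  # 12 4
```

Both runs produce the same text, but the multi-token run needs fewer passes. The toy error pattern says nothing about Whisper-Medusa’s real acceptance rate. It only shows that the saving depends on how many guesses survive verification, not just on the number of heads.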

Whisper-Medusa is currently available on Hugging Face for research and commercial use.

Reporter Park Chan cpark@aitimes.com
