
Meta’s latest AI models can recognize and produce speech for more than 1,000 languages

The researchers trained the models on two new data sets: one that contains audio recordings of the New Testament and its corresponding text, taken from the internet, in 1,107 languages, and another containing unlabeled New Testament audio recordings in 3,809 languages. The team processed the speech audio and the text data to improve its quality before running an algorithm designed to align audio recordings with accompanying text. They then repeated this process with a second algorithm trained on the newly aligned data. With this method, the researchers were able to teach the algorithm to learn a new language more easily, even without the accompanying text.

“We will use what that model learned to then quickly build speech systems with very little or no data,” says Michael Auli, a research scientist at Meta who worked on the project.

“For English, we have lots and lots of good data sets, and we have that for a few more languages, but we just don’t have that for languages that are spoken by, say, 1,000 people.”

The researchers say their models can converse in over 1,000 languages but recognize more than 4,000.

They compared the models with those from rival companies, including OpenAI’s Whisper, and claim theirs had half the error rate, despite covering 11 times more languages.

However, the team warns that the model is still at risk of mistranscribing certain words or phrases, which could result in inaccurate or potentially offensive labels. They also acknowledge that their speech recognition models yielded more biased words than other models, albeit only 0.7% more.

While the scope of the research is impressive, the use of religious texts to train AI models can be controversial, says Chris Emezue, a researcher at Masakhane, an organization working on natural-language processing for African languages, who was not involved in the project.

“The Bible has a lot of bias and misrepresentations,” he says.
