When Synthesia launched in 2017, its primary purpose was to match AI versions of real human faces—for instance, the former footballer David Beckham—with dubbed voices speaking in several languages. A couple of years later, in 2020, it began giving the businesses that signed up for its services the chance to make professional-level presentation videos starring either AI versions of staff members or consenting actors. But the technology wasn’t perfect. The avatars’ body movements could be jerky and unnatural, their accents sometimes slipped, and the emotions conveyed by their voices didn’t always match their facial expressions.
Now Synthesia’s avatars have been updated with more natural mannerisms and movements, as well as expressive voices that better preserve the speaker’s accent—making them appear more humanlike than ever before. For Synthesia’s corporate clients, these avatars will make for slicker presenters of financial results, internal communications, or staff training videos.
I found the video demonstrating my avatar as unnerving as it is technically impressive. It’s slick enough to pass as a high-definition recording of a chirpy corporate speech, and if you didn’t know me, you’d probably think that’s exactly what it was. This demonstration shows how much harder it’s becoming to tell the artificial from the real. And before long, these avatars will even be able to talk back to us. But how much better can they get? And what might interacting with AI clones do to us?
The creation process
When my former colleague Melissa visited Synthesia’s London studio to create an avatar of herself last year, she had to undergo a lengthy process of calibrating the system, reading out a script in several emotional states, and mouthing the sounds needed to help her avatar form vowels and consonants. As I stand in the brightly lit room 15 months later, I’m relieved to hear that the creation process has been significantly streamlined. Josh Baker-Mendoza, Synthesia’s technical supervisor, encourages me to gesture and move my hands as I would during natural conversation, while simultaneously warning me not to move too much. I duly repeat an overly glowing script designed to encourage me to speak emotively and enthusiastically. The result is a bit as if Steve Jobs had been resurrected as a blond British woman with a low, monotonous voice.
It also has the unfortunate effect of making me sound like an employee of Synthesia. “I’m so thrilled to be with you today to show off what we’ve been working on. We’re on the edge of innovation, and the possibilities are endless,” I parrot eagerly, trying to sound lively rather than manic. “So get ready to be part of something that will make you go, ‘Wow!’ This opportunity isn’t just big—it’s monumental.”
Just an hour later, the team has all the footage it needs. A few weeks later I receive two avatars of myself: one powered by the older Express-1 model and the other made with the latest Express-2 technology. The latter, Synthesia claims, makes its synthetic humans more lifelike and true to the people they’re modeled on, complete with more expressive hand gestures, facial movements, and speech. You can see the results for yourself below.
COURTESY SYNTHESIA
Last year, Melissa found that her Express-1-powered avatar failed to match her transatlantic accent. Its range of emotions was also limited—when she asked her avatar to read a script angrily, it sounded more whiny than furious. In the months since, Synthesia has improved Express-1, but the version of my avatar made with the same technology blinks furiously and still struggles to synchronize its body movements with speech.
By way of contrast, I’m struck by just how much my new Express-2 avatar looks like me: its facial expressions mirror my own perfectly. Its voice is spookily accurate too, and although it gesticulates more than I do, its hand movements generally marry up with what I’m saying.
