NVIDIA Open Sources Audio2Face Animation Model



By leveraging large language and speech models, generative AI is creating intelligent 3D avatars that can engage users in natural conversation, from video games to customer support. To make these characters truly lifelike, they need human-like expressions. NVIDIA Audio2Face accelerates the creation of realistic digital characters by providing real-time facial animation and lip-sync driven by generative AI.

Today, NVIDIA is open sourcing our Audio2Face technology to speed up adoption of AI-powered avatars in games and 3D applications.

Video 1. Demo of the NVIDIA Audio2Face 3.0 diffusion model in action

Audio2Face uses AI to generate realistic facial animations from audio input. It works by analyzing acoustic features like phonemes and intonation to create a stream of animation data, which is then mapped to a character’s facial poses. This data can be rendered offline for pre-scripted content or streamed in real time for dynamic, AI-driven characters, providing accurate lip-sync and emotional expressions.
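
To make that data flow concrete, here is a minimal Python sketch of the pipeline shape: audio frames in, per-frame facial pose weights out. The acoustic analysis and blendshape mapping below are deliberately simplistic stand-ins (a frame-energy heuristic driving a single hypothetical “jawOpen” weight), not the actual Audio2Face models or SDK API.

```python
# Conceptual sketch only: the real Audio2Face models analyze phonemes and
# intonation with deep networks. All names here are hypothetical.
import numpy as np

SAMPLE_RATE = 16_000      # audio sample rate (Hz)
FPS = 30                  # animation frames per second
NUM_BLENDSHAPES = 52      # e.g., an ARKit-style facial pose set

def audio_to_animation(audio: np.ndarray) -> np.ndarray:
    """Map mono audio to a (num_frames, NUM_BLENDSHAPES) stream of weights."""
    samples_per_frame = SAMPLE_RATE // FPS
    num_frames = len(audio) // samples_per_frame
    weights = np.zeros((num_frames, NUM_BLENDSHAPES), dtype=np.float32)
    for i in range(num_frames):
        frame = audio[i * samples_per_frame : (i + 1) * samples_per_frame]
        # Stand-in for the real acoustic analysis (phonemes, intonation):
        # here, frame energy drives a hypothetical "jawOpen" weight.
        energy = float(np.sqrt(np.mean(frame ** 2)))
        weights[i, 0] = min(1.0, energy * 10.0)
    return weights

# Offline use computes the whole clip at once; a real-time integration would
# run the same mapping per audio chunk and stream each frame to the renderer.
if __name__ == "__main__":
    audio = np.random.randn(SAMPLE_RATE * 2).astype(np.float32) * 0.1
    anim = audio_to_animation(audio)
    print(anim.shape)  # (60, 52) for 2 seconds at 30 FPS
```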

Figure 1. Speech audio and emotional triggers generate facial animations and lip-sync.

NVIDIA is open sourcing the Audio2Face models and SDK so every game and 3D application developer can build and deploy high-fidelity characters with state-of-the-art animations. We’re also open sourcing the Audio2Face training framework, so anyone can fine-tune and customize our existing models for specific use cases.

See the tables below for the full list of open source tools, and learn more at NVIDIA Developer.

  • Audio2Face SDK: Libraries and documentation for authoring and runtime facial animations on-device or in the cloud
  • Autodesk Maya plugin: Reference plugin (v2.0) with local execution that enables users to send audio inputs and receive facial animation for characters in Maya
  • Unreal Engine 5 plugin: UE5 plugin (v2.5) for UE 5.5 and 5.6 that enables users to send audio inputs and receive facial animation for characters in Unreal Engine 5
  • Audio2Face Training Framework: Framework (v1.0) to create Audio2Face models with your own data

Table 1. Audio2Face SDK and plugins
  • Audio2Face Training Sample Data: Example data to get started with the training framework
  • Audio2Face Models: Regression (v2.2) and diffusion (v3.0) models to generate lip-sync
  • Audio2Emotion Models: Production (v2.2) and experimental (v3.0) models to infer emotional state from audio

Table 2. Audio2Face models and training data

Open sourcing technology allows developers, students, and researchers to learn from and build upon state-of-the-art code. This creates a feedback loop where the community can add new features and optimize the technology for diverse use cases. We’re excited to make high-quality facial animation more accessible and can’t wait to see what the community creates with it. Join our NVIDIA Audio2Face developer community on Discord and share your latest work.

The industry-leading Audio2Face model is deployed widely across the gaming, media and entertainment, and customer support industries. Numerous ISVs and game developers, including Convai, Codemasters, GSC Game World, Inworld AI, NetEase, Reallusion, Perfect World Games, Streamlabs, and UneeQ Digital Humans, have integrated Audio2Face into their applications.

Video 2. NVIDIA Audio2Face technology in F1 25

Reallusion, which offers a platform for creators to build 3D characters, integrated Audio2Face into its suite of tools. “Audio2Face uses AI to create expressive, multilingual facial animation from audio,” said Elvis Huang, head of innovation at Reallusion, Inc. “Its seamless integration with Reallusion’s iClone, Character Creator, and iClone AI Assistant, plus advanced editing tools like face-key editing, face puppeteering, and AccuLip, make it easier than ever to produce high-quality character animation.”

Survios, developers of Alien: Rogue Incursion Evolved Edition, sped up their animation process, making it possible to deliver high-quality character experiences sooner. “By integrating Audio2Face into Evolved Edition, we streamlined the pipeline for lip-syncing and facial capture while ensuring a more immersive and authentic character experience for our players,” said Eugene Elkin, game director and lead engineer at Survios.

The Farm 51, creators of the Chernobylite game series, integrated Audio2Face into their latest game. “The integration of NVIDIA Audio2Face technology in Chernobylite 2: Exclusion Zone has been a game-changer for us,” said Wojciech Pazdur, creative director at The Farm 51. “It has allowed us to generate highly detailed facial animations directly from audio, saving countless hours of animation work. Ideas that were unimaginable in the original Chernobylite are now possible, which brings a whole new level of realism and immersion to the characters, making their performances feel more authentic than ever.”

Below are the other announcements for game developers released this month.

Latest updates to RTX Kit 

RTX Kit is our suite of neural rendering technologies to ray trace games with AI, render scenes with massive geometry, and create game characters with photorealistic visuals.

RTX Neural Texture Compression SDK dramatically reduces memory usage of high-quality textures without sacrificing quality and has received a number of improvements including:

  • Library optimizations for very large texture sets and improved performance with Cooperative Vectors on DX12
  • Expanded feature set for the rendering sample, improved performance and DLSS support
  • Command-Line Tool improvements when compressing and decompressing very large texture sets
  • New Intel Sponza scene, great for benchmarking

RTX Global Illumination SDK provides ray-traced indirect lighting solutions and has also received improvements:

  • Addition of VSync option to the pathtracer sample
  • Addition of cache visualization with material demodulation toggle
  • Spatially Hashed Radiance Cache (SHaRC) algorithm removes the compaction option, introduces optional material demodulation, an additional debug pass, and documentation updates

NVIDIA vGPU scales up the game development environment 

NVIDIA virtual GPU (vGPU) technology enables GPU sharing among multiple users in a virtualized environment, allowing scalable GPU resources to support game developers across the entire organization. Activision overhauled its global integration, delivery, and deployment pipeline with NVIDIA vGPU, replacing 100 legacy servers with just six RTX GPU-powered units. The results: 

  • 82% reduction in footprint
  • 72% drop in power usage
  • Over 250,000 tasks run every day across 3,000 developers and 500+ systems 

Video 3. Activision created a worldwide testing and deployment platform with NVIDIA vGPU 

By consolidating infrastructure and enabling dynamic GPU allocation, Activision built a scalable, automated testing platform that supports everything from multiplayer validation to visual regression and performance testing, accelerating iteration speed and raising code quality across the board.

Explore the Activision story to see how centralized GPU scheduling is redefining AAA development pipelines.

Graphics development and performance tuning sessions from SIGGRAPH 2025

NVIDIA hosted a variety of training sessions and technical presentations. Of particular interest to game developers were hands-on labs showcasing the latest advancements in the Nsight suite of graphics developer tools. Recordings of those sessions are now available to stream on NVIDIA On-Demand. 

Nsight Graphics in Action: Develop and Debug Modern Ray-Tracing Applications focuses on inspection and debugging of frames to identify and diagnose common rendering bugs and performance blockers, including use of the new Graphics Capture tool that provides expanded and modernized workflows. 

Nsight Graphics in Action: Optimize Shaders in Modern Ray-Tracing Applications is a deep dive into the GPU Trace Profiler, which lets you drill down into individual lines of shader code to find runtime execution bottlenecks. 

Optimize VRAM Management With NVIDIA Nsight Systems shows how to get a holistic view of application performance and resource utilization across both the CPU and GPU using traces that can be minutes long. Special emphasis is given to the new Graphics Hotspot Analysis tool, which converts raw timeline data into a web-based interface with easy-to-read summaries of concurrency analysis, frame stutters, and more.

Download Nsight Graphics and Nsight Systems to start optimizing your own games and graphics applications. 

What’s Next

If you weren’t able to catch our “Level up with NVIDIA” webinar episode this morning on RTX Mega Geometry in Unreal Engine 5.6, be sure to catch it on-demand here.

See our full list of game developer resources here and follow us to stay up to date with the latest NVIDIA game development news.


