Announcing Gemma 3n preview: powerful, efficient, mobile-first AI



Following the exciting launches of Gemma 3 and Gemma 3 QAT, our family of state-of-the-art open models capable of running on a single cloud or desktop accelerator, we're pushing our vision for accessible AI even further. Gemma 3 delivered powerful capabilities for developers, and we're now extending that vision to highly capable, real-time AI operating directly on the devices you use every day – your phones, tablets, and laptops.

To power the next generation of on-device AI and support a diverse range of applications, including advancing the capabilities of Gemini Nano, we engineered a new, cutting-edge architecture. This next-generation foundation was created in close collaboration with mobile hardware leaders like Qualcomm Technologies, MediaTek, and Samsung's System LSI business, and is optimized for lightning-fast, multimodal AI, enabling truly personal and private experiences directly on your device.

Gemma 3n is our first open model built on this groundbreaking, shared architecture, allowing developers to start experimenting with this technology today in an early preview. The same advanced architecture also powers the next generation of Gemini Nano, which brings these capabilities to a broad range of features in Google apps and our on-device ecosystem, and will become available later this year. Gemma 3n lets you start building on a foundation that will come to major platforms such as Android and Chrome.

Chatbot Arena Elo scores

This chart ranks AI models by Chatbot Arena Elo scores; higher scores (top numbers) indicate greater user preference. Gemma 3n ranks highly among both popular proprietary and open models.

Gemma 3n leverages a Google DeepMind innovation called Per-Layer Embeddings (PLE) that delivers a significant reduction in RAM usage. While the raw parameter counts are 5B and 8B, this innovation allows you to run larger models on mobile devices or live-stream from the cloud with a memory overhead comparable to a 2B and 4B model, meaning the models can operate with a dynamic memory footprint of just 2GB and 3GB. Learn more in our documentation.
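To see why those footprint numbers are striking, a bit of back-of-the-envelope arithmetic helps. This sketch only uses the parameter counts and footprints quoted above; the "with PLE" figures are the post's stated dynamic footprints, not values derived from the PLE technique itself.

```python
# Illustrative arithmetic only: compare a naive "all weights resident in
# fp16" estimate against the dynamic footprints quoted in the post.

BYTES_PER_PARAM_FP16 = 2

def naive_fp16_gb(params_billions: float) -> float:
    """Memory (GB) needed to keep every weight resident at fp16 precision."""
    return params_billions * 1e9 * BYTES_PER_PARAM_FP16 / 1e9

# Raw parameter counts vs. the dynamic footprints stated for Gemma 3n.
for raw_b, stated_gb in [(5, 2), (8, 3)]:
    naive = naive_fp16_gb(raw_b)
    print(f"{raw_b}B params: ~{naive:.0f} GB naive fp16 vs ~{stated_gb} GB with PLE")
```

The gap (roughly 10 GB vs. 2 GB for the 5B model) is what makes phone-class RAM budgets feasible.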

By exploring Gemma 3n, developers can get an early preview of the open model's core capabilities and mobile-first architectural innovations that will be available on Android and Chrome with Gemini Nano.

In this post, we'll explore Gemma 3n's new capabilities, our approach to responsible development, and how you can access the preview today.


Key Capabilities of Gemma 3n

Engineered for fast, low-footprint AI experiences running locally, Gemma 3n delivers:

  • Optimized On-Device Performance & Efficiency: Gemma 3n begins responding roughly 1.5x faster on mobile with significantly higher quality (compared to Gemma 3 4B) and a reduced memory footprint, achieved through innovations like Per-Layer Embeddings, KV cache sharing, and advanced activation quantization.
  • Many-in-1 Flexibility: A model with a 4B active memory footprint that natively includes a nested state-of-the-art submodel with a 2B active memory footprint (thanks to MatFormer training). This provides flexibility to dynamically trade off performance and quality on the fly without hosting separate models. We further introduce a mix'n'match capability in Gemma 3n to dynamically create submodels from the 4B model that can optimally fit your specific use case and its associated quality/latency tradeoff. Stay tuned for more on this research in our upcoming technical report.
  • Privacy-First & Offline Ready: Local execution enables features that respect user privacy and performance reliably, even without a web connection.
  • Expanded Multimodal Understanding with Audio: Gemma 3n can understand and process audio, text, and images, and offers significantly enhanced video understanding. Its audio capabilities enable the model to perform high-quality Automatic Speech Recognition (transcription) and Translation (speech to translated text). Moreover, the model accepts interleaved inputs across modalities, enabling understanding of complex multimodal interactions. (Public implementation coming soon)
  • Improved Multilingual Capabilities: Improved multilingual performance, particularly in Japanese, German, Korean, Spanish, and French, reflected in strong results on multilingual benchmarks such as 50.1% on WMT24++ (ChrF).
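The nested-submodel idea behind the many-in-1 flexibility can be illustrated with a small toy. This is a conceptual sketch of MatFormer-style nesting, not the actual Gemma 3n implementation: the smaller submodel's feed-forward weights are a prefix slice of the full model's, so a single stored weight matrix can serve several quality/latency operating points. All sizes and names here are hypothetical.

```python
import numpy as np

# Conceptual toy of MatFormer-style nesting: the nested submodel's
# feed-forward up-projection uses only a prefix of the full weight matrix.

rng = np.random.default_rng(0)
d_model, d_ff_full, d_ff_small = 8, 32, 16  # hypothetical dimensions

W_full = rng.standard_normal((d_model, d_ff_full))

def ffn_up(x: np.ndarray, d_ff: int) -> np.ndarray:
    """Up-project x using only the first d_ff columns of the shared matrix."""
    return x @ W_full[:, :d_ff]

x = rng.standard_normal(d_model)
full_out = ffn_up(x, d_ff_full)    # "4B-like" path: all columns
small_out = ffn_up(x, d_ff_small)  # nested "2B-like" path: prefix slice

# The nested path is exactly a prefix of the full computation, so no
# second copy of the weights is ever stored or hosted.
assert np.allclose(small_out, full_out[:d_ff_small])
print(small_out.shape, full_out.shape)  # (16,) (32,)
```

A mix'n'match submodel would, in this picture, pick intermediate slice widths per layer to hit a target latency budget.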

MMLU performance

This chart shows MMLU performance vs. model size for Gemma 3n's mix-n-match (pretrained) capability.

Unlocking New On-the-go Experiences

Gemma 3n will empower a new wave of intelligent, on-the-go applications by enabling developers to:

  1. Build live, interactive experiences that understand and respond to real-time visual and auditory cues from the user's environment.
  2. Power deeper understanding and contextual text generation using combined audio, image, video, and text inputs—all processed privately on-device.
  3. Develop advanced audio-centric applications, including real-time speech transcription, translation, and rich voice-driven interactions.

Here's an overview of the kinds of experiences you can build.

Building Responsibly, Together

Our commitment to responsible AI development is paramount. Gemma 3n, like all Gemma models, underwent rigorous safety evaluations, data governance, and fine-tuning alignment with our safety policies. We approach open models with careful risk assessment, continually refining our practices as the AI landscape evolves.


Get Started: Preview Gemma 3n Today

We’re excited to get Gemma 3n into your hands through a preview starting today:


Initial Access (Available Now):

  • Cloud-based Exploration with Google AI Studio: Try Gemma 3n directly in your browser on Google AI Studio – no setup needed. Explore its text input capabilities immediately.
  • On-Device Development with Google AI Edge: For developers looking to integrate Gemma 3n locally, Google AI Edge provides tools and libraries. You can start with text and image understanding/generation capabilities today.
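The on-device call pattern tends to look roughly like the following sketch. Every class, method, and file name here is a placeholder invented for illustration; it is not the actual Google AI Edge API, whose documentation should be consulted for real integration.

```python
# Hypothetical sketch of an on-device inference wrapper. The names
# LocalLlmConfig, LocalLlmSession, and "gemma-3n.task" are placeholders,
# not real Google AI Edge identifiers.
from dataclasses import dataclass

@dataclass
class LocalLlmConfig:
    model_path: str          # path to a locally downloaded model bundle
    max_tokens: int = 256
    temperature: float = 0.7

class LocalLlmSession:
    """Placeholder session: a real SDK would load weights and run locally."""
    def __init__(self, config: LocalLlmConfig):
        self.config = config

    def generate(self, prompt: str) -> str:
        # A real session would tokenize, run the model on-device (no network
        # round trip), and detokenize. Here we only echo the call pattern.
        return f"[{self.config.model_path}] response to: {prompt}"

session = LocalLlmSession(LocalLlmConfig(model_path="gemma-3n.task"))
print(session.generate("Summarize my last voice note"))
```

The key property the pattern illustrates is that the prompt never leaves the device: configuration points at a local model file, and generation is a local method call.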

Gemma 3n marks the next step in democratizing access to cutting-edge, efficient AI. We're incredibly excited to see what you'll build as we make this technology progressively available, starting with today's preview.

Explore this announcement and all Google I/O 2025 updates on io.google starting May 22.


