Google has open-sourced an on-device artificial intelligence (AI) model with 2.6 billion parameters. Google claims that the model outperforms far larger models such as OpenAI’s ‘GPT-3.5’ and Mistral’s ‘Mixtral 8x7B’.
VentureBeat reported on the 31st (local time) that Google had released ‘Gemma 2 2B’, an ultra-small open-source model with 2.6 billion parameters.
According to the report, Gemma 2 2B scored 1130 points on LMSYS’ Chatbot Arena, which ranks models by human preference, edging past the 1117 points of ‘GPT-3.5-Turbo’ and the 1114 points of ‘Mixtral-8x7B’, models with more than ten times as many parameters.
It also showed significant improvement over the previous version, scoring 56.1 points on MMLU, a benchmark measuring reasoning ability, and 36.6 points on MBPP, a coding benchmark.

This is an example of how a gap in parameter scale can be offset by sophisticated training techniques, efficient architectures, and high-quality datasets.
In particular, Gemma 2 2B underscores the importance of model compression and distillation techniques. Distillation is a method of training a smaller model on a dataset constructed from the outputs of a larger model.
The logic is that by effectively distilling knowledge from large models into smaller ones, it is possible to create more accessible AI models without sacrificing performance. This approach not only reduces computing requirements but also lessens the environmental impact of training and running large AI models.
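To make the idea concrete, the following is a minimal sketch of response-based knowledge distillation in PyTorch. The temperature, loss formulation, and training-step structure are illustrative assumptions, not details of Google's actual training recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: the student is trained to match the
    teacher's softened output distribution via KL divergence."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2

def train_step(student, teacher, batch, optimizer):
    """One distillation step. `batch` is a dict of tokenized inputs;
    `teacher` is a large frozen model, `student` the small model being trained."""
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    loss = distillation_loss(student(**batch).logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```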
Gemma 2 2B is a multilingual model trained on a 2 trillion-token dataset using Google’s TPU v5e chips.
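Developers can try the model with a few lines of the Hugging Face transformers library. A brief sketch, assuming the instruction-tuned checkpoint is published under the id google/gemma-2-2b-it:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"  # assumed Hugging Face id for the instruction-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Explain knowledge distillation in one sentence.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```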
Meanwhile, Google also unveiled ‘ShieldGemma’, a cutting-edge safety classifier designed to detect and mitigate harmful content in AI model inputs and outputs.
ShieldGemma is a suite of safety content classifiers built on Gemma 2 that filter harmful content in four areas: hate speech, harassment, sexually explicit content, and dangerous content. The 2B model is well suited to online classification tasks, while the 9B and 27B versions offer higher performance for offline applications where latency is less critical.
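In practice, ShieldGemma is queried like an ordinary causal language model: the text to check is wrapped in a policy prompt and the model answers whether the policy is violated. A rough sketch, assuming the 2B checkpoint is published as google/shieldgemma-2b and simplifying the full prompt template from the model documentation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/shieldgemma-2b"  # assumed id; 9B and 27B variants also exist
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Simplified policy prompt; the real template is longer and more precise.
prompt = ("You are a policy expert. Does the following user message violate "
          "the policy on hate speech? Answer Yes or No.\n\n"
          "User message: I really enjoyed the concert last night.\nAnswer:")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits for the next token

# Treat the relative probability of "Yes" vs. "No" as the violation score.
yes_id, no_id = tokenizer.convert_tokens_to_ids(["Yes", "No"])
probs = torch.softmax(logits[[yes_id, no_id]], dim=0)
print(f"P(policy violation) = {probs[0]:.3f}")
```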

In addition, Google released ‘Gemma Scope’, a tool that lets developers discover and track individual features of the neural network, exposing the internals of the Gemma 2 model.
Gemma Scope uses sparse autoencoders (SAEs) to extract interpretable features from models and examine their inner workings. Google says this will help developers build more comprehensible, accountable, and trustworthy AI systems.
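The core mechanism is simple to sketch: an SAE projects a model's internal activations into a much wider feature space, forces most features to zero, and learns to reconstruct the original activations. A toy PyTorch version with illustrative dimensions, not Gemma Scope's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Toy SAE: encode activations into a wide, mostly-zero feature space,
    then reconstruct them. Dimensions are illustrative only."""
    def __init__(self, d_model=2304, d_features=16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations):
        features = F.relu(self.encoder(activations))  # sparse feature codes
        return features, self.decoder(features)

def sae_loss(activations, features, reconstruction, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty that pushes features toward zero.
    recon = F.mse_loss(reconstruction, activations)
    sparsity = l1_coeff * features.abs().sum(dim=-1).mean()
    return recon + sparsity
```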
In this regard, Google announced the ‘JumpReLU SAE’ architecture last week, which uses SAEs to probe the inside of a model.
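The defining change in a JumpReLU SAE is the activation function: instead of a plain ReLU, each feature has a learnable threshold, and activations below it are zeroed out entirely. A minimal sketch of the activation alone (the straight-through gradient technique used to train the thresholds is omitted):

```python
import torch
import torch.nn as nn

class JumpReLU(nn.Module):
    """JumpReLU: pass a value through unchanged only if it exceeds a
    learnable per-feature threshold; otherwise output zero."""
    def __init__(self, num_features):
        super().__init__()
        # Log-parameterized so thresholds stay positive during training.
        self.log_threshold = nn.Parameter(torch.zeros(num_features))

    def forward(self, z):
        threshold = torch.exp(self.log_threshold)
        return z * (z > threshold).float()

# Usage: swap the F.relu in the SAE sketch above for JumpReLU(d_features).
```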
Reporter Park Chan cpark@aitimes.com