NVIDIA-Accelerated Mistral 3 Open Models Deliver Efficiency, Accuracy at Any Scale



The new Mistral 3 open model family delivers industry-leading accuracy, efficiency, and customization capabilities for developers and enterprises. Optimized from NVIDIA GB200 NVL72 to edge platforms, Mistral 3 includes: 

  • One large state-of-the-art sparse multimodal and multilingual mixture of experts (MoE) model with a total parameter count of 675B 
  • A set of small, dense high-performance models (called Ministral 3) of sizes 3B, 8B, and 14B, each with Base, Instruct, and Reasoning variants (nine models total) 

All of the models were trained on NVIDIA Hopper GPUs and are now available from Mistral AI on Hugging Face. Developers can choose from a range of options for deploying these models on different NVIDIA GPUs, with different model precision formats and open source framework compatibility (Table 1). 

|                   | Mistral Large 3    | Ministral-3-14B   | Ministral-3-8B    | Ministral-3-3B    |
|-------------------|--------------------|-------------------|-------------------|-------------------|
| Total parameters  | 675B               | 14B               | 8B                | 3B                |
| Active parameters | 41B                | 14B               | 8B                | 3B                |
| Context window    | 256K               | 256K              | 256K              | 256K              |
| Base              | –                  | BF16              | BF16              | BF16              |
| Instruct          | –                  | Q4_K_M, FP8, BF16 | Q4_K_M, FP8, BF16 | Q4_K_M, FP8, BF16 |
| Reasoning         | Q4_K_M, NVFP4, FP8 | Q4_K_M, BF16      | Q4_K_M, BF16      | Q4_K_M, BF16      |
| Frameworks        |                    |                   |                   |                   |
| vLLM              | ✔                  | ✔                 | ✔                 | ✔                 |
| SGLang            | ✔                  | –                 | –                 | –                 |
| TensorRT-LLM      | ✔                  | –                 | –                 | –                 |
| Llama.cpp         | –                  | ✔                 | ✔                 | ✔                 |
| Ollama            | –                  | ✔                 | ✔                 | ✔                 |
| NVIDIA hardware   |                    |                   |                   |                   |
| GB200 NVL72       | ✔                  | ✔                 | ✔                 | ✔                 |
| Dynamo            | ✔                  | ✔                 | ✔                 | ✔                 |
| DGX Spark         | ✔                  | ✔                 | ✔                 | ✔                 |
| RTX               | –                  | ✔                 | ✔                 | ✔                 |
| Jetson            | –                  | ✔                 | ✔                 | ✔                 |
Table 1. Mistral 3 model specifications

Mistral Large 3 delivers best-in-class performance on NVIDIA GB200 NVL72  

NVIDIA-accelerated Mistral Large 3 achieves best-in-class performance on NVIDIA GB200 NVL72 by leveraging a comprehensive stack of optimizations tailored for large state-of-the-art MoE models. Figure 1 shows the performance Pareto frontiers for GB200 NVL72 and NVIDIA H200 across the interactivity range. 

[Figure: Line chart titled "Performance per MW on Mistral Large 3 NVFP4 ISL/OSL 1K/8K." The x-axis shows TPS per user (interactivity) from 0 to about 150; the y-axis shows TPS per megawatt from 0 to 7,000,000. The GB200 curve sits well above the H200 curve across the full interactivity range, illustrating substantially higher energy efficiency.]
Figure 1. Performance per megawatt for Mistral Large 3, comparing NVIDIA GB200 NVL72 and NVIDIA H200 across different interactivity targets

For production AI systems that must deliver both strong user experience (UX) and cost-efficient scale, GB200 provides up to 10x higher performance than the previous-generation H200, exceeding 5,000,000 tokens per second per megawatt (MW) at 40 tokens per second per user.  

This generational gain translates to better UX, lower per-token cost, and better energy efficiency for the new model. The gain is primarily driven by the following components of the inference optimization stack: 

  • NVIDIA TensorRT-LLM Wide Expert Parallelism (Wide-EP) provides optimized MoE GroupGEMM kernels, expert distribution and load balancing, and expert scheduling to fully exploit the NVL72 coherent memory domain. Of particular interest is how resilient this Wide-EP feature set is to architectural variations across large MoEs. This allows a model such as Mistral Large 3 (with 128 experts per layer, roughly half as many as DeepSeek-R1) to still realize the high-bandwidth, low-latency, non-blocking advantages of the NVIDIA NVLink fabric. 
  • Low-precision inference with NVFP4 maintains both efficiency and accuracy, with support in SGLang, TensorRT-LLM, and vLLM. 
  • Mistral Large 3 relies on NVIDIA Dynamo, a low-latency distributed inference framework, to rate-match and disaggregate the prefill and decode phases of inference. This in turn boosts performance for long-context workloads, such as the 1K/8K ISL/OSL configuration shown in Figure 1. 

As with all models, upcoming performance optimizations, such as speculative decoding with multi-token prediction (MTP) and EAGLE-3, are expected to push performance further, unlocking even more gains from this new model. 

NVFP4 quantization 

For Mistral Large 3, developers can deploy a compute-optimized NVFP4 checkpoint that was quantized offline using the open source llm-compressor library. This reduces compute and memory costs while maintaining accuracy by leveraging NVFP4's higher-precision FP8 scaling factors and finer-grained block scaling to manage quantization error.  

The recipe targets only the MoE weights while keeping all other components at their original checkpoint precision. Because NVFP4 is native to NVIDIA Blackwell, this variant deploys seamlessly on GB200 NVL72, delivering lower compute and memory cost with minimal accuracy loss. 
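
As a rough illustration, the sketch below shows what an MoE-only NVFP4 recipe could look like with llm-compressor. It is a minimal example under stated assumptions: the Hugging Face model ID, the expert module-name pattern, the NVFP4 scheme preset, and the data-free flow are placeholders that may differ from the recipe used to produce the published checkpoint.

```python
# Hypothetical sketch of an offline NVFP4 quantization recipe with llm-compressor
# that targets only the MoE expert weights and leaves every other component at
# its original checkpoint precision. Names and patterns below are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "mistralai/Mistral-Large-3-Instruct"  # placeholder model ID

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Quantize only the expert projections (regex target is illustrative); skip the
# router, attention, embeddings, and lm_head so they keep their original precision.
recipe = QuantizationModifier(
    targets=["re:.*experts.*"],
    scheme="NVFP4",  # assumes an llm-compressor release with an NVFP4 preset
    ignore=["lm_head"],
)

# Depending on the scheme, a small calibration dataset may be required;
# this data-free call is a simplification.
oneshot(model=model, recipe=recipe, output_dir="Mistral-Large-3-NVFP4")
```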

Open source inference 

These open-weight models can be used with your open source inference framework of choice. TensorRT-LLM leverages optimizations for large MoE models to boost performance on GB200 NVL72 systems. To get started, you can use the preconfigured TensorRT-LLM Docker container.  

NVIDIA collaborated with vLLM to expand support for kernel integrations covering speculative decoding (EAGLE), NVIDIA Blackwell, disaggregation, and expanded parallelism. To get started, you can deploy the launchable that uses vLLM on NVIDIA cloud GPUs. For boilerplate code for serving the model and sample API calls for common use cases, see Running Mistral Large 3 675B Instruct with vLLM on NVIDIA GPUs.
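
For a quick local test before wiring up a full serving stack, a minimal offline-inference sketch with vLLM's Python API could look like the following. The model ID and parallelism settings are illustrative placeholders; check the model card for the published checkpoint name and recommended configuration.

```python
# Minimal sketch of offline inference with vLLM's Python API.
# The model ID and tensor_parallel_size are placeholders, not a validated config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Large-3-Instruct",  # placeholder Hugging Face ID
    tensor_parallel_size=8,                      # size to your GPU topology
)

sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)
prompts = ["Explain why wide expert parallelism helps large MoE models."]

for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```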

Figure 2 shows the range of GPUs available in the NVIDIA build platform where you can deploy Mistral Large 3 and Ministral 3. You can select the GPU size and configuration that suits your needs.  

[Figure: The brev.dev console "Select your Compute" page, showing a row of GPU options including H200, H100, A100, L40S, and A10.]
Figure 2. A range of GPUs are available in the NVIDIA build platform where developers can deploy Mistral Large 3 and Ministral 3

NVIDIA also collaborated with SGLang to create an implementation of Mistral Large 3 with disaggregation and speculative decoding. For details, see the SGLang documentation.

Ministral 3 models deliver speed, versatility, and accuracy   

The small, dense, high-performance Ministral 3 models are designed for edge deployment. Offering flexibility for a variety of needs, they come in three parameter sizes (3B, 8B, and 14B), each with Base, Instruct, and Reasoning variants. You can try the models on edge platforms like NVIDIA GeForce RTX AI PCs, NVIDIA DGX Spark, and NVIDIA Jetson.

When developing locally, you still get the benefit of NVIDIA acceleration. NVIDIA collaborated with Ollama and llama.cpp for faster iteration, lower latency, and greater data privacy. You can expect fast inference at up to 385 tokens per second on the NVIDIA RTX 5090 GPU with the Ministral-3B variants. Get started with llama.cpp and Ollama.  
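
If you serve a Ministral 3 model locally through Ollama, calling it from Python can look roughly like the sketch below. The model tag is a placeholder; use whatever tag the model is published under in the Ollama registry.

```python
# Rough sketch of chatting with a locally served Ministral 3 model via the
# Ollama Python client; the model tag below is a placeholder.
import ollama

response = ollama.chat(
    model="ministral-3:8b",  # placeholder tag
    messages=[
        {"role": "user", "content": "Summarize the benefits of on-device inference."},
    ],
)
print(response["message"]["content"])
```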

For Ministral-3-3B-Instruct, Jetson developers can use the vLLM container on NVIDIA Jetson Thor to achieve 52 tokens per second at single concurrency, scaling up to 273 tokens per second at a concurrency of 8. 
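
As a rough way to observe that concurrency scaling yourself, the sketch below fires a batch of concurrent requests at a vLLM OpenAI-compatible endpoint (for example, one started with vllm serve) and reports aggregate output-token throughput. The endpoint URL, model name, and prompt are placeholders.

```python
# Rough concurrency sketch against a vLLM OpenAI-compatible server;
# the base URL and model name are placeholders.
import asyncio
import time

from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

async def one_request() -> int:
    resp = await client.chat.completions.create(
        model="mistralai/Ministral-3-3B-Instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Write a haiku about edge AI."}],
        max_tokens=128,
    )
    return resp.usage.completion_tokens

async def main(concurrency: int = 8) -> None:
    start = time.perf_counter()
    tokens = await asyncio.gather(*(one_request() for _ in range(concurrency)))
    elapsed = time.perf_counter() - start
    print(f"{sum(tokens) / elapsed:.1f} output tokens/s at concurrency {concurrency}")

asyncio.run(main())
```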

Production-ready deployment with NVIDIA NIM 

Mistral Large 3 and Ministral-14B-Instruct are available for use through the NVIDIA API catalog and preview API, so developers can get started with minimal setup. Soon, enterprise developers will be able to use downloadable NVIDIA NIM microservices for easy deployment on any GPU-accelerated infrastructure.  
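
The NVIDIA API catalog exposes an OpenAI-compatible endpoint for hosted models, so calling the preview API can look roughly like the sketch below. The model identifier is a placeholder; check the model page on build.nvidia.com for the exact name and to generate an API key.

```python
# Hypothetical sketch of calling the preview API on the NVIDIA API catalog
# through its OpenAI-compatible interface; the model name is a placeholder.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],  # API key from build.nvidia.com
)

completion = client.chat.completions.create(
    model="mistralai/mistral-large-3-instruct",  # placeholder model identifier
    messages=[{"role": "user", "content": "Give three use cases for a 14B edge model."}],
    max_tokens=256,
)
print(completion.choices[0].message.content)
```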

Start building with open source AI 

The NVIDIA-accelerated Mistral 3 open model family represents a significant leap for Transatlantic AI in the open source community. The flexibility of the family, spanning a large-scale MoE and edge-friendly dense transformers, meets developers where they are in their development lifecycle.  

With NVIDIA-optimized performance, advanced quantization techniques like NVFP4, and broad framework support, developers can achieve exceptional efficiency and scalability from cloud to edge. To get started, download the Mistral 3 models from Hugging Face or test them deployment-free on build.nvidia.com/mistralai. 


