A new open-weights AI coding model is closing in on proprietary options

On Tuesday, French AI startup Mistral AI released Devstral 2, a 123 billion parameter open-weights coding model designed to work as part of an autonomous software engineering agent. The model scores 72.2 percent on SWE-bench Verified, a benchmark that attempts to test whether AI systems can solve real GitHub issues, placing it among the top-performing open-weights models.

Perhaps more notably, Mistral didn’t just release an AI model; it also released a new development app called Mistral Vibe. It’s a command line interface (CLI), similar to Claude Code, OpenAI Codex, and Gemini CLI, that lets developers interact with the Devstral models directly in their terminal. The tool can scan file structures and Git status to maintain context across an entire project, make changes across multiple files, and execute shell commands autonomously. Mistral released the CLI under the Apache 2.0 license.

It’s always wise to take AI benchmarks with a large grain of salt, but we’ve heard from employees of the big AI companies that they pay very close attention to how well models do on SWE-bench Verified, which presents AI models with 500 real software engineering problems pulled from GitHub issues in popular Python repositories. The AI must read the problem description, navigate the codebase, and generate a working patch that passes unit tests. While some AI researchers have noted that around 90 percent of the tasks in the benchmark test relatively simple bug fixes that experienced engineers could complete in under an hour, it’s one of the few standardized ways to evaluate coding models.

Alongside the larger AI coding model, Mistral also released Devstral Small 2, a 24 billion parameter version that scores 68 percent on the same benchmark and can run locally on consumer hardware like a laptop, with no Internet connection required. Both models support a 256,000 token context window, allowing them to process moderately large codebases (although whether you consider that large or small is relative, depending on overall project complexity). The company released Devstral 2 under a modified MIT license and Devstral Small 2 under the more permissive Apache 2.0 license.
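
To put those percentages in concrete terms, here is a minimal Python sketch that converts the reported scores into approximate task counts. It assumes a score is simply the share of the benchmark's 500 tasks that a model resolved; the function name `resolved_tasks` is made up for illustration and is not part of any SWE-bench or Mistral tooling.

```python
def resolved_tasks(score_percent: float, total_tasks: int = 500) -> int:
    """Convert a reported pass rate into an approximate count of solved tasks.

    Assumption: the score is the plain fraction of tasks resolved,
    with no weighting or partial credit.
    """
    return round(score_percent * total_tasks / 100)


# Scores as reported in the article
scores = {"Devstral 2": 72.2, "Devstral Small 2": 68.0}

for model, pct in scores.items():
    print(f"{model}: about {resolved_tasks(pct)} of 500 tasks")
```

Under that assumption, the gap between the two models works out to roughly 21 tasks.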


