Google’s upgrade breaks reasoning barriers


Good morning, AI enthusiasts. OpenAI and Anthropic have been grabbing all the 2026 headlines, but Google just reminded everyone why it's still the biggest powerhouse in the AI race.

With an upgraded Deep Think obliterating benchmarks across math, coding, and science, and a new research agent autonomously solving open problems, the tech giant is pushing frontier AI for scientific research into uncharted territory.

In today’s AI rundown:

  • Google’s Deep Think crushes reasoning benchmarks

  • OpenAI launches ultra-fast coding model on Cerebras chips

  • How to generate a TV commercial with AI

  • MiniMax’s open-source M2.5 hits frontier coding levels

  • 4 new AI tools, community workflows, and more

LATEST DEVELOPMENTS

GOOGLE

Image source: Google

The Rundown: Google just released a major update to its Gemini 3 Deep Think reasoning mode, posting dominant scores across math, coding, and science, while also introducing an Olympiad-level math research agent powered by the new upgrade.

The details:

  • Deep Think hit 84.6% on ARC-AGI-2, obliterating Opus 4.6 (68.8%) and GPT-5.2 (52.9%), and set a new high of 48.4% on Humanity’s Last Exam.

  • It also reached gold-medal marks on the 2025 Physics & Chemistry Olympiads and scored a 3,455 Elo on Codeforces, nearly 1,000 points above Opus 4.6.

  • Google also unveiled Aletheia, a math agent that autonomously solves open problems, verifies proofs, and hits new highs across domain benchmarks.

  • The Deep Think upgrade is live for Google AI Ultra subscribers in the Gemini app, with API access open to researchers via an early access program.

Why it matters: After Google dominated benchmarks and headlines to close out 2025, the focus has shifted more toward Anthropic and OpenAI in 2026, but don't count out the tech giant as arguably the biggest powerhouse in the AI race. Deep Think's scores are wild, and the frontier for math and science is quickly moving into uncharted territory.

TOGETHER WITH VOXEL51

The Rundown: Most teams are labeling massive amounts of data that never gets used for model training. Voxel51's technical workshop on Feb. 18 shows how to build feedback-driven annotation pipelines that eliminate over-labeling, saving time and money while improving model performance.

Join the workshop and learn:

  • How to use zero-shot selection and embeddings for maximum cost savings

  • QA workflows to review specific objects and fix errors fast

  • How to implement dedicated test sets to catch label drift early

  • Debugging with embeddings to visualize the clusters confusing your model

OPENAI

Image source: OpenAI

The Rundown: OpenAI released GPT-5.3-Codex-Spark, a new speed-optimized coding model that runs on Cerebras hardware, cranking out 1,000+ tokens per second and marking the company's first AI product powered by chips outside its Nvidia stack.

The details:

  • Spark trades intelligence for speed, trailing the full 5.3-Codex on SWE-Bench Pro and Terminal-Bench but finishing tasks in a fraction of the time.

  • The release comes just weeks after OpenAI inked a $10B+ deal with Cerebras and separate agreements with AMD and Broadcom, diversifying away from Nvidia.

  • OpenAI's vision is for Spark to handle quick interactive edits while the full Codex tackles longer autonomous tasks in the background.

  • The model is rolling out as a research preview for ChatGPT Pro subscribers, with API access initially limited to a handful of enterprise design partners.

Why it matters: Codex's main criticism has been its speed, and OpenAI just addressed it in a big way, while making its chip diversification play real with the first product built on Cerebras hardware. Real-time coding with fast feedback will certainly change workflows for development tasks that can trade a bit of power for speed.

AI TRAINING

The Rundown: In this guide, you'll learn how to generate a 20-second ad in the style of a professional TV commercial, taking the guesswork out of outputs without having to click and pray.

Step-by-step:

  1. Think of a commercial idea and ask Gemini to plan out two 5-second scenes. Once done, ask it to write prompts for the start and end frames of both scenes.

  2. Now, log in to Higgsfield (you'll need a basic/pro plan) and click Image > Create Image > Nano Banana Pro. Set 4K quality, 4 variations, and a 21:9 ratio.

  3. Generate the start and end frames for scene 1 and just the end frame for scene 2. Download the ones you like best.

  4. In Higgsfield, go to Video > Kling 3.0, upload your frames with the short scene prompt, and hit generate. After this, stitch the videos in a free editor.

Pro tip: Ask Gemini to use photography terms like "Hero shot" when generating scene prompts. You can also generate music for the ad with Suno + ElevenLabs.

PRESENTED BY CDATA

The Rundown: Microsoft and CData are teaming up for a live 45-minute session on how to design secure, scalable agentic infrastructure using Copilot Studio, Agent 365, and CData's Connect AI — including a live cross-system workflow demo.

In this session, you'll learn:

  • How Microsoft and CData deliver connectivity, context, and control for production AI agents

  • Agent design and production best practices from both teams

  • How a Copilot Studio agent syncing with Salesforce and Dynamics 365 is built and deployed

Register here for the session. All registrants will receive the session recording.

MINIMAX

Image source: MiniMax

The Rundown: Chinese AI lab MiniMax launched M2.5, an open-source model that rivals Opus 4.6 and GPT-5 on agentic coding benchmarks — but at a fraction of the price, making it cheap enough to power AI agents running around the clock.

The details:

  • M2.5 shows especially strong coding performance, scoring roughly even with Opus 4.6 and GPT-5.2 across key development benchmarks.

  • Two APIs are available: a faster M2.5-Lightning ($2.40/M output) and a standard M2.5 ($1.20/M output), both priced far below Opus ($25/M).

  • MiniMax revealed that M2.5 now handles 30% of daily company tasks across R&D, product, sales, HR, and finance, as well as 80% of new code commits.

  • The models are available via API, though the open-source weights and license have yet to be published.
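To see what those per-million prices mean for an always-on agent, here is a rough back-of-the-envelope sketch. Only the prices come from the item above; the steady 50 tokens/sec output rate is an assumption chosen purely for illustration.

```python
# Back-of-the-envelope monthly cost of an agent emitting output tokens nonstop.
# Only the per-million-token prices come from the article; the 50 tok/s rate
# is an illustrative assumption, not a measured figure.
def monthly_output_cost(price_per_million_usd, tokens_per_second, days=30):
    """Dollar cost of `days` of continuous output at a steady token rate."""
    total_tokens = tokens_per_second * 60 * 60 * 24 * days
    return price_per_million_usd * total_tokens / 1_000_000

for name, price in [("M2.5", 1.20), ("M2.5-Lightning", 2.40), ("Opus", 25.00)]:
    print(f"{name}: ${monthly_output_cost(price, 50):,.2f}/month")
# → M2.5: $155.52/month
# → M2.5-Lightning: $311.04/month
# → Opus: $3,240.00/month
```

At list price, the standard M2.5 comes out to roughly 1/20th of the Opus output cost for the same token volume, which is the gap that makes round-the-clock agents start to pencil out.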

Why it matters: Every few months, it seems like a Chinese lab drops a model that changes the cost math for the entire industry. M2.5's frontier-level coding at this price makes "intelligence too cheap to meter" feel closer than ever, an important development as agents handling longer autonomous tasks become more common.

QUICK HITS

  • 🔒 Incogni – remove your personal data from the web so scammers and identity thieves can’t access it. Use code RUNDOWN to get 55% off.*

  • 🧠 Gemini 3 Deep Think – Google’s upgraded AI reasoning mode

  • ⚡️ GPT-5.3-Codex-Spark – OpenAI’s ultra-fast model for real-time coding

  • 🤖 M2.5 – MiniMax’s new open-source frontier model with powerful coding

ByteDance officially launched Seedance 2.0, the company's viral SOTA video model, publishing benchmark results and a technical blog, but access still remains restricted.

Mustafa Suleyman told FT that most white-collar work will be "fully automated by AI within 12 to 18 months," with Microsoft pursuing "true self-sufficiency" with its models.

Elon Musk said that xAI’s wave of exits was forced, not voluntary — calling it a reorg for “speed of execution” after losing ten co-founders and engineers this week.

OpenAI is retiring GPT-4o, GPT-4.1, and o4-mini from ChatGPT today, coming amid pushback from users calling for 4o’s preservation.

Anthropic officially announced a new $30B funding round at a $380B valuation, with its revenue run rate hitting $14B — $2.5B of which comes from Claude Code alone.

OpenAI researcher Zoë Hitzig resigned after the launch of ChatGPT ads, warning that OpenAI's archive of human thought creates "unprecedented potential for manipulation."

COMMUNITY

Every newsletter, we showcase how a reader is using AI to work smarter, save time, or make life easier.

Today’s workflow comes from reader Anthony H. in Australia:

“I needed a QR code scanner, to be used on an iPad, to check in our members at regular meetings. I couldn’t find a solution that wasn’t expensive or bloated with extra features we didn’t need. So, I created my own with Google AI Studio, GitHub, and Vercel.

It features event session creation, member profiles, auto-created custom QR codes for each member, and a system backup, as the data is held locally for privacy. I added bulk import and export functions. Reports we need for our funding requirements can be created as well.”

How do you use AI? Let us know here.

That’s it for today!

Before you go, we’d love to know what you thought of today’s newsletter to help us improve The Rundown experience for you.
  • ⭐️⭐️⭐️⭐️⭐️ Nailed it
  • ⭐️⭐️⭐️ Average
  • ⭐️ Fail

Login or Subscribe to participate

See you soon,

