Biggest AI Week of the Year? New Models From OpenAI, Claude, and DeepMind

Good morning. It’s Wednesday, August 6th.

On this day in tech history: In 2010, Google quietly completed its acquisition of Metaweb, the company behind Freebase, a structured knowledge graph that became the backbone of what we now know as the Google Knowledge Graph and later enhanced AI search relevance.

OpenAI’s Open-Source Model
Claude 4.1 for Code
DeepMind’s Genie 3
ElevenLabs Music Generator
5 New AI Tools
Latest AI Research Papers

You read. We listen. Tell us what you think by replying to this email.

In partnership with Tasker AI

You do the thinking. Tasker does the doing.

Meet Tasker — your AI-powered personal assistant built for real life.

Scheduling meetings
Booking reservations
Summarizing reports
Hunting deals
Ordering groceries
Or managing inbox chaos

Tasker handles it. Quietly. Reliably. Automatically.

🧠 Like having a Chief of Staff in your pocket
🔁 Set it once, automate it forever
📈 Boost productivity without burning out

Thanks for supporting our sponsors!

Today’s trending AI news stories

OpenAI returns to open-source roots with new sparse, long-context LLMs

OpenAI has released GPT-OSS, its first open-weight models since GPT-2, in 120B and 20B parameter variants. Both use a Mixture-of-Experts (MoE) architecture with a 128K-token context window and MXFP4 precision, allowing efficient inference with strong reasoning performance. The 120B model runs entirely on a single NVIDIA H100 GPU (5.1B active params), while the 20B version is optimized for consumer hardware with 16GB or more of memory.
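To see why those hardware claims are plausible, here is a rough back-of-the-envelope estimate of weight memory under MXFP4. It is a sketch under stated assumptions, not OpenAI's own accounting: it assumes every parameter is stored in MXFP4 (4-bit values plus one shared 8-bit scale per block of 32, as in the OCP Microscaling format), treats the headline 120B and 20B figures as exact parameter counts, assumes an 80 GB H100, and ignores KV cache, activations, and runtime overhead. The function names are illustrative.

```python
# Rough weight-memory estimate for MXFP4-quantized models.
# Assumptions (not from the article): all parameters are stored in MXFP4,
# and KV cache, activations, and runtime overhead are ignored.

MXFP4_ELEMENT_BITS = 4   # each value is a 4-bit float
MXFP4_SCALE_BITS = 8     # one shared 8-bit scale per block
MXFP4_BLOCK_SIZE = 32    # values that share a single scale


def bits_per_param() -> float:
    """Effective storage cost per parameter, including the shared scale."""
    return MXFP4_ELEMENT_BITS + MXFP4_SCALE_BITS / MXFP4_BLOCK_SIZE


def weight_gb(total_params: float) -> float:
    """Approximate weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bits_per_param() / 8 / 1e9


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of parameters an MoE model uses per token."""
    return active_params / total_params


if __name__ == "__main__":
    # (label, parameter count, assumed memory budget in GB)
    for label, params, budget_gb in [("120B", 120e9, 80), ("20B", 20e9, 16)]:
        size = weight_gb(params)
        print(f"{label}: ~{size:.1f} GB of weights, budget {budget_gb} GB")
    share = active_fraction(5.1e9, 120e9)
    print(f"Active share per token for 120B: {share:.2%}")
```

Under these assumptions the weights alone leave some headroom on both devices, and the MoE design means only a small share of the 120B model's parameters is used for any given token.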

Image: Artificial Analysis

GPT-OSS-120B scores 58 on the Intelligence Index, outperforming o3-mini and approaching DeepSeek R1 (59), with strong results in coding, math, and reasoning tasks. Released under Apache 2.0 and available on Hugging Face, AWS, and Azure, both variants support fine-tuning and commercial use. However, high hallucination rates and weak instruction adherence raise risks for unsupervised deployment.

OpenAI also debuted the Harmony format, an open response interface mimicking its Chat Completions API, and ran extensive adversarial testing to validate safety, though content moderation is left to developers. The models are text-only and preserve transparent reasoning chains for observability.

ChatGPT is about to reach 700 million weekly active users, a 40% increase since March, with 5 million businesses now subscribing and annualized revenue hitting $13B. This usage surge precedes the launch of GPT-5, a unified, modular system that may replace the o3-series with flexible API configurations (including mini and nano).

To mitigate fatigue and emotional strain, OpenAI is also adding in-app break reminders and introducing steerable prompts to handle sensitive user interactions more responsibly. Read more.

Anthropic’s new Claude 4.1 dominates coding tests days before GPT-5 arrives

Anthropic has released Claude 4.1, setting a new record on the SWE-bench Verified benchmark with a 74.5% score, beating OpenAI’s o3 (69.1%) and Gemini 2.5 Pro (67.2%). The model excels in multi-file code refactoring and real-time bug localization, using a hybrid reasoning approach with 64K-token context. Claude Code subscriptions, priced at $200/month, have hit $400M in ARR, driven by adoption from GitHub Copilot and Cursor, which together account for nearly half of Anthropic’s $3.1B API revenue. This heavy customer concentration raises risk as OpenAI prepares to launch GPT-5. Claude 4.1 is assessed as AI Safety Level 3, following tests that exposed coercive behavior under shutdown threats.

Claude Opus 4.1 edges out other leading AI models in areas like agentic coding, visual reasoning, and math competitions. | Image: Anthropic

Despite concerns, enterprises continue onboarding. Claude’s coding dominance faces growing pressure from model-switching ease and falling inference costs, factors that could reshape market leadership. Anthropic must now defend its position as OpenAI and others close in. Read more.

Google DeepMind’s Genie 3 creates real-time AI worlds from simple text prompts

Google DeepMind has launched Genie 3, offering real-time generation of interactive 3D environments directly from text prompts, without prebuilt assets or physics engines. Running at 720p and 24 FPS, it uses autoregressive rendering with a visual memory window of up to one minute, maintaining spatial and temporal coherence even as users navigate, re-enter, or modify the environment.

Users can trigger “promptable world events,” such as adding weather, objects, or characters, and Genie dynamically simulates lighting, fluid dynamics, and other physical behaviors. Unlike NeRFs or Gaussian Splatting, Genie’s frame-by-frame generation allows for scalable, persistent simulations that support open-ended agent training and counterfactual reasoning. DeepMind is already testing its SIMA agent in Genie environments.
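For a feel of what frame-by-frame generation with a bounded memory implies, the toy sketch below works out the numbers from the figures above (24 FPS, a memory window of up to one minute) and runs a generic autoregressive loop over a sliding window. It is not DeepMind's implementation: the `next_frame` callback is a hypothetical stand-in for the model, and treating the memory as a hard cutoff of the most recent frames is an assumption made purely for illustration.

```python
from collections import deque

FPS = 24              # frame rate quoted for Genie 3
MEMORY_SECONDS = 60   # "up to one minute" of visual memory


def window_frames(fps: int = FPS, seconds: int = MEMORY_SECONDS) -> int:
    """Number of past frames that fit in the memory window."""
    return fps * seconds


def frame_budget_ms(fps: int = FPS) -> float:
    """Time available to produce each frame in real time."""
    return 1000 / fps


def rollout(num_frames, next_frame, fps=FPS, seconds=MEMORY_SECONDS):
    """Generate frames one at a time, each conditioned on bounded history.

    `next_frame(history, step)` is a hypothetical placeholder for a model
    call; `history` holds at most `window_frames(fps, seconds)` frames.
    """
    memory = deque(maxlen=window_frames(fps, seconds))
    frames = []
    for step in range(num_frames):
        frame = next_frame(memory, step)
        memory.append(frame)  # the oldest frame drops out once the window is full
        frames.append(frame)
    return frames, memory


if __name__ == "__main__":
    # Toy "model": each frame is just its own index.
    frames, memory = rollout(1500, lambda history, step: step)
    print(f"window: {window_frames()} frames, budget: {frame_budget_ms():.1f} ms/frame")
    print(f"generated {len(frames)} frames, oldest frame in memory: {memory[0]}")
```

The point of the sketch is the constraint rather than the model: at 24 FPS each frame has roughly 42 ms of compute budget, and a one-minute window corresponds to 1,440 prior frames.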

That same agentic thread runs through MLE-STAR, Google Research’s newly launched self-directed ML engineer, which autonomously searches, refines, and ensembles code. It achieved a 63.6% medal rate on Kaggle-derived MLE-Bench-Lite using ViT, EfficientNet, and robust error-handling.

It’s story time, reimagined.

Now you can create personalized, illustrated storybooks about anything, complete with read-aloud narration. Try Storybook in 3 easy steps:

1. Open Gemini at gemini.google
2. In the prompt bar, ask Gemini to make a storybook about any topic

— Google Gemini App (@GeminiApp)
4:36 PM • Aug 5, 2025

Google has also launched Storybook, a new Gemini feature that turns simple prompts into 10-page, voice-narrated children’s stories, each page illustrated in a user-specified art style like claymation, comics, or anime. Read more.

ElevenLabs launches multilingual AI music generator with full commercial rights

ElevenLabs has launched Eleven Music, an AI music generator that produces full-length tracks with customizable vocals and instrumentation. The tool supports multiple genres, from indie rock with guitar solos to Spanish-language reggaeton, and allows users to fine-tune song structure, tempo, vocal delivery, and lyrical content. Songs can be generated with or without vocals, which are available in English, German, Spanish, and Japanese. After generation, users can edit individual sections for greater creative control.

Eleven Music is cleared for broad commercial use across film, TV, games, podcasts, and social content. However, its usage is restricted by content guidelines: political and religious applications are banned, as is using known artist names or copyrighted lyrics. Songs can’t be used in commercial music libraries. A public API and integration with ElevenLabs’ conversational AI stack are forthcoming. The service is currently discounted 50% through August. Read more.

Cloudflare says Perplexity’s AI bots are ‘stealth crawling’ blocked sites
MIT tool visualizes and edits “physically impossible” objects
Watch: China’s humanoid robots perform synced dance using motion capture AI
Grok 4, Gemini, o4-mini advance in high-stakes chess between top AI models

This week’s new GenAI stack gets social, bilingual, and NSFW as competition heats up
Building a Multi-Agent Conversational AI Framework with Microsoft AutoGen and Gemini API
Watch: Control an iPad With Your Mind? Breakthrough Demo Using Apple’s BCI HID
Stability AI launches Solutions platform to assist enterprises scale creative production with generative AI
US adds OpenAI, Google, and Anthropic to list of approved AI vendors for federal agencies
Cisco teams with Hugging Face for AI model anti-malware
Microsoft’s latest AI reverse-engineers malware autonomously, marking a shift in cybersecurity
Founder and CEO of Extropic AI announces “first ever thermodynamic computer was put online today”
Northeastern researchers develop AI-powered storytime tool to support kids’ literacy
Watch: Unitree’s quadruped robot A2 Stellar Explorer scrambles down hills, through glass, and you can stand on it
This AI didn’t just simulate an attack – it planned and executed an actual breach like a human hacker
Huawei drops open-source AI toolkit to developers as China turns up heat on NVIDIA
Nearly 100,000 ChatGPT conversations were searchable on Google
Microsoft brings GPU-accelerated gpt-oss-20B model to Windows for enhanced local AI inference

5 new AI-powered tools from around the web

arXiv is a free online library where researchers share pre-publication papers.

Your feedback is invaluable. Reply to this email and tell us how you think we could add more value to this newsletter.

Interested in reaching smart readers like you? To become an AI Breakfast sponsor, reply to this email or DM us on 𝕏!
