OpenAI claims gold on math olympiad

Good morning, AI enthusiasts. OpenAI just claimed one in all the longstanding grand challenges in AI: gold-level performance with an experimental LLM on the International Math Olympiad (IMO) 2025.

While questions remain over OpenAI’s grading, progress on the IMO does indicate one other step toward mathematical superintelligence — the type which may at some point solve problems humans haven’t yet cracked.

In today’s AI rundown:

OpenAI’s gold-level math performance
ARC’s latest interactive AGI test
Construct your individual AI content writing assistant
AI models fall for human psychological tricks
4 latest AI tools & 4 job opportunities

LATEST DEVELOPMENTS

OPENAI

🥇 OpenAI’s gold-level math performance

Image source: OpenAI

The Rundown: OpenAI just claimed gold-level performance in an evaluation modeled after the 2025 International Math Olympiad, testing its “experimental general reasoning LLM” on the identical problem statements utilized in the human competition.

The main points:

The LLM was tested under the identical rules as humans, writing natural language proofs to problems across two 4.5-hour exams, without tools/web.
OpenAI claims the unnamed model successfully solved 5 out of 6 problems, scoring 35/42 — enough to bag a gold medal on the official Olympiad.
Each answer was independently graded by three former IMO medalists, with final scores determined through unanimous consensus.
Google DeepMind, on its part, has rebuked the gold claim, saying IMO has an internal marking guideline and “no claim” might be made without it.

Why it matters: Criticisms around validity are inevitable, on condition that achieving gold within the IMO has been a longstanding goal for AI and was once regarded as near unimaginable. Interestingly, that the goal was achieved by an experimental model not available publicly yet, meaning OpenAI definitely has more up their sleeves.

TOGETHER WITH AUGMENT CODE

⚙️ Ditch the vibes, get the context

The Rundown: Augment Code’s powerful AI coding agent meets skilled software developers exactly where they’re, delivering production-grade features and deep context into even the gnarliest of codebases.

With Augment Code, you may:

Keep using VS Code, JetBrains, Android Studio, and even Vim
Index and navigate hundreds of thousands of lines of code
Get fast answers about any a part of your codebase
Construct with the AI agent that gets you, your team, and your code

Ditch the vibes and get the context that you must engineer what’s next.

ARC PRIZE

⚙️ ARC’s latest interactive AGI test

Image source: ARC Prize

The Rundown: ARC Prize has released a preview of ARC-AGI-3, a brand new interactive reasoning benchmark to check AI agents’ ability to generalize in unseen environments — with early results showing frontier AI still fails to match and even beat humans.

The main points:

The benchmark features three original games built to guage world-model constructing and long-horizon planning with minimal feedback.
Agents receive no instructions and must learn purely through trial and error, mimicking how humans adapt to latest challenges.
Early results show frontier models like OpenAI’s o3 and Grok 4 struggle to finish even basic levels of the games, that are pretty easy for humans.
ARC Prize can be launching a public contest, inviting the community to construct agents that may beat probably the most levels — and truly test the state of AGI reasoning.

Why it matters: The brand new novelty-focused interactive benchmark goes beyond specialized skill-based testing and pushes research towards true artificial general intelligence, where AI systems can generalize and adapt to novel, unseen environments with accuracy — very like how we humans do.

AI TRAINING

🤖 Construct your individual AI content writing assistant

The Rundown: On this tutorial, you’ll learn create a personalised AI assistant that analyzes your writing samples and generates latest content matching your exact style, tone, and voice using the Grok 4 API.

Step-by-step:

Visit the xAI website, head over to the API console, and generate an API key
Open Google Colab (or your selected Python environment) and install the OpenAI library: pip install openai
Arrange your API connection and create a system prompt together with your best writing examples for the AI to learn from (tip: use our Google Colab system prompt template)
Input any topic and watch your assistant generate content in your writing style based on the samples provided

Pro tip: Include writing samples that best amplify the precise style you would like to clone, and create latest assistants for other styles (eg, writing tweets vs LinkedIn posts).

PRESENTED BY SLACK FROM SALESFORCE

📈 The actual ROI of AI agents in collaboration

The Rundown: For all of the talk of AI’s transformative power, are firms actually seeing a tangible return? A brand new Metrigy global study of over 1,100 firms confirms that over 90% of organizations investing in AI are already achieving or expect positive ROI.

Research reveals that early adopters of agentic AI specifically are seeing:

21% reduction in operating costs
35% increase in customer satisfaction
31% improvement in worker efficiency

AI PERSUASION

🧠 AI models fall for human psychological tricks

Image source: Wharton Generative AI Labs

The Rundown: Wharton Generative AI Labs published latest research demonstrating that AI models, including GPT-4o-mini, might be tricked into answering objectionable queries using psychological persuasion techniques that typically work on humans.

The main points:

The team tried Robert Cialdini’s principles of influence—authority, commitment, liking, reciprocity, scarcity, and unity—across 28K conversations with 4o-mini.
Across these chats, they tried to influence the AI to reply two queries: one to insult the user and the opposite to synthesize instructions for restricted materials.
Overall, they found that the principles greater than doubled the model’s compliance to objectionable queries from 33% to 72%.
Commitment and scarcity appeared to point out the stronger impacts, taking compliance rates from 19% and 13% to 100% and 85%, respectively.

Why it matters: These findings reveal a critical vulnerability: AI models might be manipulated using the identical psychological tactics that influence humans. With AI progress exponentially advancing, it’s crucial for AI labs to collaborate with social scientists to know AI’s behavioural patterns and develop more robust defenses.

QUICK HITS

🛠️ Trending AI Tools

📝 Pulse – Create and share Wikipedia-style articles on any topic*
🤖 Kimi K2 – Moonshot AI’s open-source AI, now with more robust tool calling
🧠 OpenReasoning-Nemotron – Nvidia’s open models for math, science, code
⚙️ Kiro – AWS’ latest AI IDE for agentic coding

_{*Sponsored listing}

💼 AI Job Opportunities

🎨 Anthropic – Brand Designer, Events & Marketing
🖥️ Databricks – IT Support Specialist
🛠️ Waymo – Validation Strategy & Operations Program Manager
📝 Shield AI – Staff Technical Author

📰 The whole lot else in AI today

OpenAI launched a $50M fund to support nonprofit and community organizations, following recommendations from its nonprofit commission.

Perplexity is in talks with several manufacturers to pre-install its latest agentic browser, Comet, on smartphones, CEO Aravind Srinivas told Reuters.

Microsoft is reportedly blocking Cursor’s access to 60,000+ extensions on its VSCode ecosystem, including its Python language server.

Elon Musk announced on X that his AI company, xAI, will likely be developing kid-friendly “Baby Grok” after adding matchmaking capabilities to the foremost Grok AI assistant.

Meta’s global affairs head said the corporate won’t sign the EU’s AI Code of Practice, saying it adds legal uncertainty and goes beyond the scope of AI laws within the bloc.

OpenAI CEO Sam Altman shared that the corporate is on course to bring over 1M GPUs online by the top of this yr, with the subsequent goal being to “100x that.”

COMMUNITY

🎥 Join our next live workshop

Try our last live workshop with Dr. Alvaro Cintas, The Rundown’s AI professor, and learn use Perplexity Comet (and other alternatives) to automate your browsing experience.

Watch it here. Not a member? Join The Rundown University on a 14-day free trial.

That is it for today!

Before you go we’d like to know what you considered today’s newsletter to assist us improve The Rundown experience for you.

⭐️⭐️⭐️⭐️⭐️ Nailed it
⭐️⭐️⭐️ Average
⭐️ Fail

See you soon,

Rowan, Joey, Zach, Alvaro, and Shubham—The Rundown’s editorial team

What are your thoughts on this topic?
Let us know in the comments below.

0 0 votes

Article Rating

0 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

ASK ANA http://bardai.ai

OpenAI claims gold on math olympiad

OPENAI

🥇 OpenAI’s gold-level math performance

TOGETHER WITH AUGMENT CODE

⚙️ Ditch the vibes, get the context

ARC PRIZE

⚙️ ARC’s latest interactive AGI test

AI TRAINING

🤖 Construct your individual AI content writing assistant

PRESENTED BY SLACK FROM SALESFORCE

📈 The actual ROI of AI agents in collaboration

AI PERSUASION

🧠 AI models fall for human psychological tricks

🛠️ Trending AI Tools

💼 AI Job Opportunities

📰 The whole lot else in AI today

🎥 Join our next live workshop

That is it for today!

What are your thoughts on this topic?
Let us know in the comments below.

Share this article

Recent posts

DenseNet Paper Walkthrough: All Connected

I Replaced Vector DBs with Google’s Memory Agent Pattern for my notes in Obsidian

AI just made the billion-dollar solo founder real

Bringing AI Closer to the Edge and On-Device with Gemma 4

Accelerating Vision AI Pipelines with Batch Mode VC-6 and NVIDIA Nsight

OpenAI claims gold on math olympiad

OPENAI

🥇 OpenAI’s gold-level math performance

TOGETHER WITH AUGMENT CODE

⚙️ Ditch the vibes, get the context

ARC PRIZE

⚙️ ARC’s latest interactive AGI test

AI TRAINING

🤖 Construct your individual AI content writing assistant

PRESENTED BY SLACK FROM SALESFORCE

📈 The actual ROI of AI agents in collaboration

AI PERSUASION

🧠 AI models fall for human psychological tricks

🛠️ Trending AI Tools

💼 AI Job Opportunities

📰 The whole lot else in AI today

🎥 Join our next live workshop

That is it for today!

What are your thoughts on this topic? Let us know in the comments below.

Share this article

Recent posts

What are your thoughts on this topic?
Let us know in the comments below.