How social media encourages the worst of AI boosterism


Put your math hats on for a minute, and let’s take a look at what this beef from mid-October was about. It’s a perfect example of what’s wrong with AI right now.

Bubeck was excited that GPT-5 seemed to have somehow solved a number of puzzles known as Erdős problems.

Paul Erdős, one of the most prolific mathematicians of the twentieth century, left behind a great many puzzles when he died. To help keep track of which ones have been solved, Thomas Bloom, a mathematician at the University of Manchester, UK, set up erdosproblems.com, which lists more than 1,100 problems and notes that around 430 of them have solutions.

When Bubeck celebrated GPT-5’s breakthrough, Bloom was quick to call him out. “It is a dramatic misrepresentation,” he wrote on X. Bloom explained that a problem isn’t necessarily unsolved just because his website doesn’t list a solution. That simply means Bloom wasn’t aware of one. There are tens of millions of mathematics papers out there, and nobody has read all of them. But GPT-5 probably has.

It turned out that instead of coming up with new solutions to 10 unsolved problems, GPT-5 had scoured the web for 10 existing solutions that Bloom hadn’t seen before. Oops!

There are two takeaways here. One is that breathless claims about big breakthroughs shouldn’t be made via social media: less knee-jerk and more gut check.

The second is that GPT-5’s ability to find references to previous work that Bloom wasn’t aware of is also amazing. The hype overshadowed something that should have been pretty cool in itself.

Mathematicians are very interested in using LLMs to trawl through vast numbers of existing results, François Charton, a research scientist who studies the application of LLMs to mathematics at the AI startup Axiom Math, told me when I talked to him about this Erdős gotcha.
