On the mathematical abilities of LLMs

Published

May 9, 2026

A very interesting new blog post by Tim Gowers on his recent experiences with ChatGPT 5.5 Pro. A headline: “I would judge the level of the result that ChatGPT found in under two hours to be that of a perfectly reasonable chapter in a combinatorics PhD.” Well worth reading.

Added. Of the comments to Gowers’ post, the most sensible I have so far seen is this one by Chad Topaz.


Daniel M Gessell comments: I’m retired and choose to avoid using ML-generated AI software as much as I can. I’m neutral on it – it’s a hobby to try to solve problems without using what, to me, are “black box” solutions found through ML.

But it sure looks like more and more new proofs are going to be discovered with the help of ML-based AI, and it makes me wonder about the future of Mathematics as an academic pursuit – and about the future of academic pursuits in general.

My suspicion is massive change is coming, but maybe not in my lifetime. If it’s not this “wave” of AI advancements, in a few decades another wave will come.

Márcio Palmares comments: Ian Stewart, prophetically, in 1975, in a book on modern mathematics (before we dismiss him for now writing popular science and recreational mathematics, let us remember that Saunders Mac Lane himself recommends Stewart’s book on Galois Theory among the references in his Algebra textbook):

“Even in pure mathematics the computer has scored some notable triumphs, especially in the study of finite groups. However, very few problems are suitable for computation; and even some of those that are would take too long to perform, even for today’s very fast machines (or tomorrow’s, for that matter).

The uses of computers are not confined to numerical problems. Computers have been programmed to play draughts (well) and chess (badly), to translate from one language to another (execrably), to compose music (of sorts) and poetry. Some of the recent advances in producing ‘intelligent’ machines are quite remarkable.

This brings me naturally to the oft-asked question ‘Can computers think?’ As Joad would have said, it all depends what you mean by ‘think’. As yet, the computer can perform some of the functions of the human brain faster and more accurately; others it cannot perform at all. But if we ask, ‘Is there something special about the way in which human beings think which in principle can never be performed by some kind of machine?’ then my personal opinion is that the answer is ‘No’. Certainly we cannot duplicate the functions of the brain at the present time; and it is fairly certain that the resemblance between the brain and existing computers is about as close as that between a cow and a milk-lorry. Our technology may well never get anywhere near making a truly ‘intelligent’ machine: the human brain may well be too stupid. But I don’t think there is any obstacle to the production of a machine which performs the functions of the human brain; not any logical obstacle such as prevents √2 from being rational or a man from lifting himself by his bootstraps; for the following reason: the human body is visibly a machine, in the sense that it is composed out of matter and the components obey the same laws as other matter. It is a very complicated and wonderful machine which we don’t understand. If there were in principle an obstacle to the construction of machines which behaved like people, then there would be no people.

This is not to reduce humanity to the level of a can-opener. Many people insist that the complexities of human behaviour, the emotional, creative, and spiritual attributes, must be consequences of something ‘greater’ than physical laws. This is a wonderful concept. How much more wonderful it would be, however, if these very attributes were consequences of physical laws. Far from demeaning humanity, this would elevate physics!”

Concepts of Modern Mathematics, pp. 267-268. Dover Books.

Rowsety Moid comments: There’s now a more impressive result. Tim Gowers:

AI has now solved a major open problem — one of the best known Erdos problems called the unit distance problem, one of Erdos’s favourite questions and one that many mathematicians had tried.

Solved by finding a counterexample. That matters because one of the reasons human mathematicians didn’t find it may be that they thought the conjecture was true.

OpenAI:

The result is also notable for how it was found. The proof came from a new general purpose reasoning model, rather than from a system trained specifically for mathematics, scaffolded to search through proof strategies, or targeted at the unit distance problem in particular. As part of a broader effort to test whether advanced models can contribute to frontier research, we evaluated it on a collection of Erdős problems. In this case, it produced a proof resolving the open problem.

Useful comment from Thomas Bloom in the companion paper, Remarks on the Disproof of the Unit Distance Conjecture:

On examining the construction, it becomes more clear how people had missed this before – it requires the confluence of several different unlikely events: that a good mathematician is (1) spending significant time in thinking about the unit distance conjecture in the first place; (2) seriously trying to disprove it, despite the oft-repeated belief of Erdős that it is true; (3) believes that there is mileage in generalising the original construction to other number fields, and so is willing to expend significant time in exploring such constructions; and (4) sufficiently familiar with the relevant parts of class field theory to recognise that the appropriately phrased question about infinite towers of number fields with appropriate parameters can be solved using existing theory.

The AI met all of these criteria, and its success here echoes previous achievements: it often produces the most surprising results by persevering down paths that a human may have dismissed as not worth their time to explore, combining superhuman levels of patience with familiarity with a vast array of technical machinery.

Daniel M Gessell comments: Márcio Palmares, thank you for sharing that quote from Ian Stewart – the questions it addresses are now being asked almost daily in major newspapers. And you’ve given me another book for my TBR pile. I also view biology as machinery – one day, physical devices we call “computers” may be literally grown in the factory, perhaps guided or seeded with the 3D printing approach being explored for synthetic organs.

From a formal computational perspective, it seems the human brain, augmented with an infinite lifetime and supply of pencils and paper, would be Turing Complete. Can it do more? Can the laws of physics, whatever they may truly be, compute the incomputable?

Rowsety Moid comments: More Erdos problems fall. Przemek Chojecki on Twitter:

Another 9 open Erdos problems solved, this time by DeepMind team.

Interesting loop of LLM – Lean agents working autonomously, and only after it’s verified formally, going through human review.

Paper: Advancing Mathematics Research with AI-Driven Formal Proof Search

I think this is less “intelligent” than the disproof of the Unit Distance Conjecture, because it looks like it relies more on search, generating proofs until it finds one that isn’t rejected by Lean. Also, the proofs thus found come already checked. I suspect that’s where we’re headed, with human mathematicians not checking AI-written proofs and instead just trying to understand them.