> Never mind the fact that AIs of the LLM-variety haven't and aren't going to find solu...

sdoering • yesterday at 10:11 PM • 2 replies • view on HN

> Never mind the fact that AIs of the LLM-variety haven't and aren't going to find solutions to mathematical problems.

This is empirically wrong as of early 2026.

Since Christmas 2025, 15 Erdos problems have been moved from "open" to "solved" on erdosproblems.com, 11 of them crediting AI models. Problems #397, #728, and #729 were solved by GPT-5.2 Pro generating original arguments (not literature lookups), formalized in Lean, and verified by Terence Tao himself. Problem #1026 was solved more or less autonomously by Harmonic's Aristotle model in Lean.

At IMO 2025, three separate systems (Gemini Deep Think, an OpenAI system, and Aristotle) independently achieved gold-medal performance, solving 5 of 6 problems.

DeepSeek-Prover-V2 hits 88.9% on MiniF2F-test. Top models solve 40% of postdoc-level problems on FrontierMath, up from 2%.

Tao's own assessment as of March 2026: AI is "ready for primetime" in math and theoretical physics because it "saves more time than it wastes."

You can disagree about where this is heading, but "haven't and aren't going to" doesn't survive contact with the data.

Replies

fredoliveira • yesterday at 10:28 PM

Indeed. And adding on to this, in a slightly different realm, Donald Knuth's conjecture that he solved with Claude: https://www-cs-faculty.stanford.edu/%7Eknuth/papers/claude-c...

spzb • yesterday at 11:15 PM

> solved more or less autonomously

So, not autonomously.

➕ show 1 reply

alt Hacker News

Replies