logoalt Hacker News

Davidzhengtoday at 7:36 PM2 repliesview on HN

"An internal scaffolded version of GPT‑5.2 then spent roughly 12 hours reasoning through the problem, coming up with the same formula and producing a formal proof of its validity."

When I use GPT 5.2 Thinking Extended, it gave me the impression that it's consistent enough/has a low enough rate of errors (or enough error correcting ability) to autonomously do math/physics for many hours if it were allowed to [but I guess the Extended time cuts off around 30 minute mark and Pro maybe 1-2 hours]. It's good to see some confirmation of that impression here. I hope scientists/mathematicians at large will be able to play with tools which think at this time-scale soon and see how much capabilities these machines really have.


Replies

mmaundertoday at 7:40 PM

Yes and 5.3 and the latest codex cli client is incredibly good across compactions. Anyone know the methodology they're using to maintain state and manage context for a 12 hour run? It could be as simple as a single dense document and its own internal compaction algrorithm, I guess.

show 1 reply
slopusilatoday at 8:26 PM

after those 30 min you can manually ask it again to continue working on the problem

show 1 reply