Hacker News

dchichkov 01/22/2025

Sorry, but this was ChatGPT/o1 with access to code execution (Python), and it spent almost 4 minutes reasoning. It ran a few checks with smaller numbers, all of which failed, and then proceeded to draw a wrong conclusion (with high confidence).


Replies

bongodongobob 01/22/2025

Of course it failed. Tell it to write a program.