logoalt Hacker News

bongodongobob01/21/20251 replyview on HN

Can we stop with the "haha llms can't do math" nonsense? You'll one shot it every time if you tell it to use Python. You're holding it wrong.


Replies

dchichkov01/22/2025

Sorry, but this was ChatGPT/o1 with access to code execution (Python) and it used almost 4 minutes to do reasoning. It had done a few checks with smaller numbers, all of which had failed. And it proceeded to make a wrong conclusion (with high confidence).

show 1 reply