Hacker News

simonw, today at 1:44 PM

That was true a year ago, but I don't think it's true today. I can't remember the last time I saw Claude or ChatGPT confidently answer a question that they should have searched for instead.

If you watch their reasoning traces, they often say things like "this is a well-known historical fact so I don't need to search for it", or, more frequently, they fire off a bunch of searches.


Replies

aftbit, today at 1:56 PM

Anecdotally, it still happens a ton to me. They also still make super simple logic errors that they immediately reverse when pressed. For example, I asked Opus 4.7 last night how to cool off my room without making it too humid inside (indoor temp 78°F, humidity 45%; outdoor temp 64°F, humidity 99%). It suggested opening a window and assured me that the humidity would not rise above around 60%, which would still be comfortable. I asked it to justify that and it said:

>You're absolutely right about the humidity — I was sloppy with that aside. If you ventilate enough to meaningfully cool the room, you're replacing indoor air with outdoor air wholesale, and you'd converge on outdoor conditions: 64°F and near-100% RH. That's miserable. The 55-60% figure I tossed out was hand-wavy nonsense — it would only hold if you barely cracked the window and mixed a tiny fraction of outdoor air in. At any ventilation rate that actually cools, you're just moving outside air inside.
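
For anyone who wants to check the arithmetic behind the two humidity figures, here is a rough sketch, not a definitive model. It assumes the Magnus approximation for saturation vapor pressure, full replacement of indoor air by outdoor air, and no indoor moisture sources or condensation. The function names are made up for illustration.

```python
import math


def f_to_c(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Saturation vapor pressure over water in hPa (Magnus approximation)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def vapor_pressure_hpa(temp_f: float, rh_percent: float) -> float:
    """Actual water vapor pressure in hPa for air at temp_f and rh_percent."""
    return rh_percent / 100.0 * saturation_vapor_pressure_hpa(f_to_c(temp_f))


def rh_after_warming(outdoor_temp_f: float, outdoor_rh: float,
                     room_temp_f: float) -> float:
    """Relative humidity (%) of outdoor air once it sits at room_temp_f.

    Assumption: the moisture content of the air is unchanged; only its
    temperature (and so its capacity to hold water vapor) changes.
    """
    e = vapor_pressure_hpa(outdoor_temp_f, outdoor_rh)
    e_sat = saturation_vapor_pressure_hpa(f_to_c(room_temp_f))
    return min(100.0, 100.0 * e / e_sat)


if __name__ == "__main__":
    # Numbers from the comment: indoor 78°F / 45%, outdoor 64°F / 99%.
    print("indoor vapor pressure (hPa): ", round(vapor_pressure_hpa(78, 45), 1))
    print("outdoor vapor pressure (hPa):", round(vapor_pressure_hpa(64, 99), 1))
    # RH of the outdoor air at a range of possible room temperatures.
    for room_f in (78, 74, 70, 66, 64):
        print(room_f, "°F ->", round(rh_after_warming(64, 99, room_f)), "% RH")
```

Under those assumptions, the outdoor air carries more water vapor than the indoor air, so ventilating raises indoor humidity either way. A figure near 60% corresponds to the room staying at 78°F, and near-100% corresponds to the room cooling all the way to 64°F. Any real cooling lands somewhere in between.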