Hacker News

dist-epoch today at 8:50 AM

All of these discussions of models being "nerfed" remind me of discussions among audiophiles: "this cable sounds so much better than this other one, it's night and day, Ferrari versus Honda Civic."

Yet when you do blind tests they can't tell the difference between a $1000 cable and a $1 one.

I bet if you do blind tests between GPT-5.3, 5.4 and 5.5 most would struggle to tell them apart, yet they are certain that "5.5 was nerfed 1 week after release, it's so obvious, it was John Carmack, now it can barely write a for loop"


Replies

anentropic today at 8:58 AM

Exactly this. And it's not really possible to do repeatable trials; it's all just vibes. People have very little awareness of their own cognitive biases.
