
js8 today at 7:54 AM

A very human thing to do: not telling us which model failed like this! They are not all alike; from what I observe, some are an order of magnitude better at this kind of stuff than others.

I believe how "neurotypical" (for lack of a better word) you want a model to be is a design choice. (But I also believe model traits such as sycophancy, some hallucinations, or moral transgressions can be a side effect of training the model to be subservient. It is similar with humans: they tend to do these things when they are forced to perform.)


Replies

nialse today at 7:56 AM

Codex in this case. I didn't even think to mention it. I'll update the post if it's actually relevant, which I guess it is.

EDIT: It's specifically GPT-5.4 High in the Codex harness.
