Hacker News

addandsubtract, today at 1:44 AM

Why would the model know what "Claude-like" is?


Replies

valleyer, today at 2:12 AM

Well, in part because the phenomenon has been discussed on Web forums that (a) have at this point made their way back into training data and (b) are accessible in Web searches that the model can invoke. And in part because the model can "know" what its initial instinct is and "decide" to go against it.
