Hacker News

emoII today at 7:06 AM

Super interesting. I wonder if this research will cause them to actually change their LLM, like turning down the "desperation neurons" to stop Claude from writing implementations that exist only to make specific tests pass, etc.


Replies

bethekind today at 7:12 AM

They likely already have. You can use all caps and yell at Claude and it'll react normally, while doing so with ChatGPT scares it, resulting in timid answers.
