logoalt Hacker News

andrelaszloyesterday at 7:46 PM1 replyview on HN

Could anyone explain to me what the problem is with this? I thought I was fairly up to date on these things, but this was a surprise to me. I see the sibling comment getting downvoted but I promise I'm asking this in good faith, even if it might seem like a silly question (?) for some reason.


Replies

alistairSHyesterday at 9:18 PM

From what I'm reading in other comments, the problem was Claude1 got increasingly "frustrated" with Claude2's inability to do whatever the human was asking, and started breaking it's own rules (using ALL CAPS).

Sort of like MS's old chatbot that turned into a Nazi overnight, but this time with one agent simply getting tired of the other agent's lack of progress (for some definition of progress - I'm still not entirely sure what the author was feeding into Claude1 alongside errors from Claude2).