Hacker News

esafak · yesterday at 9:06 PM

It's not 'emergent' in the sense that it just happens; it's a byproduct of human feedback, and it can be neutralized.


Replies

cortesoft · yesterday at 9:44 PM

But isn’t the problem that if an LLM ‘neutralizes’ its sycophantic responses, then people will be driven to use other LLMs that don’t?

This is like suggesting a bar should help solve alcoholism by serving non-alcoholic beer to patrons who order too much. It won't solve alcoholism; it will just put the bar out of business.
