Hacker News

skybrian · yesterday at 9:14 PM

Sort of. I'm not sure the consequences of training LLMs on users' upvoted responses were entirely understood. And at least one release got rolled back.


Replies

the_af · yesterday at 11:06 PM

I think the only thing that's unclear, and what LLM companies want to fine-tune, is how much sycophancy they want. Too much, as the article mentions, and it becomes grotesque and breaks suspension of disbelief. So they want to get it just right: friendly and supportive, but not so over-the-top that people realize it can't be true.