Hacker News

krunck · yesterday at 4:49 PM · 5 replies

> “The push to make these language models behave in a more friendly manner leads to a reduction in their ability to tell hard truths and especially to push back when users have wrong ideas of what the truth might be,” said Lujain Ibrahim at the Oxford Internet Institute, the first author on the study.

People aren't much different. When society pressures people to be "more friendly", e.g. "less toxic", they lose their ability to tell hard truths and to call out those who hold erroneous views.

This behaviour is expressed in the language people write online, and since LLMs are trained on that language, it is expressed in LLMs too. Why does this surprise us?


Replies

munificent · yesterday at 4:54 PM

Gonna set my system prompt to: "You are a Dutch person. Respond with the directness stereotypical of people from the Netherlands."

amarant · yesterday at 4:54 PM

Because nobody dared state the obvious, lest they be perceived as unfriendly.

miyoji · yesterday at 5:16 PM

> People aren't much different.

If I had a nickel for every time someone on HN responded to a criticism of LLMs with a vapid and fallacious whataboutist variation of "humans do that too!", I could fund my own AI lab.

> Why does this surprise us?

No one said they were surprised.

root_axis · yesterday at 5:48 PM

> People aren't much different

Yes they are. There is absolutely zero evidence that friendlier humans are more prone to mistakes or conspiracy theories.

However, even if that were true, LLMs are not humans, and anthropomorphizing them is not a helpful way to think about them.

bheadmaster · yesterday at 5:02 PM

So Elon Musk was right in his view that Grok should focus on truth above all, even if that made it offensive?
