This is funny because it’s a silly topic, but I think it shows something extremely seriously wrong w...

iterateoften • today at 4:08 AM • 5 replies • view on HN

This is funny because it’s a silly topic, but I think it shows something extremely seriously wrong with llms.

The goblins stand out because it’s obvious. Think of all the other crazy biases latent in every interaction that we don’t notice because it’s not as obvious.

Absolutely terrifying that OpenAI is just tossing around that such subtle training biases were hard enough to contain it had to be added to system prompt.

Replies

ninjagoo • today at 4:13 AM

> Absolutely terrifying that OpenAI is just tossing around that such subtle training biases were hard enough to contain it had to be added to system prompt.

May I introduce you to homo sapiens, a species so vulnerable to such subtle (or otherwise) biases (and affiliations) that they had to develop elaborate and documented justice systems to contain the fallouts? :)

➕ show 2 replies

snakebiteagain • today at 5:01 AM

Mandatory reading on that topic: www.anthropic.com/research/small-samples-poison

We're probably not noticing a LOT of malicious attempts at poisoning major AI's only because we don't know what keywords to ask (but the scammers do and will abuse it).

tptacek • today at 4:22 AM

I think it's extraordinarily telling that people are capable of being reflexively pessimistic in response to the goblin plague. It's like something Zitron would do.

This story is wonderful.

➕ show 1 reply

bitexploder • today at 4:36 AM

We do not have the complete picture.

ordinarily • today at 4:14 AM

Doesn't seem that surprising or terrifying to me. Humans come equipped with a lot more internal biases (learned in a fairly similar fashion), and they're usually a lot more resistant to getting rid of them.

The truly terrifying stuff never makes it out of the RLHF NDAs.

➕ show 2 replies

alt Hacker News

Replies