That may be so, but the rest of the models are so thoroughly terrified of questioning liberal US orthodoxy that it’s painful. I remember seeing a hilarious comparison of models where most of them feel that it’s not acceptable to “intentionally misgender one person” even in order to save a million lives.
Elon was talking about that too on the Joe Rogan podcast.
Relying on an LLM to "save a million lives" through its own actions is irresponsible design.
In which situation did an LLM save one million lives? Or worse, was able to but failed to do so?
Anything involving what sounds like genetics often gets blocked. It depends on the day, really, but try doing something with ancestral clusters and diversity restoration and the models can be quite "safety blocked".
You're anthropomorphizing. LLMs don't 'feel' anything or have orthodoxies; they're pattern matching against training data that reflects what humans wrote on the internet. If you're consistently getting outputs you don't like, you're measuring the statistical distribution of human text, not model 'fear.' That's the whole point.
Also, just because I was curious, I asked my magic 8-ball if you gave off incel vibes and it answered "Most certainly".
The LLM is correctly not answering a stupid question, because saving an imaginary million lives is not the same thing as actually doing it.
If someone's going to ask you gotcha questions that they're then going to post on social media to use against you, or against other people, it helps to have prepared statements to defuse that.
The model may not be able to detect bad-faith questions, but the operators can.
I thought this would be inherent just from their training? There are multitudes more Reddit posts than scientific papers or encyclopedia-type sources. Although I suppose the latter have their own biases as well.