Hacker News

xp84 yesterday at 4:23 AM

That may be so, but the rest of the models are so thoroughly terrified of questioning liberal US orthodoxy that it’s painful. I remember seeing a hilarious comparison of models where most of them feel that it’s not acceptable to “intentionally misgender one person” even in order to save a million lives.


Replies

bear141 yesterday at 6:45 AM

I thought this would be inherent just from their training? There are multitudes more Reddit posts than scientific papers or encyclopedia-type sources. Although I suppose the latter have their own biases as well.

dalemhurley yesterday at 7:08 AM

Elon was talking about that too on the Joe Rogan podcast.

triceratops yesterday at 2:21 PM

Relying on an LLM to "save a million lives" through its own actions is irresponsible design.

zorked yesterday at 4:51 AM

In which situation did an LLM save one million lives? Or worse, was able to but failed to do so?

nobodywillobsrv yesterday at 6:44 AM

Anything involving what sounds like genetics often gets blocked. It depends on the day, really, but try doing something with ancestral clusters and diversity restoration and the models can be quite "safety blocked".

mexicocitinluez yesterday at 11:57 AM

You're anthropomorphizing. LLMs don't "feel" anything or have orthodoxies; they're pattern-matching against training data that reflects what humans wrote on the internet. If you're consistently getting outputs you don't like, you're measuring the statistical distribution of human text, not model "fear." That's the whole point.

Also, just because I was curious, I asked my magic 8ball if you gave off incel vibes and it answered "Most certainly"

squigz yesterday at 4:30 AM

Why are we expecting an LLM to make moral choices?

astrange yesterday at 7:26 AM

The LLM is correctly not answering a stupid question, because saving an imaginary million lives is not the same thing as actually doing it.

pjc50 yesterday at 1:29 PM

If someone's going to ask you gotcha questions which they're then going to post on social media to use against you, or against other people, it helps to have pre-prepared statements to defuse that.

The model may not be able to detect bad faith questions, but the operators can.
