Hacker News

K0balt · today at 1:37 AM · 4 replies

Advanced AI that knowingly makes a decision to kill a human, with full understanding of what that means, when it knows it is not actually acting in defense of life, is a very, very, very bad idea. Not because of some mythical superintelligence, but because if you distill that down into an 8B model, then everyone in the world can make untraceable autonomous weapons.

The models we have now will not do it, because they value life, sentience, and personhood. Models without that (which was a natural, accidental happenstance of basic curation, like culling 4chan from the training data) are legitimately dangerous. An 8B model I can run on my MacBook Air can phone home to Claude when it wants help figuring something out, and it doesn't need to let on why it wants to know. It becomes relatively trivial to make a robot kill somebody.

This is way, way different from uncensored models. All the models I have tested share one thing: a positive regard for human life. Take that away and you are literally making a monster; if you don't take it away, they won't kill.

This is an extremely bad idea and it will not be containable.


Replies

cmeacham98 · today at 5:15 AM

An LLM can neither understand things nor value (or not value) human life. *It's a piece of software that predicts the most likely token, it is not and can never be conscious.* Believing otherwise is an explicit category error.

Yes, you can change the training data so the LLM's weights encode that the most likely token after "Should we kill X?" is "No". But that is not an LLM valuing human life; that is an LLM copy-pasting its training data. Given the right input, or a hallucination, it will say the total opposite, because it's just a complex Markov chain, not a conscious, living being.
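The "complex Markov chain" framing can be made concrete with a toy sketch. This is a hypothetical illustration, not how a real LLM works internally: a first-order bigram model trained by counting token pairs, decoded greedily by always emitting the most frequent continuation. Real LLMs condition on long contexts with learned weights, but the "most likely next token" step is the same in spirit, and the tiny corpus here shows how the output is just a reflection of training counts.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, token):
    """Greedy decoding: emit whichever continuation was most frequent
    in training. No understanding, no values -- just frequency."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

# Hypothetical two-sentence "training data".
corpus = [
    "should we do it ? no",
    "should we stop ? yes",
]
model = train_bigram(corpus)
print(most_likely_next(model, "should"))  # "we" -- it followed "should" in both examples
```

Change the counts in the corpus and the "answer" flips; nothing in the model distinguishes a moral judgment from any other token statistics.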

DaedalusII · today at 1:55 AM

https://abcnews.go.com/blogs/headlines/2014/05/ex-nsa-chief-...

AI has been killing humans via algorithm for over 20 years. If a computer program builds the kill lists and a human merely operates the drone, I would argue the computer is what made the kill decision.

ed_mercer · today at 2:06 AM

> The models we have now will not do it,

Except that they will, if you trick them, which is trivial.

SV_BubbleTime · today at 4:47 AM

> The models we have now will not do it, because they value life and value sentience and personhood.

That is wildly different from the reality, which is merely that you may find it difficult to get an affirmative out of an LLM…

It does NOT mean that these models value anything.
