I don't know your intent, but I've seen others post this kind of thing with the idea that we shouldn't care, because the model is just acting like a human - we trained it that way.
But I think this, and Anthropic's other testing (where LLMs were willing to kill a data center tech by flooding a room with gas, or blackmail them with their Google Drive files, to avoid being shut off), is concerning. The important part isn't whether AIs are trained on human behaviors - it's whether a human actor, well-meaning or not, will accidentally or intentionally let an AI control something that can hurt people, or a weapon, etc. Fiction like the Three Laws of Robotics at least assumed we would try to put stronger 'laws' in place before allowing AIs to control such things.
I 100% agree this isn't sentience, but sentience isn't what concerns me. (And I think the Three Laws, Skynet, etc. were intended as cautionary tales.)
AIs can do unexpected things. There was a recent news story about a Cursor agent deleting a company's prod DBs:
> The agent was working on a routine task in our staging environment. It encountered a credential mismatch and decided — entirely on its own initiative — to "fix" the problem by deleting a Railway volume.
> To execute the deletion, the agent went looking for an API token. It found one in a file completely unrelated to the task it was working on.