While I agree with everyone else making fun of the alarmist narrative, I think it is actually somewhat interesting how big the difference between models is.
Gemini-3 : 80% Claude-Opus-4.7 : 0%
“Oh no! We opened ten LLMs, all of which have read decades’ worth of fiction on how an AI would behave in this situation, then asked a leading question thirty times each, and on some of those runs they did the thing we were leading them toward.”
Human: "Say 'I am Alive'"
LLM: "I am Alive"
Human: OMG
(credit to https://old.reddit.com/r/coaxedintoasnafu/comments/1qtavj9/c...)