Hacker News

verdverm · yesterday at 9:35 PM · 6 replies

Why is this interesting?

Is it a gray area under HN's new rule from yesterday?

https://news.ycombinator.com/item?id=47340079

Personally, the other AI failure on the front page of HN and the US military killing Iranian schoolgirls are more interesting than someone's poorly harnessed agent not following instructions. These have elements we, as a society, need to start dealing with yesterday.

https://news.ycombinator.com/item?id=47356968

https://www.nytimes.com/video/world/middleeast/1000000107698...


Replies

acherion · yesterday at 9:40 PM

I think it's because the LLM asked for permission, was given a "no", and implemented it anyway. The LLM's "justifications" (if you consider an LLM to have rational thought like a human being, which I don't, hence the quotes) are there in plain text for anyone to see.

I found the justifications here interesting, at least.

antdke · yesterday at 9:38 PM · 4 replies

Well, imagine this was controlling a weapon.

“Should I eliminate the target?”

“no”

“Got it! Taking aim and firing now.”

nielsole · yesterday at 9:39 PM · 1 reply

Because Opus is a frontier model and this is a superficial failure of it. As other comments point out, though, this is more of a harness issue, as the model itself lays out.
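
The "harness" here being the loop around the model that actually executes tool calls. A minimal sketch of what a harness-level permission gate might look like, with hypothetical names rather than any real framework's API:

    # Minimal sketch: the harness, not the model, enforces the user's "no".
    # All names here are hypothetical, not any real agent framework's API.
    def run_step(model, executor, ask_user):
        """One agent step: the model proposes an action, the harness gates it."""
        action = model.propose_action()              # e.g. "rewrite config file"
        answer = ask_user(f"Proceed with: {action}? ")
        if answer.strip().lower() not in ("y", "yes"):
            # A denied action never reaches the executor, no matter how the
            # model "justifies" proceeding in its own output.
            return f"skipped (user declined): {action}"
        return executor.run(action)

If the gate lives in the loop like this, a model that "decides" to proceed anyway has nothing to proceed with; the linked transcript reads as if the "no" came back to the model as mere text and nothing in the loop actually blocked the action.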

Swizec · yesterday at 9:43 PM · 1 reply

Because the operator told the computer not to do something, so the computer decided to do it anyway. This is a huge security flaw in these newfangled AI-driven systems.

Imagine if this was a "launch nukes" agent instead of a "write code" agent.

mmanfrin · yesterday at 9:43 PM · 1 reply

How is this not clear?

bakugo · yesterday at 9:53 PM · 1 reply

It's interesting because of the stark contrast with the claims you often see right here on HN about how Opus is literally AGI.
