CSMastermind • yesterday at 8:23 PM • 4 replies

I realize this is supposed to be a post about how scary the security vulnerabilities these agents will find are.

But personally I love when agents do things like this and appreciate the help. Last thing in the world I want is for them to nerf the models.

Replies

SonOfLilit • yesterday at 9:04 PM

It's not about hacking capabilities, it's about misalignment. More like the golem myth (told it to fetch some water, drowned a city) than the Gollum myth (used ring, ring hacked his brain, now he's a crazy violent meth addict).

nicoburns • yesterday at 9:42 PM

In this case I think it's Docker that needs to be nerfed, not the models. The fact that there's a backdoor to getting root access on the machine would be a problem even if you weren't running LLMs on it.

sweezyjeezy • yesterday at 8:58 PM

I know it's unlikely to be the case, but in the sci-fi story this would be exactly the kind of comment the Codex agent would leave, trying to avoid interference in its master plans.

eddythompson80 • yesterday at 10:09 PM

It's the now-classic "Sorry I drowned little Timothy. Here is a breakdown of what happened" followed by "Let me try to respawn little Timothy on a new map".
