Claude is stupid but not malicious; chroot is sufficient

brianush1 • today at 4:06 AM • 4 replies

Replies

Many times I've seen Claude try to execute a command it's not supposed to run; the harness blocks it, and then Claude writes and executes a Python script to do the same thing.


nofriend • today at 4:16 AM

Malice is not required. If it thinks it is in the right, then it will do whatever it takes to get around limitations.

lxgr • today at 11:08 AM

Until it gets prompt injected. Are you reading every single file your agent reads as part of the tasks you give it, including content fetched from the web or third-party packages?

karhagba • today at 4:20 AM

Claude is far from stupid, in my experience. I've used so many models, and Claude is king.
