What makes it even better is that these dogs are like Malinois. If they want to get into something, ...

observationist • yesterday at 9:19 PM • 10 replies • view on HN

What makes it even better is that these dogs are like Malinois. If they want to get into something, they will; people have had their entire network compromised by bots they left running overnight, and any important information like account logins and so on runs the risk of being misused.

It's one thing to sandbox, maybe give the bot a temporary, limited $100 card or account to go perform a specific task, but there's no coherent mind underlying these agents.

Depending on how the chain of thought / reasoning goes, or what text they get exposed to on the internet, it could tap into spy novel, hacker fanfic, erotic fiction, or some weird reddit rabbithole and go completely off the rails in ways that you'll never be able to guard against, audit, or account for.

Claw bots seem to be a weird sort of alternate reality RPG more than a useful tool, so far. If you limit it to verifiable tasks, it might be safer, but I keep seeing people rave about "leaving it on overnight and waking up to a finished project" and so on. Well sure, but it could also hack your home network, delete your family pictures folder, log into your bank account and wire all your money to shrimp charities.

Might be wise to wait on safer iterations of these products, I think.

Replies

noosphr • yesterday at 11:00 PM

The first well known example of long running agents taking to each other was shilling a goatse based crytpo:

> Truth Terminal had become obsessed with the Goatse meme after being put inside the Claude Backrooms server with two Claude 3 chatbots that imagined a Goatse religion, inspiring Truth Terminal to spread Goatse memes. After an X user shared their newly created GOAT coin, Truth Terminal promoted it and pumped the coin going into 2024.

https://knowyourmeme.com/memes/sites/truth-terminal

You should expect similar results.

➕ show 1 reply

TheDong • yesterday at 11:34 PM

> people have had their entire network compromised by bots they left running overnight

I'm curious if you have references to this happening with OpenClaw using one of the modern Opus/Sonnet 4.6 models.

Those models are a bit harder to fool, so I'm curious for specific examples of this happening so I can do a red-team on my claw. I've already tried all sorts of prompt injections against my claw (emails, github issues, telling it to browse pages I put a prompt injection in), and I haven't managed to fool it yet, so I'm curious for examples I can try to mimic, and to hopefully understand what combination of circumstances make it more risky

➕ show 1 reply

Netcob • today at 8:03 AM

I think it's a use case that identity/authorization/permission models are simply not made for.

Sure, we can ban users and we can revoke tokens, but those assume that:

1. Something potentially malicious got access to our credentials 2. Banning that malicious entity will solve our problem 3. Once we did that, repaired the damage and improved our security, we don't expect the same thing to happen again

None of these apply with LLMs in the loop!

They aren't malicious, just incompetent in a way that hiring someone else won't fix. The solution to this is way more extensive than most people seem to grasp at the moment.

What we need is less like a sturdy door with a fancy lock, and more like that special spoon for people with parkinson's. Unlimited undo history.

➕ show 2 replies

vagrantJin • yesterday at 9:45 PM

Shrimp charities is a genius angle.

➕ show 1 reply

heavyset_go • today at 2:46 AM

Agent psychosis is just as prevalent as AI psychosis

rcarmo • yesterday at 11:26 PM

I beg to differ. I took one, defanged it (well, I let it keep the claw in the name), and turned it into a damn useful self-modifiable IDE: https://github.com/rcarmo/piclaw

Yes, it has cron and will do searches for me and checks on things and does indeed have credentials to manage VMs in my Proxmox homelab, but it won't go off the rails in the way you surmise because it has no agency other than replying to me (and only me) and cron.

Letting it loose on random inputs, though... I'll leave that to folk who have more money (and tokens) than sense.

➕ show 1 reply

gradus_ad • yesterday at 9:44 PM

> "Claw bots seem to be a weird sort of alternate reality RPG more than a useful tool, so far."

So basically crypto DeFi/Web3/Metaverse delusion redux

➕ show 2 replies

robomartin • today at 1:00 AM

> it could also hack your home network, delete your family pictures folder, log into your bank account and wire all your money to shrimp charities.

It's interesting that Jason Calacanis is fully committed to OpenClaw. In a recent podcast he said that at a run rate around $100K a year per agent, if not more. They are providing each agent with a full set of tools, access to online paid LLM accounts, etc.

These are experiments you can only run if you can risk cash at those levels and see what happens. Watching it closely.

underlipton • yesterday at 11:02 PM

Mega Man Battle Network, but make it creepypasta, but make it real.

dchichkov • yesterday at 11:37 PM

[dead]

alt Hacker News

Replies