jahooma • 11/07/2024 • 2 replies

Yes, this is a good point. I think not asking to run commands is maybe the most controversial choice we've made so far.

The reason we don't ask for human review is simply: we've found that it works fine to not ask.

We've had a few hundred users so far, and people are usually skeptical of this at first, but as they use it they find that they don't want it to ask for every command. It enables cool use cases where Codebuff can iterate by running tests, seeing the error, attempting a fix, and running them again.
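
To make that loop concrete, here is a minimal sketch in Python. It is not Codebuff's actual implementation: `iterate_until_green`, `attempt_fix`, and `max_fixes` are hypothetical names, and `attempt_fix` stands in for whatever step edits the code based on the failure output.

```python
import subprocess
from typing import Callable, Sequence


def iterate_until_green(
    test_cmd: Sequence[str],
    attempt_fix: Callable[[str], None],
    max_fixes: int = 5,
) -> bool:
    """Run test_cmd; on failure, pass the output to attempt_fix and rerun.

    Returns True as soon as the tests pass, or False if they still fail
    after max_fixes fix attempts. All names here are illustrative.
    """
    for attempt in range(max_fixes + 1):
        # Run the tests without prompting, capturing output for the fixer.
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        if attempt < max_fixes:
            # Hypothetical step: an agent would edit files here.
            attempt_fix(result.stdout + result.stderr)
    return False
```

A call might look like `iterate_until_green(["pytest", "-q"], attempt_fix=my_fixer)`, where `my_fixer` is whatever applies a candidate change. Capping the number of fix attempts is a choice made for this sketch so the loop always terminates.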

If you use source control like git, I also think that it's very hard for things to go wrong. Even if it ran rm -rf from your project directory, you should be able to undo that.

But here's the other thing: it won't do that. Claude is trained to be careful about this stuff and we've further prompted it to be careful.

I think not asking to run commands is the future of coding agents, so I hope you will at least entertain this idea. It's ok if you don't want to trust it; we're not asking you to do anything you are uncomfortable with.

Replies

israrkhan • 11/07/2024

I am not afraid of rm -rf on a whole directory. I am afraid of the other things it can do to my machines: leaking my SSH keys, cookies, personal data, network devices, and making persistent modifications (malware) to my system. Or maybe inadvertently messing with my Python version, or globally installing some library that messes up the whole system.

boratanrikulu • 11/07/2024

> it won't do that. Claude is trained to be careful about this stuff and we've further prompted it to be careful.

Could you please explain a bit how you are sure about it?
