Hacker News

lambda · today at 1:41 AM

Why do you expect that a weighted random text generator will ever behave in a predictable way?

How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?

I can't believe how far people have fallen for this "AI" mania. You are giving a stochastic model that is easily misdirected the keys to all of your productive work.

I can understand the appeal to a degree: it can seem to do useful work sometimes.

But even so, you can't trust it with anything; not running it in a locked-down container that has access to nothing but a Git repo (with all important history stored elsewhere) seems crazy.

Shouting harder and harder at the statistical model might raise the probability of avoiding the bad behavior, but it's no guarantee; actually lock down your random text generator properly if you want to keep it from causing you problems.
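For what it's worth, the kind of lockdown described above can be sketched with a container runtime. This is a hypothetical example, assuming Docker; the image name and paths are made up:

```shell
# Hypothetical sketch of a locked-down agent container.
# --network none: no way to exfiltrate credentials or fetch injected content.
# --read-only: immutable root filesystem, with scratch space only in a tmpfs.
# Only the single repo is mounted; no SSH keys, no ~/.gitconfig, no tokens.
docker run --rm -it \
  --network none \
  --read-only --tmpfs /tmp \
  -v "$HOME/projects/myrepo:/work" \
  -w /work \
  some-agent-image
```

With `--network none` the agent can still read and edit the mounted repo, but cannot push anywhere; a human pulls the changes out of the mount and reviews them before they go near real credentials.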

And of course, given that you've seen how hard it is to get it to follow these instructions properly, you are reviewing every line of output code thoroughly, right? Because you can't trust that either.


Replies

alwillis · today at 7:14 AM

Claude Code hooks are deterministic; the agent can’t bypass them [1].

For example, you can force a linter or a test suite to run after every change.

Claude Code defaults to running in a sandbox on macOS and Linux. Claude Cowork runs in a Linux VM.

[1]: https://code.claude.com/docs/en/hooks-guide
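As a concrete illustration, a hook that runs a linter after every file edit might look like this. This is a sketch based on the hooks guide linked above; the matcher and the lint command are assumptions about one particular project, not a drop-in config:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "npm run lint"
          }
        ]
      }
    ]
  }
}
```

Because the hook is plain config executed by the harness rather than text in the prompt, the model can't talk its way around it.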

rimunroe · today at 2:53 AM

> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

> This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?

I don’t understand why people are so chill about doing this. I have AI running on a dedicated machine which has absolutely no access to any of my own accounts/data. I want that stuff hardware isolated. The AI pushes up work to a self-hosted Gitea instance using a low-permission account. This setup is also nice because I can determine provenance of changes easily.
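A sketch of that kind of flow, assuming an HTTPS remote on a self-hosted Gitea instance; the hostname, bot account, token variable, and branch names here are all hypothetical:

```shell
# On the isolated AI machine: push via a low-permission bot account.
# The token grants write access to this one repository and nothing else.
git remote add gitea "https://ai-bot:${GITEA_TOKEN}@gitea.internal/me/project.git"
git push gitea ai-work

# On the trusted machine: fetch and inspect before merging, so every
# AI-authored change is reviewed and its provenance is recorded in git.
git fetch gitea
git log --oneline main..gitea/ai-work
git diff main...gitea/ai-work
```

Since the bot account is the only path from the AI machine into the Gitea instance, anything authored by that account is unambiguously machine-generated, which is what makes provenance easy to determine.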

ex-aws-dude · today at 3:26 AM

The answer is that, for these people, it looks predictable most of the time, so they start to trust it.

The tool is so good at mimicking that even smart people start to believe it.

matkoniecz · today at 4:49 AM

> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

Because it is much easier to do, and the failure rate is quite low.

(not saying that it is a good idea)

cruffle_duffle · today at 3:06 AM

> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

Because it’s insanely useful when you give it access, that’s why. They can do way more tasks than just write code. They can make changes to the system, set up and configure routers and network gear, probe all the IoT devices on the network, set up DNS, you name it: anything that is text or has a CLI is fair game.

The models absolutely make catastrophic fuckups, though, and that is why we’ll have to both train the models better and put non-annoying safeguards in front of them.

Running them on isolated computers that are fully air-gapped, require approval for all reads and writes, and can only operate inside directories named after colors of the rainbow is not a useful suggestion. I want my cake and I want to eat it too. Giving these tools some real access is far too useful to give up.

It doesn’t make me naive or stupid to hand the keys over to the robot. I know full well what I’m getting myself into and the possible consequences of my actions. And I have been burned but I keep coming back because these tools keep getting better and they keep doing more and more useful things for me. I’m an early adopter for sure…