kstenerud • today at 5:28 AM

There are two primary issues to solve:

1: Protecting against bad things (prompt injections, overeager agents, etc.)

2: Containing the blast radius (preventing agents from even reaching sensitive things)

The companies building the agents make a best-effort attempt at #1 (guardrails, permissions, etc.) and do nothing about #2. That's why I use https://github.com/kstenerud/yoloai for everything now.

Replies

AbanoubRodolf • today at 6:07 AM

The blast radius problem is the one that actually gets exploited. Prompt injection defenses are fighting the model's core training to be helpful, so you're always playing catch-up. Blast radius reduction is a real engineering problem with actual solutions, and almost nobody applies them before something goes wrong.

The clearest example is in agent/tool configs. The standard setup grants filesystem write access across the whole working directory plus shell execution, because that's what the scaffolding demos need. Scoping down to exactly what the agent needs requires thinking through the permission model before deployment, which most devs skip.

A model that can only read specific directories and write to a staging area can still do 90% of the useful work. Any injection that lands just doesn't reach anything sensitive.
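
To make the "read specific directories, write to a staging area" idea concrete, here is a minimal sketch of a path allowlist that a tool wrapper could consult before touching the filesystem. The `PathPolicy` class and its method names are hypothetical, not part of any agent framework, and an in-process check like this only illustrates the permission model; actual containment would still have to be enforced at the OS or sandbox level.

```python
from pathlib import Path


class PathPolicy:
    """Hypothetical allowlist: read from some roots, write only to staging roots."""

    def __init__(self, read_roots, write_roots):
        # Resolve once so later comparisons use absolute, normalized paths.
        self.read_roots = [Path(p).resolve() for p in read_roots]
        self.write_roots = [Path(p).resolve() for p in write_roots]

    @staticmethod
    def _inside(path, roots):
        # resolve() collapses ".." and follows existing symlinks, so a path
        # like staging/../secrets is judged by where it really points.
        resolved = Path(path).resolve()
        return any(resolved == root or root in resolved.parents for root in roots)

    def can_read(self, path):
        # Anything writable is also readable in this sketch.
        return self._inside(path, self.read_roots) or self._inside(path, self.write_roots)

    def can_write(self, path):
        return self._inside(path, self.write_roots)
```

A tool handler would call `can_read` or `can_write` before every file operation and refuse anything outside the allowlist, so an injected instruction to read a credentials file fails the check regardless of what the model was persuaded to attempt.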
