I built an experimental system to test whether an autonomous AI can propose actions freely but be st...

Fr0_Tech • yesterday at 9:00 PM • 0 replies • view on HN

I built an experimental system to test whether an autonomous AI can propose actions freely but be structurally prevented from executing side-effectful actions without explicit authorization. The system consistently blocks unsafe filesystem, shell, and network operations and produces a trace and diff proving nothing changed, even under adversarial pressure. The goal was to see if refusal and non-action can be enforced and verified at the execution layer rather than relying on prompts or logging.

alt Hacker News