logoalt Hacker News

Fr0_Techyesterday at 9:00 PM0 repliesview on HN

I built an experimental system to test whether an autonomous AI can propose actions freely but be structurally prevented from executing side-effectful actions without explicit authorization. The system consistently blocks unsafe filesystem, shell, and network operations and produces a trace and diff proving nothing changed, even under adversarial pressure. The goal was to see if refusal and non-action can be enforced and verified at the execution layer rather than relying on prompts or logging.