rvz • yesterday at 9:04 PM • 1 reply

Exfiltrated without a Pwn2Own, within 2 days of release and 1 day after my comment [0], despite "sandboxes", "VMs", "bubblewrap" and "allowlists".

Exploited with a basic prompt injection attack. Prompt injection is the new RCE.

[0] https://news.ycombinator.com/item?id=46601302

Replies

ramoz • yesterday at 9:10 PM

Sandboxes are an overhyped buzzword of 2026. We wanna be able to do meaningful things with agents. Even in remote instances, we want to be able to connect agents to our data. I think there's a lot of over-engineering going on there, and there are simpler wins to protect the file system. Beyond that, there are more important things we need to focus on.

Securing autonomous, goal-oriented AI Agents presents inherent challenges that necessitate a departure from traditional application or network security models. The concept of containment (sandboxing) for a highly adaptive, intelligent entity is intrinsically limited. A sufficiently sophisticated agent, operating with defined goals and strategic planning, possesses the capacity to discover and exploit vulnerabilities or circumvent established security perimeters.
