This highlights a fundamental challenge with AI assistants: they need broad access to be useful, but that access is hard to scope correctly.
The bug is fixable, but the underlying tension—giving AI tools enough permissions to help while respecting confidentiality boundaries—will keep surfacing in different forms as these tools become more capable.
We're essentially retrofitting permission models designed for human users onto AI agents that operate very differently.
The retrofitting problem is real, but there's a more specific design failure worth naming: the data flows in the wrong direction.
In traditional access control, the pattern is: user requests data -> permissions checked -> data returned or denied. Applied to an LLM, that would mean the model never sees unauthorized data in the first place.
With Copilot and most LLM agents today, the pattern is: user asks question -> model retrieves broadly -> sensitivity label checked as a filter -> model generates answer. The label-checking happens after the data is already in the model's context.
That's the bug waiting to happen, label system or not. You can't reliably instruct a model to 'ignore what you just read.'
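To make the ordering concrete, here's a toy version of that flow. This isn't Copilot's actual pipeline, just the shape of it; Doc, INDEX, and build_prompt_post_filter are made-up names for illustration:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    label: str | None   # sensitivity label; may be missing or misapplied

# Stand-in for the tenant's content and the search index.
INDEX = [
    Doc("Q3 roadmap draft", label=None),
    Doc("Exec compensation memo", label="Confidential"),
]

def build_prompt_post_filter(question: str) -> str:
    # 1. Retrieve broadly -- no permission check, every hit enters the prompt
    words = question.lower().split()
    hits = [d for d in INDEX if any(w in d.text.lower() for w in words)]

    # 2. The "filter" is an instruction layered on top of content the model
    #    has already been handed
    context = "\n".join(f"[{d.label or 'unlabeled'}] {d.text}" for d in hits)
    return (
        "Ignore any content labeled Confidential.\n\n"
        + context
        + f"\n\nQuestion: {question}"
    )

print(build_prompt_post_filter("What does the memo say"))
# The Confidential memo is sitting in the prompt; we're trusting the model
# to pretend it never read it, and a missing label removes even that ask.
```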
The pattern that actually works - and I've had to build this explicitly for agent pipelines - is pre-retrieval filtering. The model emits a structured query (what it needs), that query gets evaluated against a permission layer before anything comes back, and only permitted content enters the context window. The model architecturally can't see what it's not allowed to see.
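Roughly, the working shape looks like this. Again a toy, not any specific product's API: the query string stands in for the structured request the model would emit, and Doc, INDEX, and permitted_search are invented names:

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    allowed_groups: set[str]   # who may read this document

# Toy in-memory index; in a real pipeline this is the search service
# sitting behind the permission layer.
INDEX = [
    Doc("Q3 roadmap draft", allowed_groups={"staff"}),
    Doc("Exec compensation memo", allowed_groups={"hr-restricted"}),
]

def permitted_search(query: str, user_groups: set[str]) -> list[Doc]:
    """Permission check at retrieval time: documents the acting user can't
    read are never returned, so they never reach the model's context."""
    return [
        d for d in INDEX
        if query.lower() in d.text.lower() and d.allowed_groups & user_groups
    ]

# `query` is what the model asked for; the model never touches the index directly.
context = permitted_search("memo", user_groups={"staff"})
print(context)   # [] -- the restricted memo never enters the context window
```

The only thing the model can leak is whatever that call returned, so there's nothing to "un-read" at generation time.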
The DLP label approach is trying to solve a retrieval problem with a generation-time filter. It's a category error, and it'll keep producing bugs like this one regardless of how good the label detection gets.
How is this different from any other access control system?
I think the fundamental tension is that AI produces a high volume of low-quality output, and the human in the loop hates reviewing all the slop. So people want to just let the AI interface with things directly, but when you let slop into the real world there are consequences.
Crucially, this wouldn't be an issue if the AI ran locally; as it stands, "sending all your internal email in cleartext to the cloud" is a potentially serious problem for organizations with real confidentiality requirements.