Operating systems should prevent privilege escalations, antiviruses should detect viruses, police should catch criminals, Claude should detect prompt injections, ponies should vomit rainbows.
I believe the detection pattern may not be the best choice in this situation, as a single miss could result in significant damage.
I don't think those are all equivalent. It's not plausible to have an antivirus that protects against unknown viruses. It's necessarily reactive.
But you could totally have a tool that lets you use Claude to interrogate and organize local documents while running inside a firewalled sandbox that can only connect to the official API.
Or like how FIDO2 and passkeys make it so we don't really have to worry about users typing their password into a lookalike page on a phishing domain.
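The reason a lookalike page gets nothing useful out of a passkey is origin binding: the browser writes the page's real origin into the client data that gets signed, and the server rejects anything that doesn't match. A rough Python sketch of just that server-side field check, assuming a hypothetical site at `https://example.com`; the function name and constant are made up for illustration, and signature verification and the RP ID hash check are deliberately left out:

```python
import json

# Assumption: the relying party's real origin; purely illustrative.
EXPECTED_ORIGIN = "https://example.com"


def client_data_is_acceptable(client_data_json: bytes, expected_challenge: str) -> bool:
    """Check the WebAuthn clientDataJSON fields for a login assertion.

    Only the field checks are sketched here; a real server must also verify
    the authenticator's signature and the RP ID hash.
    """
    try:
        data = json.loads(client_data_json)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    if not isinstance(data, dict):
        return False
    return (
        data.get("type") == "webauthn.get"
        and data.get("challenge") == expected_challenge
        and data.get("origin") == EXPECTED_ORIGIN
    )
```

A response produced on `https://examp1e.com` carries that origin in its client data, so it fails the last comparison no matter how convincing the page looked to the user.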
Operating systems do prevent some privilege escalations, antiviruses do detect some viruses, ..., ponies do vomit some rainbows?? One is not like the others...
Claude doesn't have to prevent injections. Claude should make injections ineffective and design the interface appropriately. There are existing sandboxing solutions that would help here, and they aren't being used yet.
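For the firewalled-sandbox idea mentioned upthread, the policy itself is tiny: deny every outbound connection unless it is HTTPS to the one allowed API host. A minimal Python sketch of that allowlist decision, assuming `api.anthropic.com` is the only permitted host; `egress_allowed` is a hypothetical helper, and real enforcement would live in the firewall or sandbox layer rather than in application code:

```python
from urllib.parse import urlsplit

# Assumption: the single host the sandbox is allowed to talk to.
ALLOWED_HOSTS = frozenset({"api.anthropic.com"})


def egress_allowed(url: str) -> bool:
    """Return True only for HTTPS requests to an allowlisted host on port 443."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    return (
        parts.scheme == "https"
        and parts.hostname in ALLOWED_HOSTS
        and port in (None, 443)
    )
```

With a default-deny rule like this, an injected instruction to send documents to some other server fails at the network boundary, whether or not anything detected the injection.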