Hacker News

vbezhenar yesterday at 8:59 PM (4 replies)

Operating systems should prevent privilege escalations, antiviruses should detect viruses, police should catch criminals, Claude should detect prompt injections, ponies should vomit rainbows.


Replies

viraptor yesterday at 10:27 PM

Claude doesn't have to prevent injections. Claude should make injections ineffective and design the interface appropriately. There are existing sandboxing solutions which would help here and they don't use them yet.

nezhar yesterday at 10:11 PM

I believe the detection pattern may not be the best choice in this situation, as a single miss could result in significant damage.

eli yesterday at 9:24 PM

I don't think those are all equivalent. It's not plausible to have an antivirus that protects against unknown viruses. It's necessarily reactive.

But you could totally have a tool that lets you use Claude to interrogate and organize local documents but inside a firewalled sandbox that is only able to connect to the official API.

Or like how FIDO2 and passkeys make it so we don't really have to worry about users typing their password into a lookalike page on a phishing domain.
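
To make the firewalled-sandbox idea above concrete, here is a minimal sketch of the allowlist policy such a sandbox might apply to outbound requests. Everything in it is an assumption for illustration: the function name `is_egress_allowed`, the `ALLOWED_HOSTS` set, and the choice of `api.anthropic.com` as the single permitted host. A real sandbox would enforce this at the network layer (firewall rules or an egress proxy) rather than in application code; this only shows the decision logic.

```python
from urllib.parse import urlsplit

# Assumption: the one host the sandbox may reach. Adjust to the real endpoint.
ALLOWED_HOSTS = frozenset({"api.anthropic.com"})


def is_egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs are denied rather than guessed at.
        return False
    if parts.scheme != "https":
        return False
    # hostname drops any userinfo and port, so "allowed.host@evil.example"
    # resolves to "evil.example" and is rejected.
    host = (parts.hostname or "").lower().rstrip(".")
    return host in allowed_hosts
```

The point of the design is that it is deny-by-default: a prompt injection that convinces the model to send data to some other host fails at the boundary, regardless of whether the injection was detected.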

pegasus yesterday at 9:26 PM

Operating systems do prevent some privilege escalations, antiviruses do detect some viruses, ..., ponies do vomit some rainbows?? One is not like the others...