logoalt Hacker News

orbital-decayyesterday at 11:23 PM1 replyview on HN

>because there's ultimately no defense

Kind of? It's not fixable as a spherical class of attacks in vacuum, but you can do a lot to mitigate particular cases, and in most cases you can patch unnecessary side channels for the injection to reach the context in an unintended way.


Replies

keepamovinyesterday at 11:40 PM

Isn’t it trivially fixable by having a monitor LLM? The monitor just reviews each turn pair and asks, “Is this conversation being manipulated via prompt injection?”

show 2 replies