>because there's ultimately no defense Kind of? It's not fixable as a spherica...

orbital-decay • yesterday at 11:23 PM • 1 reply • view on HN

>because there's ultimately no defense

Kind of? It's not fixable as a spherical class of attacks in vacuum, but you can do a lot to mitigate particular cases, and in most cases you can patch unnecessary side channels for the injection to reach the context in an unintended way.

Replies

keepamovin • yesterday at 11:40 PM

Isn’t it trivially fixable by having a monitor LLM? The monitor just reviews each turn pair and asks, “Is this conversation being manipulated via prompt injection?”

➕ show 2 replies

alt Hacker News

Replies