logoalt Hacker News

jychangyesterday at 11:14 PM2 repliesview on HN

Isn't that exactly what stopping SQL injection involves? No longer executing random SQL code.

Same thing would work for LLMs- this attack in the blog post above would easily break if it required approval to curl the anthropic endpoint.


Replies

stavrosyesterday at 11:17 PM

No, that's not what's stopping SQL injection. What stops SQL injection is distinguishing between the parts of the statement that should be evaluated and the parts that should be merely used. There's no such capability with LLMs, therefore we can't stop prompt injections while allowing arbitrary input.

show 1 reply
Xirdusyesterday at 11:34 PM

SQL injection is possible when input is interpreted as code. The protection - prepared statements - works by making it possible to interpret input as not-code, unconditionally, regardless of content.

Prompt injection is possible when input is interpreted as prompt. The protection would have to work by making it possible to interpret input as not-prompt, unconditionally, regardless of content. Currently LLMs don't have this capability - everything is a prompt to them, absolutely everything.

show 1 reply