How is the second LLM not also vulnerable from prompt injection? In order to supervise the first, it...

snailmailman • today at 4:37 PM • 2 replies • view on HN

How is the second LLM not also vulnerable from prompt injection? In order to supervise the first, it must receive data (presumably output from the first LLM?). All generated output after the user input is in the context should be considered possibly compromised/prompt injected. Having a second LLM just adds more obfuscation, but prompt injection could be chained.

Replies

j_w • today at 6:00 PM

That's when you bust out the third LLM. Nobody expects the fourth LLM to be the REAL LLM in the chain.

tweetle_beetle • today at 5:15 PM

Quis custodiet ipsos custodes?

alt Hacker News

Replies