Hacker News

vidarh · last Sunday at 1:48 PM

The odds of carrying out a prompt injection attack or gaining meaningful control through the training data seem low enough that we're firmly in Russell's teapot territory: extraordinary evidence is required to show it is even possible. The exception is if you suspect your LLM provider itself, in which case you have far bigger problems and no exploit of the training data is necessary.


Replies

grafmax · last Sunday at 4:59 PM

You need to consider all the users of the LLM, not a specific target. Such attacks are broad rather than targeted, a bit like open source library attacks, which also once seemed improbable but are now widespread.