Adversarial cases are not the same thing as prompt injection.

Lerc • today at 2:13 AM • 1 reply • view on HN

Replies

adversarial examples, or test-time attacks, were a whole field of machine learning security way before LLMs came around.

the idea is to give the model a specially crafted bad input at inference time so the attacker can get some nasty output, potentially defeating any existing defences in the process. [0]
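
to make the description above concrete, here is a minimal sketch of a classic (pre-LLM) adversarial example against a toy linear classifier. everything in it is an assumption for illustration: the weights, the input, and the helper names are made up, and the perturbation is a simplified version of the fast gradient sign idea (for a linear model the gradient of the score with respect to the input is just the weight vector).

```python
# Toy linear classifier; weights, bias and input are invented for illustration.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.5


def score(x):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def predict(x):
    return 1 if score(x) > 0 else 0


def sign(v):
    return (v > 0) - (v < 0)


def adversarial(x, eps):
    """Nudge every feature by at most eps in the direction that moves
    the score toward the opposite class (sign-of-gradient step)."""
    direction = -1 if predict(x) == 1 else 1
    return [xi + direction * eps * sign(w) for xi, w in zip(x, WEIGHTS)]


x = [1.0, 0.5, 1.0]
x_adv = adversarial(x, eps=0.5)

print(predict(x), predict(x_adv))
```

the input only changes by 0.5 per feature, but the predicted class changes. prompt injection has the same shape: the "perturbation" is text, and the thing being flipped is the model's adherence to its instructions.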

in “modern LLM lingo”, defence = guardrails and/or system prompts.

prompts used for prompt injection are a form of adversarial example (people just like inventing new terminology when a new fad comes along).

[0]: i wrote the above myself about adversarial examples, but i’ve just checked OWASP’s listing on prompt injection and it’s pretty close: https://owasp.org/www-community/attacks/PromptInjection


