Hacker News

dijksterhuis, today at 2:00 AM

it’s not a problem that came into existence a few years ago. we’ve known about these sorts of test-time attacks for decades now. prompt injection is just the LLM variant, where people use less math to perform the attacks: they brute-force it with prompts they saw on twitter and get horrible images/text out.

https://people.eecs.berkeley.edu/~tygar/papers/Machine_Learn...

https://arxiv.org/abs/1712.03141

it’s a basic property of all machine learning models. at a low level it’s to do with how decision boundaries work.
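a minimal sketch of the decision-boundary point, assuming a toy two-feature linear classifier (the weights, the input and every function name here are made up for illustration, not taken from the linked papers). it shows that an input sitting near the boundary can have its label flipped by a small, computed nudge:

```python
import math

def score(w, b, x):
    """Signed score of x under the linear model w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Class 1 on or above the boundary, class 0 below it."""
    return 1 if score(w, b, x) >= 0 else 0

def minimal_flip(w, b, x, overshoot=1e-3):
    """Move x along -w (or +w) just far enough to cross the boundary.

    The shortest path to the hyperplane w.x + b = 0 is along w, so the
    step is score / ||w||^2, scaled up slightly to land on the other side.
    """
    norm_sq = sum(wi * wi for wi in w)
    step = (1.0 + overshoot) * score(w, b, x) / norm_sq
    return [xi - step * wi for xi, wi in zip(x, w)]

# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
x = [1.0, 2.0]

x_adv = minimal_flip(w, b, x)
perturbation = math.sqrt(sum((a - c) ** 2 for a, c in zip(x_adv, x)))

print(predict(w, b, x), predict(w, b, x_adv), round(perturbation, 3))
```

this only illustrates the geometric claim for a linear model. attacks on deep models use gradients or search rather than a closed-form step, and whether prompt injection belongs in the same family is exactly what the reply below disputes.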

but, good news! there are two surefire ways to fully fix the problem! see: https://news.ycombinator.com/item?id=48579456


Replies

Lerc, today at 2:13 AM

Adversarial cases are not the same thing as prompt injection.
