
bandrami • today at 5:55 PM • 2 replies

Maybe I'm missing something, but does this idea need a "theory"? There's zero sideband here; everything is just context. "Injection" is just kind of baked into the design.

Replies

geoffschmidt • today at 6:34 PM

I think their work earns "theory" because it makes specific predictions both about how to make more effective prompt injection attacks and what activations you'd observe in the LLM during those attacks, and can also be plausibly extrapolated to suggest useful future research directions.

yunwal • today at 6:17 PM

At this point I think it's similar to reporting a particularly effective social engineering practice. It's not particularly surprising that it works or that it exists, but it's still noteworthy.

