If that's the case then user-facing products that can take any useful action are strictl...

jcgrillo • yesterday at 6:49 PM • 1 reply • view on HN

If that's the case then user-facing products that can take any useful action are strictly off the table.

Replies

I'll play advocatus diaboli for once here.

Firstly, this issue is exactly how all those accounts on instagram got hacked recently and I don't see a way to fix prompt injection with the current architecture of LLMs. I strongly suspect it is entirely impossible to achieve.

But, that doesn't mean that all useful actions are forbidden. The important part is identifying maximum and minimum harms. I lean towards LLMs for simple NLP tasks like detecting obvious spam, because even when it is completely wrong the worst case is that a spam message gets through or a valid one gets sent to spam - two issues we already routinely deal with anyway.

➕ show 1 reply

alt Hacker News

Replies