logoalt Hacker News

phyzomeyesterday at 6:55 PM1 replyview on HN

Because the author is wrong, and LLMs don't actually work that way. Prompt injection cannot be fixed. Role boundaries are a bandaid you can apply, but attackers can work around it.


Replies

angry_octettoday at 3:52 AM

You can still build a system that isn't vulnerable by limiting the API the LLM can access. A process consuming untrusted comments for summarisation shouldn't have access to account private data, it should just deliver a summary report. Another process can them scan that and remove/disable links etc.