Seems to me that this is just social engineering applied to LLMs, right?
I already have to raise quite a bit of awareness among humans to not trust external sources and to do a risk-based assessment of requests. We need less trust for answering a service desk question than we need for paying a large invoice.
I believe we should develop the same type of model for agents. Let them do simple things with low trust requirements, but risky things (like running an untrusted script with root privileges) only after the request has been thoroughly checked.
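
To make that concrete, here is a rough sketch of what such a risk-based gate could look like. Everything in it is made up for illustration: the `Risk` and `Trust` levels, the `REQUIRED_TRUST` mapping, and the action names are assumptions, not any existing agent framework's API.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    """Hypothetical risk tiers for an agent action."""
    LOW = 1     # e.g. answering a service desk question
    MEDIUM = 2
    HIGH = 3    # e.g. paying a large invoice, running a script as root


class Trust(IntEnum):
    """Hypothetical trust levels for the request behind an action."""
    UNVERIFIED = 1     # came from an external, unchecked source
    AUTHENTICATED = 2  # requester identity was confirmed
    REVIEWED = 3       # request was thoroughly checked, e.g. by a human


# Assumed policy: the minimum trust needed for each risk tier.
REQUIRED_TRUST = {
    Risk.LOW: Trust.UNVERIFIED,
    Risk.MEDIUM: Trust.AUTHENTICATED,
    Risk.HIGH: Trust.REVIEWED,
}


@dataclass(frozen=True)
class Action:
    name: str
    risk: Risk


def is_allowed(action: Action, trust: Trust) -> bool:
    """Allow the action only if the request's trust meets the tier's minimum."""
    return trust >= REQUIRED_TRUST[action.risk]


answer_question = Action("answer_service_desk_question", Risk.LOW)
run_root_script = Action("run_untrusted_script_as_root", Risk.HIGH)

print(is_allowed(answer_question, Trust.UNVERIFIED))  # low risk, low trust is fine
print(is_allowed(run_root_script, Trust.UNVERIFIED))  # high risk, blocked
print(is_allowed(run_root_script, Trust.REVIEWED))    # high risk, but checked
```

The hard part in practice is probably not the gate itself but deciding who assigns the risk tier and the trust level, since an agent that classifies its own requests can be talked into misclassifying them.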