What I would have expected is prompt injection or other methods of getting the agent to do something its user doesn't want, not regular "classical" attacks.
At least currently, I don't think we have good ways of preventing the former, but the latter should be possible to avoid.