Hacker News

dotancohen yesterday at 9:07 PM

The real problem is that there is nothing novel here. Variants of this type of attack were clear from the beginning.


Replies

lxgr yesterday at 9:26 PM

What I would have expected is prompt injection or other methods to get the agent to do something its user doesn't want it to, not regular "classical" attacks.

At least currently, I don't think we have good ways of preventing the former, but the latter should be possible to avoid.
