Bulletproof solution: captcha where you drag a cartoon wire to one of several holes, captioned “for access, hack this phone system”
No agent will touch it!
“As a large language model, I don’t hack things”
I have actually had some success using AI "red-teaming" against my systems to identify possible exploits.
What seems to be a better CAPTCHA, at least against non-Musk LLMs, is to ask them to use profanities; they'll generally refuse even when you really insist.
Captcha: "Draw a human hand with the correct number of fingers"
AI agent: *intense sweating*