It's impossible to solve. A sufficiently capable agent can control a device that records the user's screen and drives their keyboard/mouse, and current LLMs basically pass the Turing test.
IMO it's not worth solving anyway. Why do sites have CAPTCHAs?
- To prevent spam, use rate limiting, proof-of-work, or micropayments. To prevent fake accounts, use identity.
- To replace ad revenue, use micropayments (web ads are already circumvented by uBlock and co).
- To prevent cheating in games, use skill-based matchmaking or friend-group-only matchmaking (e.g. only match with friends, friends of friends, etc., assuming people don't friend cheaters), and require eSport players to record themselves during competitions when they're not in person.
What other reasons are there? (I'm genuinely interested and it may reveal upcoming problems -> opportunities for new software.)
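The proof-of-work option above is simple to sketch, hashcash-style. This is a minimal illustration with made-up function names, not any site's actual scheme: the server issues and verifies challenges for almost nothing, while the client pays the hashing cost.

```python
import hashlib
import os

def make_challenge() -> str:
    # Issuing a challenge is one cheap call to the OS RNG.
    return os.urandom(8).hex()

def solve(challenge: str, difficulty: int) -> int:
    # Brute-force a nonce whose SHA-256 starts with `difficulty` zero hex
    # digits. Expected cost: ~16**difficulty hashes.
    nonce = 0
    target = "0" * difficulty
    while not hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest().startswith(target):
        nonce += 1
    return nonce

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    # Verification is a single hash, regardless of difficulty.
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

challenge = make_challenge()
nonce = solve(challenge, difficulty=4)  # ~65k hashes: negligible for one signup
assert verify(challenge, nonce, 4)
```

The asymmetry (one hash to verify, thousands to solve) is the whole point; whether it actually deters a server farm is exactly what's argued below.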
I had a simple game website with a sign-up form that asked for nothing but an email address. It went years with no issues. Then, suddenly: hundreds of signups with random email addresses, every single day.
The sign-up form only exists to link saved state to an account so a user can access their game history later; there are no gated features. I have no clue what anyone could possibly gain from doing this, other than getting email providers to mark my domain as spam (which they successfully did).
The site doesn't make any money and had only about one legit visitor a week, so I just put a Cloudflare CAPTCHA in front of it and called it a day.
Google at least uses captchas to gather training data for computer vision ML models. That's why they show pictures of stop lights and buses and motorcycles - so they can train self-driving cars.
It's absolutely possible to solve; you're just not seeing the solution because you're looking only for technical solutions.
These situations will commonly be characterized by one hundred-billion-dollar company's computer systems abusing the computer systems of another hundred-billion-dollar company. There are existing laws that have things to say about this.
There are legitimate technical problems in this domain when it comes to adversarial AI access, and those we will need to solve. But they don't characterize the vast majority of situations here. The vast majority will be resolved by businessmen and lawyers, not engineers.
It's not impossible to solve; it's just that doing so may require compromising anonymity. Just require users (humans, bots, AI agents, ...) to provide a secure ID of some sort. For a human it could be something you apply for once and install on your PC/phone, accessible to the browser.
Of course people can fake it, just as they fake other kinds of ID, but it would at least mean that officially sanctioned agents from OpenAI/etc would need to identify themselves.
It's amazing that you propose "just X" to three literally unsolved problems. Where's this micropayment platform? Where's the ID which is uncircumventable and preserves privacy? Where's the perfect anti-cheat?
I suggest you go ahead and make these; you'll make a boatload of money!
I agree with you on how websites should work (particularly on the micropayments front), but I don't agree that it's impossible to solve. I just think things are going to get a LOT worse on the ownership and freedom front: they will push Web Integrity-style DRM and further roll out signed secure boot. At that point, the same attention-monitoring solution that already exists and already works in self-driving cars to ensure the human driver is watching the road can use the now-ubiquitous front-facing meeting/selfie camera to ensure there is a human watching the ads.
You can't prevent spam like that. Rate limiting: keyed on what? IP address? Botnets make that irrelevant.
Proof of work? Bots are infinitely patient and scale horizontally; your users are neither. It doesn't work.
Micropayments: No such scheme exists.
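The keying problem is concrete. A standard token-bucket limiter (minimal sketch below; class name, rates, and the example IPs are all made up) has to pick some key, and a botnet simply spreads its requests across keys so no single bucket ever empties:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-key limiter: `rate` requests/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        # Each new key starts with a full bucket.
        self.state = defaultdict(lambda: (capacity, time.monotonic()))

    def allow(self, key: str) -> bool:
        tokens, last = self.state[key]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.state[key] = (tokens - 1.0, now)
            return True
        self.state[key] = (tokens, now)
        return False

limiter = TokenBucket(rate=1.0, capacity=5.0)
# One IP hammering the endpoint exhausts its bucket after the burst...
single_ip = [limiter.allow("203.0.113.1") for _ in range(10)]
# ...but ten botnet IPs each get a fresh bucket and are never limited.
botnet = [limiter.allow(f"203.0.113.{i}") for i in range(10, 20)]
```

Ten thousand bots making one request each look, to this limiter, like ten thousand polite users.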
> current LLMs basically pass the Turing test
I will bet $1000 at even odds that I can discern a model from a human, given a 2-hour window to chat with both and assuming the human acts in good faith.
Any takers?
It's not impossible. Websites will ask for an iris scan as a means of auth to verify that you are human. The scanners will be provided by Apple/Google, governed by local law, and integrated into your phone. There will be a global database of every human iris to fight AI abuse, since an AI can't fake the creation of a baby. Passkeys and email/passwords will be a thing of the past soon.
On a basic level, to protect against DDoS-type stuff, aren't CAPTCHAs cheaper to generate than they are for AI server farms to solve, in pure power-consumption terms?
So maybe that's a partial answer: anti-AI barriers simply being too expensive for AI spam farms to deal with, you know, once the bottomless VC money disappears?
It's back to encryption: make the cracking inordinately expensive.
Otherwise we are headed for de-anonymization of the internet.
internet ads exist because people refuse to pay micropayments.
People just confidently stating stuff like "current LLMs basically pass the Turing test" makes me feel like I've secretly been given much worse versions of all the LLMs in some kind of study. It's so divorced from my experience of these tools that I genuinely don't understand how my experience can be so far from yours, unless "basically" is doing a lot of heavy lifting here.