Hacker News

znnajdla today at 5:33 AM | 7 replies

I absolutely refuse to use BigTech gatekeepers or useless CAPTCHAs (any sufficiently advanced bot can get around any CAPTCHA anyway). We solved this at our startup by running names through a simple LLM filter - if the name is gibberish like Px2846skxojw, just block the signup. Worked surprisingly well. Of course this is easy to get around if the bot knows what you’re doing. But bots look for easy targets; as long as there are enough vibe-coded crap targets on the internet, they’re not going to bother circumventing a carefully designed app.


Replies

snowe2010 today at 5:50 AM

Then you’re also blocking legitimate users who don’t want to be tracked and use services like iCloud’s Hide My Email.

avian today at 8:15 AM

> We solved this at our startup by running names through a simple LLM filter - if the name is gibberish like Px2846skxojw just block the signup.

I hope "LLM thinks your name is gibberish" won't become the new "your name can't include invalid characters".

steezeburger today at 6:12 AM

This doesn't seem like a very good solution, to be honest. And why use an LLM for this? What if I want a legit random-ass string as my username?

tholm today at 5:47 AM

Using an LLM for this seems excessive when there are well-established algorithms for detecting high-entropy strings.
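
For illustration, one such approach is character-level Shannon entropy. The Python below is only a rough sketch of that idea, not anything described in the thread: the function names `shannon_entropy` and `looks_random` and the 3.3 bits-per-character threshold are assumptions picked for this example, and a real filter would probably need more signals than this.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(text: str, threshold: float = 3.3) -> bool:
    # The threshold is a hypothetical value chosen for illustration only;
    # it would need tuning against real signup data.
    return shannon_entropy(text) > threshold


if __name__ == "__main__":
    for candidate in ("Px2846skxojw", "Alexander"):
        print(candidate, round(shannon_entropy(candidate), 3), looks_random(candidate))
```

Note that the entropy of a string of length n can never exceed log2(n), so with this threshold anything shorter than 10 characters is never flagged, while long legitimate names with many distinct letters can be. That false-positive risk is the same one raised elsewhere in this thread.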

latexr today at 7:32 AM

> useless CAPTCHAs (any sufficiently advanced bot can get around any CAPTCHA anyway). We solved this at our startup by (…). Of course this is easy to get around if the bot knows what you’re doing

So, by your own admission, your solution doesn’t get around the “sufficiently advanced bot” problem.

imiric today at 6:09 AM

So your solution is to deploy a black box that can be worked around with a basic lookup table for a single field?

CAPTCHAs were never meant to work 100% of the time in all situations, or to be the only security solution. They're meant to block lazy spammers and low-level attacks, but anyone with enough interest and resources can work around any CAPTCHA. This is certainly becoming cheaper and more accessible with the proliferation of "AI", but it doesn't mean that CAPTCHAs are inherently useless. They're part of a perpetual cat-and-mouse game.

Like LLMs, they rely on probabilities that certain signals may indicate suspicious behavior. Sophisticated ones like Turnstile analyze a lot of data, likely using LLMs to detect pseudorandom keyboard input as well, so they would be far more effective than your bespoke solution. They're not perfect, and can have false positives, but this is unfortunately the price everyone has to pay for services to be available to legitimate users on the modern internet.

I do share a concern that these services are given a lot of sensitive data which could potentially be abused for tracking users, advertising, etc., but there are OSS alternatives you can self-host that mitigate this.

mads_quist today at 5:35 AM

Nice.