Hacker News

imiric 10/04/2024

I get that argument, as someone who uses those privacy-preserving methods. I've dealt with annoying CAPTCHAs for many years. The problem is that a CAPTCHA by definition is unable to do its job unless it can gather as much information as possible about the user. There are obvious privacy concerns here, but companies that operate under regulations like the GDPR are generally more conscious about this.

So what should be the correct behavior if the CAPTCHA can't gather enough information? Should it default to assuming the user is a bot or a human?

I think this decision should be up to each site, depending on how strict they want the behavior to be. So it's a configuration setting, rather than a CAPTCHA problem.
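To make the "configuration setting" idea concrete, here is a minimal sketch, assuming the CAPTCHA produces a bot-likelihood score or nothing at all when it lacks signal. The names `Verdict`, `classify`, and `default_when_unknown` are hypothetical and do not come from any real CAPTCHA product.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    HUMAN = "human"
    BOT = "bot"


def classify(score: Optional[float], threshold: float,
             default_when_unknown: Verdict) -> Verdict:
    """Decide human vs. bot from a hypothetical bot-likelihood score in [0, 1].

    A score of None models the case where too little information
    could be gathered; the site's configured default then applies.
    """
    if score is None:
        return default_when_unknown
    return Verdict.BOT if score >= threshold else Verdict.HUMAN


# A strict site fails closed; a lenient site fails open.
strict = classify(None, threshold=0.5, default_when_unknown=Verdict.BOT)
lenient = classify(None, threshold=0.5, default_when_unknown=Verdict.HUMAN)
```

The only point of the sketch is that the "not enough information" case is an explicit, per-site choice rather than something baked into the detector.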

In a broader sense, think about the implications of not using a CAPTCHA. The internet is overrun with bots; they comprise an estimated 36% of global traffic[1]. Cases like ProductHunt are not unique, and we see similar bot statistics everywhere else. These numbers will only increase as AI gets more accessible, making the current web practically unusable for humans.

If you see a better alternative to CAPTCHAs I'd be happy to know about it, but to me it's clear that the path forward is for websites to detect who is or isn't a bot, and restrict access accordingly. So working on improving these tools, in both detection accuracy and UX, should be our main priority for mitigating this problem.

[1]: https://investors.fastly.com/news/news-details/2024/New-Fast...


Replies

capitainenemo 10/04/2024

So, I have a few objections here. First off, CAPTCHAs are not "by definition" about fingerprinting users. They are "by definition" a Turing test for distinguishing humans from bots. It just turns out that is hard to do, so CAPTCHAs pivoted to fingerprinting instead.

Secondly, sites are often unaware or are not given the choice. Businesses are sold the idea that they are being protected against bots, when in fact they are turning away real users. Many I contacted were unaware this was happening. In fact, the servers in between are not even integrated in a way that supports a reasonable fallback. For example, on some sites (FedEx, Kickstarter) the "captcha" is returned by a JSON API that is completely unable to handle it or present it to the user.

Thirdly, the fingerprinting is broadly applied with NO exceptions. You would think a simple heuristic would be "the user has used this IP for the past 5 years to authenticate to this website, with the same browser UA, so we can probably let them through". But no, they kick it over to a third-party automated system, one that can completely break authentication, to fingerprint their users, on pages with personal information at that. They often don't offer any other options either, like additional auth challenges.
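The heuristic described above could be sketched roughly as follows. This is an illustration under stated assumptions, not how any real site or vendor works: `LoginRecord`, `next_step`, and the return values are hypothetical, and "used this IP for the past 5 years" is simplified to "the oldest successful login from this IP and UA is at least five years old".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class LoginRecord:
    """One past successful authentication (hypothetical record shape)."""
    ip: str
    user_agent: str
    when: datetime


def next_step(history: List[LoginRecord], ip: str, user_agent: str,
              now: datetime,
              min_age: timedelta = timedelta(days=5 * 365)) -> str:
    """Return "allow" for a long-standing IP + UA pair, otherwise fall back
    to an extra auth challenge instead of a third-party fingerprinting check."""
    matching = [r for r in history
                if r.ip == ip and r.user_agent == user_agent]
    if matching and now - min(r.when for r in matching) >= min_age:
        return "allow"
    return "extra_auth_challenge"
```

The fallback branch is where a site could offer an additional auth challenge rather than handing the user to an external system.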

So, yeah, people are being told "well, we have to fingerprint users, we have no choice" and the ironic thing is the battle is being lost anyway, and real damage is being done in the false positives, especially if the site is tech savvy.

But whatever. I'm aware I won't convince you, and I'm aware I'm in the minority: most people accept the status quo or are unaware of the abuses. But it's being implemented poorly, it isn't working, it's harming real people and the internet as a whole, and it is not an adequate fix.
