I live in the UK and can't view a large portion of the internet without having to submit my ID to _every_ site serving anything deemed "not safe for the children". I had a question about a new piercing and couldn't get info on it from Reddit because of that. I tried using a VPN, but those are blocked too. Luckily, I work at a company selling proxies so I've got free proxies whenever I want, but I shouldn't _need_ to use them.
I find it funny that companies like Reddit, who make their money entirely from content produced by users for free (which is also often sourced from other parts of the internet without permission), are so against their site being scraped that they have to objectively ruin the site for everyone using it. See the API changes and killing off of third party apps.
Obviously, it's mostly for advertising purposes, but they love to talk about the load scraping puts on their site, even suing AI companies and SerpApi for it. If it's truly that bad, just offer a free API for the scrapers to use - or even an API that works out just slightly cheaper than using proxies...
My ideal internet would look something like that: all content free and accessible to everyone.
Have you considered that it’s because a new industry popped up that decided it was okay to slurp up the entire internet, repackage it, and resell it? Surely that couldn’t be why sites are trying to keep non-humans out.
> that they have to objectively ruin the site for everyone using it. See the API changes and killing off of third party apps.
Third-party app users were a very small but vocal minority. The API changes didn't reduce Reddit's traffic at all. In fact, it's only gone up since then.
The datacenter IP address blocks aren't just for scrapers; they're an anti-bot measure across the board. I don't spend much time on Reddit, but even the few subreddits I visited were starting to get infiltrated by obvious bot accounts running weird karma-farming operations.
Even HN routinely gets AI posting bots. It's a common technique for building upvote rings: make the accounts post comments so they look real enough, have the bots randomly upvote things to hide the activity, and then, when someone buys upvotes, have a selection of the puppet accounts upvote the targeted story. Having a lot of IP addresses and generating fake activity is key to making this work, so there's a lot of incentive to do it.