Hacker News

btown last Monday at 5:36 PM

I don't have a statistic here, but I'm always surprised by how many websites I come across that do limited user-agent and origin/referrer checks but don't maintain any kind of active IP-based tracking. If you're trying to build a site-specific scraper and are getting blocked, mimicking headers is an easy and often sufficient step.
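
For illustration, here is a minimal sketch of what mimicking headers can look like using only Python's standard library. The URL, the header values, and the helper name `build_request` are all assumptions made up for this example, not taken from any particular site; which headers a given site actually checks will vary.

```python
import urllib.request

# Hypothetical header values resembling what a desktop browser might send.
# The target site and exact strings are assumptions for illustration only.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Referer": "https://example.com/",
    "Origin": "https://example.com",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Build a request object carrying browser-like headers (nothing is sent yet)."""
    return urllib.request.Request(url, headers=headers)


req = build_request("https://example.com/some/page")
# Passing `req` to urllib.request.urlopen() would actually send it;
# that call is left out so the sketch runs without network access.
```

This only addresses header-based checks. It would do nothing against a site that also tracks request volume per IP address.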


Replies

xyzzy_plugh last Monday at 10:30 PM

If you can't tell the difference between active tracking and inspecting request headers, then it's worth committing a bit of time to ponder it, particularly the costs associated with IP tracking at scale.
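
To make the distinction concrete, here is a rough sketch contrasting the two approaches. It is not how any specific site does it: `header_check`, `IpRateTracker`, the allowed referrer prefix, and the limits are hypothetical names and values chosen for illustration. The point is that the header check needs no memory between requests, while the IP tracker has to store and prune state for every client it has seen.

```python
import time
from collections import defaultdict, deque

# Hypothetical allowed referrer prefix, for illustration only.
ALLOWED_REFERER_PREFIX = "https://example.com/"


def header_check(headers):
    """Stateless: decides using only the current request's headers."""
    user_agent = headers.get("User-Agent", "")
    referer = headers.get("Referer", "")
    return user_agent.startswith("Mozilla/") and referer.startswith(ALLOWED_REFERER_PREFIX)


class IpRateTracker:
    """Stateful: keeps a sliding window of request timestamps per client IP."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # grows with the number of distinct IPs

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        window = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        if len(window) >= self.max_requests:
            return False
        window.append(now)
        return True
```

In this toy version the state is a single in-process dictionary. At scale that state would presumably have to be shared across servers and bounded in size, which is where the cost comes from.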