From ChatGPT:
This approach can stop very basic scripts, but the claim that “99.9999% of scrapers can’t execute JS or handle cookies” isn’t accurate anymore. Modern scraping tools commonly use headless browsers (Playwright, Puppeteer, Selenium), execute JavaScript, support cookies, and spoof realistic user agents. Any scraper beyond the most trivial will pass a JS-set cookie check without effort. That said, using a lightweight JS challenge can be reasonable as one signal among many, especially for low-value content and when minimizing user friction is a priority. It’s just not a reliable standalone defense. If it’s working for you, that likely means your site isn’t a high-value scraping target — not that the technique is fundamentally robust.
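For context, a minimal sketch of the kind of JS-set cookie check being discussed: the server serves real content only when a cookie is present, and otherwise returns a tiny page whose inline JavaScript sets that cookie and reloads. All names here (`js_ok`, `SECRET`, `handle`) are hypothetical, chosen just to illustrate the idea; a real deployment would tie the token to more signals and an expiry.

```python
import hmac
import hashlib

SECRET = b"server-side secret"  # hypothetical secret key

def expected_token(client_ip: str) -> str:
    # Token the inline JS is told to set; bound to the client IP so
    # a cookie can't trivially be replayed from another address.
    return hmac.new(SECRET, client_ip.encode(), hashlib.sha256).hexdigest()[:16]

def handle(request_cookies: dict, client_ip: str) -> str:
    token = expected_token(client_ip)
    if request_cookies.get("js_ok") == token:
        return "200 content"  # stand-in for the real page
    # First visit, or a client that never ran the JS: serve the challenge.
    return (
        "<script>document.cookie='js_ok=" + token + "; path=/';"
        "location.reload();</script>"
    )
```

As the comment above notes, any headless browser (Playwright, Puppeteer, Selenium) executes this script and passes on the second request; only clients that never run JavaScript or never store cookies are filtered out.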
There should be a new rule on HN: No posts that just go "I asked an LLM and it said..."
You're not adding anything to the conversation.
So an LLM says that a technique used to foil LLM scrapers is ineffective against LLM scrapers.
It's almost as if it might have an ulterior motive in saying so.
From someone who actually does this stuff:
The claim is very accurate. Maybe not for the biggest websites, but very accurate for a self-hosted blog. You are not important enough for anyone to waste the compute to spin up a whole headless browser just to scrape your page. Why am I even arguing with ChatGPT?