Hacker News

lich_king today at 2:21 PM

You break highlighting and copy-and-paste. If I want to share or comment on a piece of your website... I can't. I guess this can be a "feature" in some rare cases, but a major usability pain otherwise.

I'm not a fan of all the documentation and marketing content for this project evidently being AI-generated because I don't know which parts of it are the things you believe and designed for, and which are just LLM verbal diarrhea. For example, your GitHub threat model says this stops "AI training crawlers (GPTBot, ClaudeBot, CCBot, etc.)" - is this something you've actually confirmed, or just something that AI thinks is true? I don't know how their scrapers work; I'd assume they use headless browsers.


Replies

larsmosr today at 2:39 PM

Breaking copy-paste is intentional for protected content, but it's opt-in per component, not site-wide.

On the AI docs concern, fair point. To answer directly: I've confirmed the obfuscation defeats any scraper reading raw HTML via HTTP requests. Whether GPTBot or ClaudeBot use headless browsers internally, I honestly don't know. The README threat model lists headless browsers under "what it does NOT stop" for that reason.
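
To illustrate the distinction between a raw-HTML scraper and a headless browser, here is a minimal Python sketch. It is not the project's actual mechanism: the page, the placeholder text, and the inline script are all hypothetical, and the only assumption is that the readable text is produced at runtime by JavaScript. A scraper that parses the HTTP response without executing scripts only ever sees what is in the markup.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes from markup, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def raw_text(html):
    """Return the text a non-JavaScript scraper would extract from the markup."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: the markup holds a scrambled placeholder, and the
# readable text only exists after the inline script runs in a browser.
PAGE = """<html><body>
<p id="msg">xq7#k</p>
<script>document.getElementById("msg").textContent = "hello" + " " + "world";</script>
</body></html>"""

print(raw_text(PAGE))
```

A headless browser would execute the script and read the rendered DOM instead, which is why it falls under "what it does NOT stop".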

larsmosr today at 3:02 PM

Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3;

Official OpenAI documentation: https://platform.openai.com/docs/gptbot
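
For anyone who wants to see which of these crawlers show up in their logs, a minimal Python sketch of user-agent matching follows. The token list is just the three names mentioned in this thread, and the function name is made up for illustration. The User-Agent header is self-reported, so this only identifies crawlers that announce themselves and says nothing about whether they render JavaScript.

```python
# Crawler tokens taken from the names mentioned in this thread.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "CCBot")


def matched_crawler(user_agent):
    """Return the first known crawler token found in the header, else None."""
    lowered = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in lowered:
            return token
    return None


# The user-agent string quoted above, exactly as posted.
ua = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3;"
print(matched_crawler(ua))
```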