
jsheard 01/16/2025

For the "good" bots, which at least respect robots.txt, you can use this list to get ahead of them before they pummel your site.

https://github.com/ai-robots-txt/ai.robots.txt
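For anyone who wants to see what blocking by user agent looks like in practice, here is a minimal sketch. The three agent names and the `build_robots_txt` helper are illustrative assumptions, not the contents or tooling of the linked repository, which maintains a much longer list. The sketch builds one robots.txt group and then checks it with the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Illustrative sample only; the linked repository maintains the real list.
AI_AGENTS = ["GPTBot", "CCBot", "ClaudeBot"]


def build_robots_txt(agents, disallow="/"):
    """Return one robots.txt group that disallows `disallow` for every agent."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append(f"Disallow: {disallow}")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(AI_AGENTS)
print(robots_txt, end="")

# Sanity check: parse the generated rules the way a compliant crawler would.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "/any/page"))
print(parser.can_fetch("SomeBrowser", "/any/page"))
```

This only affects crawlers that choose to read and honor robots.txt; it does nothing against a bot that ignores the file.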

There's no easy solution for bad bots which ignore robots.txt and spoof their UA, though.


Replies

breakingcups 01/16/2025

Such as OpenAI, which apparently ignores robots.txt and changes its user agent to evade blocks [1].

[1] https://www.reddit.com/r/selfhosted/comments/1i154h7/openai_...

zcase 01/16/2025

For those looking, this is the best I've found: https://blog.cloudflare.com/declaring-your-aindependence-blo...

taikahessu 01/16/2025

Thanks, will look into that!