Hacker News

jsheard, last Thursday at 3:15 PM (3 replies)

For the "good" bots, which at least respect robots.txt, you can use this list to get ahead of them before they pummel your site.

https://github.com/ai-robots-txt/ai.robots.txt
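As a rough sketch of how a list like that gets used: you group the crawler names under one `Disallow: /` rule in robots.txt. The three agent names below are only an illustrative subset picked for this example, not the contents of the linked list, and `build_robots_txt` and `is_allowed` are hypothetical helper names. The check uses Python's standard `urllib.robotparser`, so it only tells you what a well-behaved bot would conclude.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of AI crawler user-agent tokens (an assumption for this
# sketch; a maintained list is much longer and changes over time).
AI_AGENTS = ["GPTBot", "CCBot", "ClaudeBot"]


def build_robots_txt(agents):
    """Return robots.txt text with one group that disallows the whole site."""
    lines = [f"User-agent: {name}" for name in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url="https://example.com/"):
    """Check what a robots.txt-respecting client with this UA would be allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots_txt = build_robots_txt(AI_AGENTS)
    print(robots_txt)
    for agent in ["GPTBot", "SomeOtherBot"]:
        print(agent, "allowed:", is_allowed(robots_txt, agent))
```

This only helps against crawlers that actually read and honor the file.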

There's no easy solution for bad bots which ignore robots.txt and spoof their UA, though.


Replies

breakingcups, last Thursday at 8:56 PM

Such as OpenAI, who will ignore robots.txt and change their user agent to evade blocks, apparently[1]

1: https://www.reddit.com/r/selfhosted/comments/1i154h7/openai_...

zcase, last Thursday at 10:50 PM

For those looking, this is the best I've found: https://blog.cloudflare.com/declaring-your-aindependence-blo...


taikahessu, last Thursday at 3:18 PM

Thanks, will look into that!