Hacker News

lxgr, yesterday at 4:27 PM

> How will Internet Archive operate?

Presumably less and less effectively, at least if they continue honoring robots.txt and don't implement mechanisms to bypass scraping protection.

https://www.theverge.com/news/757538/reddit-internet-archive...


Replies

walski, yesterday at 7:28 PM

IA has not honored robots.txt for the better part of a decade now.

https://blog.archive.org/2017/04/17/robots-txt-meant-for-sea...

overfeed, yesterday at 5:42 PM

Interestingly, the article says that Cloudflare is uncertain whether the Internet Archive respects robots.txt.