Hacker News

londons_explore today at 4:03 AM

Most of these unauthenticated requests are read-only.

All of public GitHub is only 21 TB. Can't they just host that on a dumb cache and let the bots crawl to their heart's content?


Replies

yorwba today at 4:49 AM

I guess you're getting the size from the Arctic Code Vault? https://github.blog/news-insights/company-news/github-archiv... That was 5 years ago and is presumably in git's compressed storage format. Caching the corresponding GitHub HTML would take significantly more.