Hacker News

Updated rate limits for unauthenticated requests

79 points by xena last Friday at 2:11 PM | 98 comments

https://github.com/orgs/community/discussions/159123

https://github.com/orgs/community/discussions/157887


Comments

TheNewsIsHere yesterday at 1:41 PM

I don’t think the publication date (May 8, as I type this) on the GitHub blog article is the same date this change became effective.

From a long-term, clean network I have been consistently seeing these “whoa there!” secondary rate limit errors for over a month when browsing more than 2-3 files in a repo.

My experience has been that once they’ve throttled your IP under this policy, you cannot even reach a login page to authenticate. The docs direct you to file a ticket (if you’re a paying customer, which I am) if you consistently get that error.

I was never able to file a ticket when this happened because their rate limiter also applies to one of the required backend services that the ticketing system calls from the browser. Clearly they don’t test that experience end to end.

gnabgib last Friday at 4:12 PM

60 req/hour for unauthenticated users

5000 req/hour for authenticated - personal

15000 req/hour for authenticated - enterprise org

According to https://docs.github.com/en/rest/using-the-rest-api/rate-limi...

I bump into this just browsing a repo's code (unauthenticated). It seems like one of the side effects of the AI rush.
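
For anyone who wants to see where they stand against those quotas, here is a minimal sketch that reads the `x-ratelimit-limit`, `x-ratelimit-remaining`, and `x-ratelimit-reset` headers GitHub's REST API returns. The function name `summarize_rate_limit` and the sample header values are made up for illustration; it only parses a headers dict and makes no network call.

```python
from datetime import datetime, timezone


def summarize_rate_limit(headers, now=None):
    """Summarize GitHub-style x-ratelimit-* headers (hypothetical helper)."""
    # Header names are case-insensitive, so normalize them first.
    lower = {k.lower(): v for k, v in headers.items()}
    limit = int(lower["x-ratelimit-limit"])
    remaining = int(lower["x-ratelimit-remaining"])
    reset_epoch = int(lower["x-ratelimit-reset"])  # UTC epoch seconds
    if now is None:
        now = datetime.now(timezone.utc).timestamp()
    return {
        "limit": limit,
        "remaining": remaining,
        "seconds_until_reset": max(0, int(reset_epoch - now)),
        "exhausted": remaining == 0,
    }


# Illustrative values only: an unauthenticated client that has used all
# 60 requests, with the window resetting one hour from "now".
example = summarize_rate_limit(
    {
        "X-RateLimit-Limit": "60",
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": "1700003600",
    },
    now=1700000000,
)
print(example)
```

Passing `now` explicitly keeps the example deterministic; in real use you would leave it out and let it default to the current time.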

PaulDavisThe1st yesterday at 4:53 PM

Several people in the comments seem to be blaming GitHub for taking this step for no apparent reason.

Those of us who self-host git repos know that this is not true. Over at ardour.org, we've banned more than 1M unique IPs because AI trawlers were sucking down our repository one commit at a time. It was killing our server before we put fail2ban to work.

I'm not arguing that the specific steps GitHub has taken are the right ones. They might be, they might not, but they do help to address the problem. Our choice for now has been based on noticing that the trawlers are always fetching commits, so we tweaked things such that the overall HTTP-facing git repo works, but you cannot access commit-based URLs. If you want that, you need to use our GitHub mirror :)

hardwaresofton today at 2:00 AM

Does it seem to anyone like eventually the entire internet will be login only?

At this point knowledge seems to be gathered and replicated to great effect and sites that either want to monetize their content OR prevent bot traffic wasting resources seem to have one easy option.

jorams yesterday at 3:58 PM

> These changes will apply to operations like cloning repositories over HTTPS, anonymously interacting with our REST APIs, and downloading files from raw.githubusercontent.com.

Or randomly when clicking through a repository file tree. The first time I hit a rate limit was when I was skimming through a repository on my phone, and on about the 5th file I clicked I was denied and locked out. Not for a few seconds either: it lasted long enough that I gave up on waiting and refreshing every ~10 seconds.

thih9 yesterday at 7:23 PM

What does “secondary” stand for here in the error message?

> You have exceeded a secondary rate limit.

Edit and self-answer:

> In addition to primary rate limits, GitHub enforces secondary rate limits

(…)

> These secondary rate limits are subject to change without notice. You may also encounter a secondary rate limit for undisclosed reasons.

https://docs.github.com/en/rest/using-the-rest-api/rate-limi...
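
Since secondary limits can change without notice, a client can only react to what the response says. Below is a rough sketch of the order of checks I understand GitHub's REST docs to recommend: honor `retry-after` if present, otherwise wait for `x-ratelimit-reset` when `x-ratelimit-remaining` is 0, otherwise wait at least a minute and back off exponentially. The function name `seconds_to_wait`, the 60-second base, and the doubling factor are my own assumptions, and whether the web UI's "whoa there!" page sends the same headers is not something this thread establishes.

```python
def seconds_to_wait(status, headers, attempt, now):
    """Return how long to pause before retrying (hypothetical helper).

    status  -- HTTP status code of the response
    headers -- response headers as a dict
    attempt -- number of retries already made (0 for the first)
    now     -- current time in UTC epoch seconds
    """
    # Only 403 and 429 are treated as rate-limit responses here.
    if status not in (403, 429):
        return 0.0
    lower = {k.lower(): v for k, v in headers.items()}
    # 1. An explicit retry-after value (in seconds) takes priority.
    if "retry-after" in lower:
        return float(lower["retry-after"])
    # 2. Quota exhausted: wait until the window resets.
    if lower.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in lower:
        return max(0.0, float(lower["x-ratelimit-reset"]) - now)
    # 3. No hints at all: assumed 60 s base, doubled on each retry.
    return 60.0 * (2 ** attempt)


print(seconds_to_wait(429, {"Retry-After": "30"}, attempt=0, now=1000))
print(seconds_to_wait(429, {}, attempt=2, now=1000))
```

The function only computes a delay; actually sleeping and re-issuing the request is left to the caller.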

londons_explore today at 4:03 AM

Most of these unauthenticated requests are read-only.

All of public github is only 21TB. Can't they just host that on a dumb cache and let the bots crawl to their heart's content?

croemer today at 12:52 AM

The blog post is tagged with "improvement" - ironic for more restrictive rate limits.

Also, neither the new nor the old rate limits are mentioned.

jrochkind1 yesterday at 9:01 PM

Wow, I'm realizing this applies to even browsing files in the web UI without being logged in, and the limits are quite low?

This rather significantly changes the place of github hosted code in the ecosystem.

I understand it is probably a response to the ill-behaved decentralized bot-nets doing mass scraping with cloaked user-agents (that everyone assumes is AI-related, but I think it's all just speculation and it's quite mysterious) -- which is affecting most of us.

The mystery bot net(s) are kind of destroying the open web, by the counter-measures being chosen.

jhgg yesterday at 9:49 PM

The truth is this won't actually stop AI crawlers and they'll just move to a large residential proxy pool to work around it. Not sure what the solution is honestly.

pogue yesterday at 6:34 AM

I assume they're trying to keep AI bots from strip mining the whole place.

Or maybe your IP/browser is questionable.

jrochkind1 yesterday at 8:57 PM

Did I miss where it says what the new rate limits are? Or are they secret?

spacephysics yesterday at 9:14 PM

Probably to throttle scraping from AI competitors, and have them pay for the privilege as many other services have been doing.

jarofgreen yesterday at 6:51 AM

Also https://github.com/orgs/community/discussions/157887 "Persistent HTTP 429 Rate Limiting on *.githubusercontent.com Triggered by Accept-Language: zh-CN Header" but the comments show examples with no language headers.

I encountered this too once, but thought it was a glitch. Worrying if they can't sort it.

Euphorbium yesterday at 7:09 AM

I remember getting this error a few months ago, so this does not seem like a temporary glitch. They don't want LLM makers to slurp all the data.

mmsc yesterday at 10:25 PM

Even with authenticated requests, viewing a pull request and adding `.diff` to the end of the URL is currently rate limited at 1 request per minute. Incredibly low, IMO.

knowitnone today at 12:13 AM

you mean you want to better track users

trallnag yesterday at 5:11 PM

It's good that tools like Homebrew, which rely heavily on GitHub, usually support environment variables like GITHUB_TOKEN.
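
For scripts of your own, the same idea can be sketched in a few lines: attach the token as a bearer credential when `GITHUB_TOKEN` is set and fall back to an anonymous request otherwise. The helper name `github_request` and the `User-Agent` string are invented for this example, and this is not a claim about how Homebrew itself does it. The sketch only builds the request object and never sends it.

```python
import os
import urllib.request


def github_request(url, env=None):
    """Build a request that is authenticated only if GITHUB_TOKEN is set."""
    env = os.environ if env is None else env
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "rate-limit-sketch",  # placeholder client name
    }
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests count against the per-user quota
        # rather than the much smaller anonymous one.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


# Passing env explicitly keeps the example independent of the real environment.
req = github_request("https://api.github.com/rate_limit", env={"GITHUB_TOKEN": "abc"})
anon = github_request("https://api.github.com/rate_limit", env={})
print(req.has_header("Authorization"), anon.has_header("Authorization"))
```

To actually send it you would pass the request to `urllib.request.urlopen`, ideally with a timeout.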

stevekemp yesterday at 5:14 PM

Once again people post in the "community", but nobody official replies; these discussion-pages are just users shouting into the void.

watermelon0 yesterday at 6:53 AM

Time for Mozilla (and other open-source projects) to move repositories to sourcehut/Codeberg or self-hosted GitLab/Forgejo?

xnx yesterday at 2:07 PM

It sucks that we've collectively surrendered the URLs to our content to centralized services that can change their terms at any time, leaving us without any control. Content can always be moved, but moving the entire audience associated with a URL is much harder.

InfiniteLoup yesterday at 3:59 PM

How would this affect Go dependencies?

jarofgreen yesterday at 7:10 AM

https://github.com/orgs/community/discussions/157887 This has been going on for weeks and is clearly not a simple mistake.

radicality yesterday at 7:03 AM

Just tried it in Chrome incognito on iOS and I do hit this 429 rate limit :S That sucks; it was already bad enough when GitHub started requiring login to do even a simple search.