Explain to me how you self-host a git repo which is accessed millions of times a day from CI jobs pulling packages.
Let's assume 3 million. That's about 35 per second.
From a compute POV you can serve that with one server or virtual machine.
Bandwidth-wise, given a 100 MB repo size, that would make it about 3.5 GB/s (roughly 28 Gbit/s) - still easy terrain for a single server with a fast enough NIC.
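The arithmetic above as a quick sanity check (3 million pulls/day and a 100 MB repo are the thread's assumed numbers):

```shell
# Back-of-envelope check of the numbers above.
awk 'BEGIN {
    rps = 3000000 / 86400                   # pulls per day -> per second
    printf "%.1f req/s\n", rps              # -> 34.7 req/s
    printf "%.2f GB/s\n", rps * 100 / 1000  # 100 MB per pull
}'
```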
These days, people solve similar problems by wrapping their data in an OCI container image and distributing it through one of the container registries that have no practically meaningful pull rate limit. Not really a joke, unfortunately.
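What that workaround looks like in practice, sketched with the oras CLI. The registry name is a placeholder, and the push/pull lines are left as comments since they need a real registry and credentials; the tarball step runs as-is:

```shell
# Sketch of the "ship the repo as an OCI artifact" approach mentioned above.
src=$(mktemp -d)                  # stand-in for the real repo checkout
echo "package contents" > "$src/pkg.txt"
tar czf repo.tar.gz -C "$src" .   # one immutable blob per release

# Push once, pull from every CI job (needs oras + registry credentials):
#   oras push registry.example.com/ci/repo:latest repo.tar.gz
#   oras pull registry.example.com/ci/repo:latest && tar xzf repo.tar.gz

tar tzf repo.tar.gz               # archive listing includes pkg.txt
```

The registry then eats the 3.5 GB/s, which is the whole point of the trick.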
FTFY:
Explain to me how you self-host, with no budget and without spending any money, a git repo which is accessed millions of times a day from CI jobs pulling packages.
Is running the git binary (git-http-backend) behind a read-only nginx not good enough? Probably not: hosting tarballs is far more efficient.
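For reference, the stock way to put git behind nginx is git-http-backend via fcgiwrap; a minimal sketch, where the server name, socket path, binary location, and /srv/git root are all assumptions about your setup:

```nginx
server {
    listen 80;
    server_name git.example.com;

    # Smart-HTTP, read-only (no receive-pack exported)
    location ~ /git(/.*) {
        include       fastcgi_params;
        fastcgi_pass  unix:/var/run/fcgiwrap.socket;
        fastcgi_param SCRIPT_FILENAME     /usr/lib/git-core/git-http-backend;
        fastcgi_param GIT_PROJECT_ROOT    /srv/git;
        fastcgi_param GIT_HTTP_EXPORT_ALL "";
        fastcgi_param PATH_INFO           $1;
    }
}
```

Every clone still costs a server-side pack negotiation, which is why plain tarballs win at this request rate.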
You git init --bare on a host with sufficient resources. But I would recommend thinking about your CI flow too.
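A minimal sketch of that, with a throwaway path standing in for the real host; on the CI side a shallow single-branch clone keeps each pull well under the full repo size (the hostname is a placeholder):

```shell
# Bare repo on the serving host (throwaway path for illustration)
repo="$(mktemp -d)/repo.git"
git init --bare "$repo"

# CI side: fetch only the tip of one branch instead of full history
#   git clone --depth 1 --single-branch \
#       https://git.example.com/git/repo.git

git -C "$repo" rev-parse --is-bare-repository   # prints "true"
```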
I'm not sure whether this question was asked in good faith, but it is actually a damn good one.
I've looked into self-hosting a git repo with horizontal scalability, and it is indeed very difficult. I don't have the time to detail it in a comment here, but for anyone who is curious it's very informative to look at how GitLab handled this with Gitaly. I've also seen some clever attempts to use object storage, though I haven't seen any of those solutions put heavily to the test.
I'd love to hear from others about ideas and approaches they've heard about or tried
https://gitlab.com/gitlab-org/gitaly