I’m shocked at the 25M line part! That is a completely unfathomable amount of code for one codebase. I really want to know more about that.
Only 25 million? :) Google had billions a decade ago...
https://research.google/pubs/why-google-stores-billions-of-l...
Right, where is the rest of the code?
My (much smaller than Stripe) company is well over 4.5M lines at this point, and the graph is very much exponential.
AI has been a huge problem here: the amount of code is just exploding. Quality of the produced code is another matter.
I am more shocked by the "overnight" aspect. I tried running clang-format on the Chromium source (68,281 .cc files, 21 million lines according to wc):
$ find chromium-149.0.7826.1/ -name "*.cc" -exec cat {} + | wc
21640925 55715244 833460441
And that took less than 6 minutes on a single E5-2696 v3 from 2014:
$ time find chromium-149.0.7826.1/ -name "*.cc" | parallel -j 16 clang-format {} >/dev/null
real 0m5.666s
user 1m13.964s
sys 0m13.373s
That’s orders of magnitude faster, especially if we assume they’re not running their workloads on potatoes like mine. Is Ruby’s syntax really that much more complicated than C++, or is this a tooling problem?
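For anyone who wants to repeat the measurement, here is a rough sketch of the same idea using null-delimited paths, so file names with spaces don't break the pipeline. It assumes GNU find and xargs; the function names `count_lines` and `format_tree`, and the batch size of 64, are my own placeholders and not part of the commands above.

```shell
# Sketch only: helper names and the batch size are illustrative assumptions.

# Count the total number of lines across all .cc files under a directory.
count_lines() {
  find "$1" -type f -name '*.cc' -exec cat {} + | wc -l | tr -d ' '
}

# Run a formatter over every .cc file under a directory, in parallel.
#   $1 = source directory, $2 = number of parallel jobs, $3 = formatter command
format_tree() {
  # -print0 / -0 pass paths NUL-delimited; -r skips the run if nothing matches;
  # -n 64 hands each formatter invocation up to 64 files.
  find "$1" -type f -name '*.cc' -print0 |
    xargs -0 -r -P "$2" -n 64 "$3" >/dev/null
}
```

Something like `time format_tree chromium-149.0.7826.1 16 clang-format` should then give a comparable wall-clock figure, though batching several files per process means it is not an exact replica of the parallel run above.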