Hacker News

varun_ch yesterday at 8:46 PM

I’m shocked at the 25M line part! That is a completely unfathomable amount of code for one codebase. I really want to know more about that.


Replies

phoyd yesterday at 11:24 PM

I am more shocked by the "overnight" aspect. I tried running clang-format on the Chromium source (68,281 .cc files, 21 million lines according to wc):

$ find chromium-149.0.7826.1/ -name "*.cc" -exec cat {} + | wc
21640925 55715244 833460441

And that took less than 6 minutes on a single E5-2696 v3 from 2014:

$ time find chromium-149.0.7826.1/ -name "*.cc" | parallel -j 16 clang-format {} >/dev/null

real 0m5.666s
user 1m13.964s
sys 0m13.373s

That’s orders of magnitude faster, especially if we assume they’re not running their workloads on potatoes like mine. Is Ruby’s syntax really that much more complicated than C++, or is this a tooling problem?

bruckie yesterday at 9:34 PM

Only 25 million? :) Google had billions a decade ago...

https://research.google/pubs/why-google-stores-billions-of-l...

jsnell yesterday at 8:55 PM

Right, where is the rest of the code?

mr_mitm yesterday at 9:01 PM

They're up to 42 million now, as per the article

deathanatos yesterday at 10:57 PM

My (much smaller than Stripe) company is well over 4.5M at this point, and the graph is very much exponential.

AI has been a huge problem here: the amount of code is just exploding. Quality of the produced code is another matter.
