logoalt Hacker News

o175today at 6:45 AM2 repliesview on HN

The 158,000x slowdown on SQLite is the number that matters here, not whether it can parse C correctly. Parsing is the solved problem — every CS undergrad writes a recursive descent parser. The interesting (and hard) parts of a compiler are register allocation, instruction selection, and optimization passes, and those are exactly where this falls apart.

That said, I think the framing of "CCC vs GCC" is wrong. GCC has had thousands of engineer-years poured into it. The actually impressive thing is that an LLM produced a compiler at all that handles enough of C to compile non-trivial programs. Even a terrible one. Five years ago that would've been unthinkable.

The goalpost everyone should be watching isn't "can it match GCC" — it's whether the next iteration closes that 158,000x gap to, say, 100x. If it does, that tells you something real about the trajectory.


Replies

terafloptoday at 7:15 AM

The part of the article about the 158,000x slowdown doesn't really make sense to me.

It says that a nested query does a large number of iterations through the SQLite bytecode evaluator. And it claims that each iteration is 4x slower, with an additional 2-3x penalty from "cache pressure". (There seems to be no explanation of where those numbers came from. Given that the blog post is largely AI-generated, I don't know whether I can trust them not to be hallucinated.)

But making each iteration 12x slower should only make the whole program 12x slower, not 158,000x slower.

Such a huge slowdown strongly suggests that CCC's generated code is doing something asymptotically slower than GCC's generated code, which in turn suggests a miscompilation.

I notice that the test script doesn't seem to perform any kind of correctness testing on the compiled code, other than not crashing. I would find this much more interesting if it tried to run SQLite's extensive test suite.

wrxdtoday at 7:19 AM

This thing has likely all of GCC, clang and any other open source C compiler in its training set.

It could have spotted out GCC source code verbatim and matched its performance.

show 3 replies