Both Simon Willison and Antirez said that using LLMs helped them, so it's kind of perverse to read them and conclude the opposite.
In particular, direct comparisons between metrics like that don't work. "Lines of code" isn't a good measure of code complexity, and the time it takes to review code varies quite a bit depending on the use case.
There's a lot of diversity in the kinds of code people write, and just because it worked for someone else doesn't mean it will work for the problems you solve. That someone else found it useful is anecdotal evidence; your mileage may vary.
Simon often says that LLMs help him "write productive code", but most of the code he shows is Python libraries doing menial tasks. That's fine for tooling and the like, which is sometimes useful.
It would absolutely NOT work for production code involving critical concurrency / embedded / real-time work.
The relevant question is whether it helped them 10x, or anywhere close to what AI is now being sold as (supposedly even replacing software developers' jobs altogether and one-shotting complete products from a single prompt), or whether it's just acting as a kind of glorified autocomplete. So far, based on what both Simon Willison and Antirez describe, we're clearly seeing the latter.