The code that LLMs produce is just average IMO. I wouldn’t call myself an authority on clean code, but I can tell when code is well structured. I prefer my hand-written code over Claude’s or GPT’s every time. I once did an experiment where I generated a spec from a project I’d already written, then had an LLM blindly reimplement it from the spec, and compared the two codebases. The LLM’s version looked like vomit.
Agreed. However, in some cases average code is good enough, especially when refactoring it takes just a little attention and a few more tokens.