The difference is that you can evaluate a small sample of the output of a human or a team of humans and expect the rest of their code to be in roughly the same ballpark of quality.
An LLM can’t be trusted to produce code and make higher-level project structure choices of the same quality every time, because it can’t be trusted at all: trust is for deterministic systems. And yet it begs us to trust it. Every prompt that yields good results sets us up to expect good results, so we get lazy, and then on the next prompt it spews out garbage.
As long as the odds are good enough (and/or you know the distribution), there is nothing wrong with relying on and profiting from stochastic systems, even though not every outcome is positive. What matters is the sum of the outcomes, not any individual one.
It does mean you need to be able to handle failure, but you should always have a good grip on how to correct course anyway if you intend to release things into the real world, which always messes everything up.
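To put rough numbers on the "sum of outcomes" point, here is a minimal sketch. Everything in it is an assumption made up for illustration: the 80% success rate, the payoff of 1 unit for a good result, the cost of 3 units to clean up a bad one, and the function names themselves. It only shows that a process with frequent failures can still come out ahead in aggregate, and where the break-even success rate sits for a given gain and loss.

```python
import random


def expected_value(p_good: float, gain: float, loss: float) -> float:
    """Expected net payoff per attempt: p * gain - (1 - p) * loss."""
    return p_good * gain - (1.0 - p_good) * loss


def break_even(gain: float, loss: float) -> float:
    """Success probability at which the expected payoff is exactly zero."""
    return loss / (gain + loss)


def simulate(p_good: float, gain: float, loss: float,
             attempts: int, seed: int = 0) -> float:
    """Sum the payoffs of many independent attempts (seeded for repeatability)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(attempts):
        total += gain if rng.random() < p_good else -loss
    return total


# Hypothetical numbers, chosen only for illustration.
P_GOOD = 0.8    # assumed share of prompts that give a usable result
GAIN = 1.0      # assumed value of one good result
LOSS = 3.0      # assumed cost of detecting and fixing one bad result
ATTEMPTS = 10_000

ev = expected_value(P_GOOD, GAIN, LOSS)
threshold = break_even(GAIN, LOSS)
total = simulate(P_GOOD, GAIN, LOSS, ATTEMPTS)

print(f"expected payoff per attempt: {ev:.3f}")
print(f"break-even success rate:     {threshold:.3f}")
print(f"simulated average payoff:    {total / ATTEMPTS:.3f}")
```

With these made-up numbers, each failure costs three times what a success earns, so the success rate has to stay above the break-even threshold for the sum to be positive. That is also where the "get lazy" concern bites: if failures slip through unnoticed, the effective loss per failure grows and the threshold moves up.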