You don't think these errors compound? Generated code involves hundreds of little decisions. Yes, it "usually" works.
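To put rough numbers on "compound": a minimal sketch, assuming every decision is independent and equally likely to be right. Both are simplifying assumptions, not measurements, and the function name and the accuracy values are made up for illustration.

```python
def all_correct_probability(per_decision_accuracy: float, decisions: int) -> float:
    """Chance that every one of `decisions` independent choices is right.

    Assumes independence and a uniform per-decision accuracy, which real
    code generation does not guarantee.
    """
    return per_decision_accuracy ** decisions


# Hypothetical accuracies and decision counts, chosen only to show the trend.
for accuracy in (0.99, 0.999, 0.9999):
    for decisions in (100, 300):
        p = all_correct_probability(accuracy, decisions)
        print(f"accuracy={accuracy}, decisions={decisions}: all correct {p:.1%}")
```

Under those assumptions, 99% per-decision accuracy over 100 decisions leaves only about a 37% chance that nothing is wrong. The model ignores anything that catches mistakes after the fact, so treat it as an upper bound on how bad compounding gets, not a prediction.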
Not in my experience. With a proper TDD framework, it does better than most programmers at a company, who anecdotally introduce a bug every 2-3 tasks.
LLMs: sometimes wrong but never in doubt.