I regularly have it produce 10k+ lines of working code that passes extensive test suites. If you give it a prompt with no agent loop and no test harness, then sure, you'll waste your time babysitting it.
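
A rough sketch of what an agent loop with a test harness can look like, in Python. Everything here is hypothetical and only for illustration: `agent_loop`, `generate`, and `run_tests` are made-up names, and the two toy functions stand in for a real model call and a real test suite. The point is just the shape: generate, run the tests, feed the failure report back, repeat.

```python
from typing import Callable, Optional, Tuple


def agent_loop(
    generate: Callable[[str, Optional[str]], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    task: str,
    max_iters: int = 5,
) -> Tuple[Optional[str], int]:
    """Ask for code, run the tests, feed failures back, repeat.

    Returns (code, attempts) on success, or (None, max_iters) if the
    tests never passed within the iteration budget.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_iters + 1):
        code = generate(task, feedback)
        passed, report = run_tests(code)
        if passed:
            return code, attempt
        # The test report becomes the input to the next attempt.
        feedback = report
    return None, max_iters


# Toy stand-in for a model call (assumption: a real one would hit an LLM API).
def fake_generate(task: str, feedback: Optional[str]) -> str:
    # First attempt is buggy; the "model" fixes it once it sees feedback.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


# Toy stand-in for a test harness (assumption: a real one would run a suite).
def fake_run_tests(code: str) -> Tuple[bool, str]:
    namespace: dict = {}
    exec(code, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, "ok"
    return False, f"add(2, 3) returned {result}, expected 5"


code, attempts = agent_loop(fake_generate, fake_run_tests, "write add(a, b)")
```

Without the loop, the human ends up doing the `run_tests` and feedback steps by hand, which is the babysitting part.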