I'm using a local model. Code generation is only fast for the first bit of context; as the context grows, it slows down. It's basically its own self-limiting process. When it starts grinding, the lethargy crosses a threshold that triggers me to 'do it myself'; in particular, I've developed a sense for the point where it starts doing stupid things, and that's valuable.
There must be an epistemic problem with just how fast these SOTA models run. I don't think it's just that my local model is dumber; I think the speed of token generation trains my brain with different expectations. There's no way it'll just generate hundreds of files by itself. When it can, via an opencode loop with thought files, letting it run for a day is the only way you get that.
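The loop-with-thought-files idea can be sketched roughly like this. Everything here is an assumption for illustration: `run_model` is a hypothetical stand-in for whatever opencode actually invokes, and the "thought file" is just a scratchpad the loop persists between iterations so a long run survives restarts.

```python
import json
from pathlib import Path

THOUGHTS = Path("thoughts.json")  # hypothetical scratchpad file

def run_model(prompt: str) -> str:
    # Hypothetical stand-in for a call to the local model;
    # here it just echoes so the sketch is runnable.
    return f"step toward: {prompt.splitlines()[0]}"

def loop(goal: str, max_steps: int = 3) -> list[str]:
    # Load prior thoughts so each iteration sees what came before.
    thoughts = json.loads(THOUGHTS.read_text()) if THOUGHTS.exists() else []
    for _ in range(max_steps):
        # Feed only recent thoughts: keeps the context (and hence the
        # per-token latency on a local model) bounded.
        context = "\n".join(thoughts[-10:])
        thoughts.append(run_model(f"{goal}\n{context}"))
        # Persist after every step so a day-long run can be interrupted.
        THOUGHTS.write_text(json.dumps(thoughts))
    return thoughts
```

Truncating to the recent thoughts is the same trade-off as above: the scratchpad grows without bound, but the context the model sees (and pays for in speed) doesn't.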