Even with relatively simple things, frontier models get me about 90% of the way - and this is without evaluating how good that 90% actually is. It's the last 10% that the model fucking sucks at. And it's often the simplest things. It takes a lot of tokens and a lot of time to cajole the AI to get that last 10% working. And even then, I've just given up and had to go read the slop and fix the bug myself because it became so frustrating.