> I'm a SWE who's been using coding agents daily for the last 6 months and I'm still skeptical.
What improvements have you noticed over that time?
It seems like the models released in the last several weeks are dramatically better than those from the middle of last year. Does that match your experience?
Yes, it matches my experience. Now I can throw a task at the agent and have it write a full PR, with tests and a good summary. Or it can review things and make good suggestions that a casual reviewer or non-expert would have missed. It can also take a bunch of logs as input, find the issue, and fix the code. I can't deny it's impressive and useful.
What I'm still skeptical about is how much more productive it makes us. In my case, coding is maybe 50% of my job, and I work on complex and novel systems. The agent gives me the illusion that I don't need to think anymore, but that's not the case. Agents slow me down in many cases too, and I'm not learning and improving the way I used to.
Not the grandparent, but I've used most of the OpenAI models released in the last year. Of all of them, o3 was the best at the programming tasks I do; I liked it a lot more than I like GPT 5.2 Thinking/Pro. Overall, I'm not at all convinced that models are making forward progress in general.