Even if the trend doesn’t continue, the current models are very, very good. They’re already better than the average programmer in the industry.
Maybe at some coding benchmark. Certainly not at actually shipping and maintaining production-grade software.
I don't know how anyone who carefully and closely reviews their output could possibly think that. Much of the time their code is fine, but every now and again they make a catastrophic (though often well-hidden) mistake: all the tests still pass, yet the codebase will be bricked if enough of these go in. They make such disastrous mistakes frequently enough that a decent-sized codebase can't last for more than 18-24 months.
If the average programmer is this bad, then there must be better-than-average programmers reviewing the code. The problem with agents is that they can produce code at a far higher volume than the average programmer.
Anyway, I don't know how well the average programmer programs, but if you commit agent-generated code without careful review, your codebase will be cooked in a year or two.