I feel like anyone who used AI coding tools before 11/25 and after 1/26 (with frontier models) will say there has been a massive jump. There is a difference between whether an LLM can do a specific task or pass some arguably arbitrary checks by maintainers vs. what they are capable of.
We still have tons of gaps in how to build and maintain code with AI, but the LLMs themselves are getting better at an unbelievable pace; even with this kind of data analysis, I'm surprised anyone can question it.
> I feel like anyone who used AI coding tools before 11/25 and after 1/26 (with frontier models) will say there has been a massive jump. There is a difference between whether an LLM can do a specific task or pass some arguably arbitrary checks by maintainers vs. what they are capable of.
How much of that is the model and how much is the tooling built around it? Also, why is the tooling, specifically Claude Code, so buggy?