> I feel like anyone who used AI coding tools before 11/25 and after 1/26 (with frontier models) will say there has been a massive jump. There is a difference between whether an LLM can do a specific task or pass some arguably arbitrary checks by maintainers vs. what they are capable of.
How much of that is the model and how much of that is the tooling built around it? Also why is the tooling, specifically Claude Code, so buggy?