> The math and coding part is impressive but the agentic one is not.
I think this is very important if local models are eventually to become a viable replacement for coding models, because most of the time coding harnesses rely on tool calls to gather context and then write a solution.
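For anyone unfamiliar with what that loop looks like, here's a minimal sketch of the gather-context-then-solve pattern a coding harness runs. All the names (`run_agent`, `read_file`, the reply dict shape) are illustrative, not any specific agent's API:

```python
def read_file(path):
    """Tool: return file contents the model asked for (stubbed here)."""
    return f"<contents of {path}>"

TOOLS = {"read_file": read_file}

def run_agent(model, task, max_steps=5):
    """Send the task to the model; execute any tool calls it makes,
    append the results to the context, and loop until it emits a solution."""
    context = [task]
    for _ in range(max_steps):
        reply = model(context)            # one model turn
        if reply["type"] == "tool_call":  # model wants more context
            result = TOOLS[reply["name"]](reply["arg"])
            context.append(result)
        else:                             # model wrote the solution
            return reply["text"]
    return None

# Toy stand-in for an LLM: asks for one file, then answers.
def toy_model(context):
    if len(context) == 1:
        return {"type": "tool_call", "name": "read_file", "arg": "main.py"}
    return {"type": "solution", "text": "patched main.py"}

print(run_agent(toy_model, "fix the bug in main.py"))
```

The point is that the hard part for a local model isn't just writing code, it's reliably producing well-formed tool calls across many turns of that loop.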
I'm hopeful that one day we can replace Claude and OpenAI models with local SOTA LLMs.
That's absolutely possible. As things keep advancing, we'll soon see small models judged not by parameter count but by their reasoning and intelligence. You can already see examples like Qwen 3.6 27B.
It's pretty close already. Check out Qwen 3.6 27B if you haven't; people are doing vibe and agentic coding with it on a single GPU.
It's more finicky than Claude, but if you hand-hold it a bit it's crazy good.