Been using Qwen 3.6 35B and Gemma 4 26B on my M4 MBP, and while they’re no Opus, they do 95% of what I need, which is already crazy since everything runs fully local.
You've got me curious. Two questions if I may:
- What kind of tasks/work?
- How are Qwen and Gemma wired up (e.g. which harness, and how are they accessed)?
Or, to phrase it another way: what does your workflow/software stack look like?
It’s good enough that I’ve been having Codex automate itself out of a job by delegating more and more to it.
Very excited for the 122B version, as the throughput is significantly better for that vs the dense 27B on my M4.
Can you expand on what you mean by 95%?
There are 2 aspects I am interested in:
1. accuracy - is it 95% of Opus's output quality (Opus 4.5 or 4.6)?
2. capability - is it 95% as reliable as Opus at calling your tools and performing agentic work (e.g. trip planning)?
Do you use it with Ollama? Or something else?