have you found a model that does this with usable speeds on an M2/M3?
On an M4 MBP, ollama's qwen3.5:35b-a3b-coding-nvfp4 runs incredibly fast in the claude/codex harness. M2/M3 should be similar.
It's incomparably faster than any other model (i.e. it's actually usable without cope). Caching makes a huge difference.