Recently I had switched to OpenCode to try out many of the Non-US-Frontier-Labs models. My unexpected favorite model to use was Mercury (a diffusion model). Not because it was “smart” but because it was stupid fast. It was more of a pair-programming experience instead of the SOTA agentic experience of prompting and waiting. Honestly, it was also way more fun and brought back some of the pre-AI coding experience while still getting some benefits of AI. It felt less of a slot machine where you prompt, wait, and hope it went in the right direction. It made me even use the tiny models like Gemini Flash Lite and GPT Mini/Nano more too.
Anyways, so excited for an open-weight model and I hope it performs well. I’ll be testing this ASAP.
If you can run your tests fast and cheaply, and have metrics that show what bad/sloppy code is that are cheap & fast to generate, a worse fast model can outperform a far better far slower model if you value time...
I've had pretty good success with LLMs after putting in place metrics to measure true complexity (not cyclomatic), and automatically pushing back everything until the added complexity is within reason for the feature.
Could you say more about how you use it? What does your workflow look like?
So you're making smaller edits?