In my experience, qwen3-coder is way better. I only have gpt-oss:20b installed to run a few more tests, but I gave each of them a program and asked for a summary of what it does: qwen3 finished in a few seconds, while I cancelled gpt-oss after 5 minutes... it had produced nothing.
So I just use qwen3. Fast and great output. If for some reason I don't get what I need, I might use search engines or Perplexity.
I have a 10GB RTX 3080 and a Ryzen 3600X with 32GB of RAM.
Qwen3-Coder is amazing. Best I've used so far.
I've been using gpt-oss-20b lightly, but what I've found is that with smaller (single-sentence) prompts it was easy to get it to loop infinitely. Since I'm running it with llama.cpp, I set a small repetition penalty and haven't encountered those issues since (I only use it a couple of times a day to analyze diffs, so I might have just gotten lucky).
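For anyone who wants to try the same workaround, here's a minimal sketch of what it looks like, assuming the llama-cpp-python bindings rather than the raw llama.cpp CLI (where the equivalent flag is --repeat-penalty); the model path and the exact penalty value are just illustrative, not the commenter's actual settings:

    # Apply a mild repetition penalty to discourage the infinite-loop behavior
    # seen on short prompts. Values slightly above 1.0 penalize recently
    # generated tokens; 1.0 disables the penalty entirely.
    from llama_cpp import Llama

    llm = Llama(
        model_path="gpt-oss-20b.gguf",  # hypothetical local GGUF file
        n_ctx=8192,                     # context size; adjust to your VRAM
    )

    out = llm(
        "Summarize what this diff changes.",
        max_tokens=512,
        repeat_penalty=1.1,
    )
    print(out["choices"][0]["text"])

Keeping the penalty small matters: large values can noticeably degrade code output, since code legitimately repeats tokens a lot.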
The 20B version doesn't fit in 10GB of VRAM: even in its ~4-bit quantization the weights alone are roughly 12GB, so part of the model ends up offloaded to system RAM. That might explain some of the issues?
Are you using this in an agentic way, or in a copy-and-paste, “code this” single-input/single-output way?
I’d like to know how far the frontier models are from the local ones for agentic coding.
What Qwen3-Coder model are you using? Quantized or not?
Asking because I'm looking for a good model that fits in 12GB VRAM.
Qwen3-Coder 480B is quite good and on par with Sonnet 4. It’s the first time I’ve really felt that the Chinese models are probably going to eclipse US-based models pretty soon, at least for coding.