Hacker News

2ndorderthought, yesterday at 7:59 PM (3 replies)

Have you tried the latest qwen3.6 models?

For most of my questions, an 8-9B model works great. The upshot is not having ChatGPT/Meta sell my data or target me based on my random thoughts later.


Replies

entrope, yesterday at 8:40 PM

I let Qwen3.6-27B chew on a bug all last night. It choked at some point and stopped responding (probably a context overflow before pi-coding-agent could compact it). Claude Sonnet 4.6 found and fixed the bug in under 10 minutes.

Qwen3.6 is pretty amazing for a 27B model, but it's not hard to run into its limits. With a Radeon R9700 and unsloth's 6-bit quantization, I get ~20 TPS and 110k context, so it can do a fair bit quickly.
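For anyone curious what a setup like that looks like, here's a rough sketch of a llama.cpp `llama-server` launch with a 6-bit quant, a ~110k context window, and all layers on the GPU. The model filename and quant label are illustrative, and a ROCm build is assumed for the Radeon card; adjust to whatever GGUF you actually have.

```shell
# Illustrative llama.cpp server launch (filenames/quant names are examples,
# not exact). Assumes a ROCm build of llama.cpp for the Radeon R9700.
llama-server \
  --model Qwen3.6-27B-Q6_K.gguf \
  --ctx-size 110000 \
  --n-gpu-layers 99 \
  --port 8080
# Then point your coding agent at http://localhost:8080 (OpenAI-compatible API).
```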

datadrivenangel, yesterday at 8:28 PM

Inference speed is still slow in a meaningfully different way. The local models are good, but not great, and much slower; for coding, that means a task that takes 2-3 minutes with Claude Code and Opus takes an hour locally, with a higher chance of being wrong.

ekjhgkejhgk, yesterday at 8:19 PM

We're in the same boat. I would rather have NO LLM than an LLM that collects my data (which you should assume all of them do, unless you've been asleep for the last 20 years).

Fortunately, I don't have to pick one or the other; instead I run Qwen 3.6 35B A3B. It's a bit slow on my 8 GB GPU (I'm in the process of getting a bigger one), but again, to me the choice isn't "what's the best I can get", it's "what's the best local I can get".
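An A3B MoE model is a reasonable fit for a small GPU because only ~3B parameters are active per token, so the layers left on the CPU don't hurt as much as with a dense model. A sketch of a partial-offload run with llama.cpp follows; the filename, layer count, and context size are illustrative guesses, not a tested config.

```shell
# Illustrative partial-offload run on an 8 GB GPU (filename and numbers
# are examples; tune --n-gpu-layers until VRAM is nearly full).
llama-cli \
  --model Qwen3.6-35B-A3B-Q4_K_M.gguf \
  --n-gpu-layers 20 \
  --ctx-size 16384 \
  -p "Explain this stack trace: ..."
```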