Hacker News

Qwen 3.7 Preview

204 points by theanonymousone yesterday at 4:24 PM | 79 comments

Comments

sleepyeldrazi yesterday at 5:32 PM

I don't think I can handle another small model release by qwen, I'm still trying to find the limits of 3.6 27B and they are already threatening us with a new one?

But jokes aside, I love the fast iteration, these are most probably again finetunes on the 3.5 architecture that appear better in internal testing, which is still very nice to see. Putting more and more pressure on the bigger labs to perform better is always a good thing.

kethinov yesterday at 5:50 PM

Can someone explain what the current state of model benchmarking is? If you try to look up what the best locally runnable model is, you get a bunch of random blog posts using idiosyncratic criteria to rank things seemingly based on one dude's opinion.

Ideally I would love to see a leaderboard with relatively objective ranking criteria that 1. lets you filter by open weight / locally runnable, 2. lets you filter by date of release (nothing older than x), and 3. is agnostic to hardware requirements. I just want to know what the best model is. Let me worry about how I will afford to run it.

I love the llmfit project for seeing what will run on your hardware, but it would be nice to know what I'm missing out on by not having better hardware, which is why objective hardware-agnostic ratings would be helpful.
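
To make those three criteria concrete, here is a minimal sketch of the filter-then-rank logic. Everything in it is an assumption for illustration: the `Entry` fields, the `rank` helper, and the rows (placeholder names and scores, not real benchmark results). It does not describe any existing leaderboard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entry:
    name: str          # placeholder model name
    score: float       # placeholder benchmark score, higher is better
    open_weight: bool  # True if the weights can be downloaded and run locally
    released: date


def rank(entries, open_only=True, not_before: Optional[date] = None):
    """Filter by openness and release date, then sort by score alone.

    Hardware requirements are deliberately ignored (criterion 3).
    """
    kept = [
        e for e in entries
        if (e.open_weight or not open_only)
        and (not_before is None or e.released >= not_before)
    ]
    return sorted(kept, key=lambda e: e.score, reverse=True)


# Placeholder rows only; these are NOT real results.
entries = [
    Entry("model-a", 71.0, True, date(2025, 1, 10)),
    Entry("model-b", 80.0, False, date(2025, 3, 1)),
    Entry("model-c", 65.0, True, date(2023, 6, 1)),
    Entry("model-d", 75.0, True, date(2025, 2, 2)),
]

top = rank(entries, open_only=True, not_before=date(2024, 1, 1))
print([e.name for e in top])  # ['model-d', 'model-a']
```

The hard part is of course not the filtering but agreeing on what goes into `score`.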

bachmeier yesterday at 8:52 PM

I'm not much interested in vibe coding (for those who aren't aware that LLMs have other uses). The specific model I've been using with Ollama is hf.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:UD-Q4_K_XL and it's amazing how fast it is on 64 GB of RAM and i5-13400 CPU. No GPU on this computer. Gemma 4 E4B will think for a couple of minutes vs 3-5 seconds for Qwen. It's hard to believe how much you can do with such limited hardware using their models.
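
For non-coding uses, a setup like this can be scripted against Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the model named above has already been pulled. `build_request` and `generate` are hypothetical helper names, and the prompt is a placeholder.

```python
import json
import urllib.request

# Model tag copied from the comment above.
MODEL = "hf.co/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:UD-Q4_K_XL"


def build_request(prompt, model=MODEL, host="http://localhost:11434"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server):
# print(generate("Summarize this paragraph in one sentence: ..."))
```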

rspoerri yesterday at 5:26 PM

I am very interested in seeing new Qwen models. Qwen3.6 27B is the first one that can do things, doesn't constantly lose "its mind", and can be run on a 3090 with a good context size. But it sometimes gets into a loop.

kelsey98765431 yesterday at 5:14 PM

https://xcancel.com/Alibaba_Qwen/status/2056403591464984753

> Qwen3.7 Preview lands on Arena !

> Here come Qwen3.7-Max-Preview & Qwen3.7-Plus-Preview. Alibaba now #6 lab in Text, #5 in Vision.

> Can't wait to release Qwen3.7 series models! Stay tuned! @arena

julianlam yesterday at 10:56 PM

Gemma 4 and Qwen 3.6 were when my local inference experiments graduated from toy challenges with much hand holding to actually full day back and forth with good ability to utilise tool calls to discover how things are glued together.

I'm not talking about greenfield dev, I'm talking about interfacing with an existing decade old codebase.

trilogic yesterday at 5:46 PM

Qwen 3.6 35B (finetuned) is so good that it has become my standard open-weights model for everyday use. It is not far at all from proprietary models: if you give it tools, skills, agents, etc., it can actually finish the job. (Thank you Qwen team, appreciated.) With open source we can now reliably design very complicated architectures from scratch and build the full package pretty fast. I wish to see European AI unleashed; wake up.

hydra-f yesterday at 5:30 PM

Vision has become totally underappreciated, whereas I believe it brings important advantages to a model.

Also, a big caveat in using Qwen models has always been their speech patterns. I do wonder how Google made the Gemma lineup so good at this.

Let's hope Alibaba continues to open source its models

Havoc yesterday at 6:57 PM

So glad they’re holding steady on open weights.

At least for now. Worried the Chinese team will change their mind once they have parity

giancarlostoro yesterday at 5:26 PM

There I was waiting on a smaller version of Qwen 3.6 to drop so I can run it on my Mac, and then bam, they drop this.

0xbadcafebee yesterday at 9:45 PM

I stopped caring about benchmarks at MiniMax M2.5. I no longer want more advanced models. I want cheaper models that don't slow down when everyone else is online.

satvikpendem yesterday at 7:08 PM

Will they release the large models as open weight too? So far it seems only models around 35B or 27B are being released, with nothing larger, unlike before.

raffael_de yesterday at 8:10 PM

I have a tangential question. Provided that it is correct that current proprietary models are offered at below cost-covering rates (I believe this is a consensus if I'm not mistaken¹); what factor (multiplication) would have to be applied approximately to current rates to reach break even?

¹: I think I read this a couple of times but I'm not sure if correct to begin with. Can this be substantiated based on annual financial reporting or other published business metrics by OpenAI, Anthropic et al.?

alfiedotwtf yesterday at 11:05 PM

The jump from 3.5 to 3.6 was noticeable and set the bar. If they can keep the momentum, I’d pretty much say Qwen and China won the AI wars

mempko yesterday at 6:06 PM

I love that open weight models are catching up so quickly. Also hilarious how far behind Grok is. I guess demand for Grok must be poor if Anthropic is able to rent resources from xAI.

Onavo yesterday at 5:26 PM

Where's Grok 4.3 on the leaderboard?

nubg yesterday at 6:50 PM

lmao at opus 4.7 being a downgrade

vessenes yesterday at 5:51 PM

Today I learned Meta's new model is preferred to everything but Claude. That is... a real surprise! Congrats to the Meta team.