As they start to release more proprietary models, I really wish they would partner with one of the major US hyperscalers so these models could be used through a US-domiciled provider.
Totally understand why it may not be reasonable or in their best interest (and that the US is _absolutely_ not doing the same reflexively). But it would be lovely to be able to try these out on production workloads in earnest.
Is this one of those cases where they'll drop the Hugging Face release a week later? Or do we know for sure that this is staying proprietary?
Qwen really hits the sweet spot: it's cheap, fast, and actually good.
Looking forward to more open weight releases from Qwen, especially 122B and 397B.
The pattern I trust most is adding a small verification artifact after every external action. Agents usually fail from silent state drift well before they fail from a lack of reasoning depth.
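A minimal sketch of that pattern, assuming the agent can read state back from whatever it just touched. Every name here (`Receipt`, `fingerprint`, `act_and_verify`) is hypothetical and not taken from any agent framework; the dict stands in for a real external system.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Receipt:
    """Hypothetical verification artifact recorded after one external action."""
    action: str
    expected: Any
    observed: Any
    digest: str  # SHA-256 of the observed state, cheap to diff across steps
    ok: bool


def fingerprint(state: Any) -> str:
    # Canonical JSON so equal states always hash to the same digest.
    payload = json.dumps(state, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def act_and_verify(
    name: str,
    action: Callable[[], None],
    read_back: Callable[[], Any],
    expected: Any,
    log: List[Receipt],
) -> Receipt:
    """Run the action, re-read the real state, and record a receipt."""
    action()
    observed = read_back()  # never trust the action's own return value
    receipt = Receipt(name, expected, observed, fingerprint(observed), observed == expected)
    log.append(receipt)  # the receipt is kept even when verification fails
    if not receipt.ok:
        raise RuntimeError(f"state drift after {name!r}: expected {expected!r}, observed {observed!r}")
    return receipt


# Usage: a dict stands in for the external system (assumption for illustration).
store: dict = {}
receipts: List[Receipt] = []
act_and_verify(
    "set_flag",
    lambda: store.update(flag=True),
    lambda: dict(store),
    {"flag": True},
    receipts,
)
```

The point of raising right away is that drift surfaces at the step that caused it, not several tool calls later when the agent is reasoning over a stale picture of the world.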
These are very good numbers. I still don’t get why they don’t compare against the latest competitor versions in these posts; it’s not as if we won’t notice.
The tokenomics and value for capability, context, and latency look like they could make for a super competitive offer. What would it take for you to switch?
It is super strange that in all of the last (3?) releases they keep comparing against older models such as Opus-4.6.
Any info on pricing and latency?
Does anyone have experience with Alibaba Cloud Model Studio, which serves these Qwen models?
I can't bring myself to use any model that trains or sends telemetry back to my country's primary competitor/adversary. I don't care how much money is saved.
Can anyone check its knowledge base for me? I’m honestly not able to run it, and the Qwen models I can run censor information critical of the Chinese government.
Tiananmen Square is the first place to start.
The non-hallucination rate in AA-omniscience is SOTA, better than Opus 4.7, Gemini 3.1 Pro and GPT5.5! Congrats to the team