Hacker News

gardnr | last Wednesday at 5:37 PM | 7 replies

This is a 30B parameter MoE with 3B active parameters and is the successor to their previous 7B omni model. [1]
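For anyone new to the MoE jargon: "30B total, 3B active" means a router picks a few experts per token, so only about a tenth of the weights run on any given forward pass. A toy sketch (numbers shrunk, nothing Qwen-specific):

    # Toy top-k mixture-of-experts layer. All experts live in memory
    # ("total" params) but each token only runs through k of them
    # ("active" params).
    import numpy as np

    n_experts, k, d = 8, 2, 16
    experts = [np.random.randn(d, d) for _ in range(n_experts)]
    router = np.random.randn(d, n_experts)

    def moe_forward(token):
        scores = token @ router                  # one score per expert
        top_k = np.argsort(scores)[-k:]          # keep the k best
        weights = np.exp(scores[top_k])
        weights /= weights.sum()                 # softmax over the winners
        # only k of the n_experts matrices are touched per token
        return sum(w * (token @ experts[i]) for w, i in zip(weights, top_k))

    out = moe_forward(np.random.randn(d))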

You can expect this model to have similar performance to the non-omni version. [2]

There aren't many open-weights omni models, so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application, i.e. as the speech-and-vision front end, while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.
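To make the front-end idea concrete, a hedged sketch; every helper name below is hypothetical glue, not a real API:

    # Hypothetical wiring: the omni model is only the speech/vision
    # front end; a conventional backend does the actual work.
    def understand(user_audio, screen_frame):
        """Omni model: speech + vision in, structured request out."""
        raise NotImplementedError  # hypothetical

    def heavy_lifting(request):
        """Backend: search, database, a bigger model, whatever fits."""
        raise NotImplementedError  # hypothetical

    def speak(text):
        """Omni model again: final answer out as synthesized speech."""
        raise NotImplementedError  # hypothetical

    def voice_turn(user_audio, screen_frame):
        request = understand(user_audio, screen_frame)
        return speak(heavy_lifting(request))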

1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B

2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct


Replies

red2awn | last Wednesday at 7:35 PM

This is a stack of models:

- 650M Audio Encoder

- 540M Vision Encoder

- 30B-A3B LLM

- 3B-A0.3B Audio LLM

- 80M Transformer / 200M ConvNet (audio tokens to waveform)
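Roughly, those pieces compose like this; a pure-Python placeholder of the dataflow as I read it (Qwen calls the two LLMs "thinker" and "talker"; the shapes and function names here are mine, not the real interfaces):

    # Placeholder dataflow through the five components listed above.
    import numpy as np

    def audio_encoder(waveform):            # ~650M params
        return np.zeros((len(waveform) // 320, 2048))

    def vision_encoder(image):              # ~540M params
        return np.zeros((256, 2048))

    def thinker_llm(embeddings):            # 30B total / 3B active MoE
        text_tokens = ["hello"]                         # the text reply
        hidden = np.zeros((len(text_tokens), 2048))     # fed to the talker
        return text_tokens, hidden

    def talker_llm(hidden):                 # 3B total / 0.3B active MoE
        return [17, 42, 99]                             # discrete audio tokens

    def code2wav(audio_tokens):             # 80M transformer + 200M ConvNet
        return np.zeros(24000)                          # tokens -> waveform

    def omni_turn(mic, frame):
        emb = np.concatenate([audio_encoder(mic), vision_encoder(frame)])
        text, hidden = thinker_llm(emb)
        return text, code2wav(talker_llm(hidden))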

This is a closed-weights update to their Qwen3-Omni model. They had a previous open-weights release, Qwen/Qwen3-Omni-30B-A3B-Instruct, and a closed version, Qwen3-Omni-Flash.

You basically can't use this model right now, since none of the open-source inference frameworks have it fully implemented. It works in transformers, but it's extremely slow.
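For reference, the transformers path for the earlier open release looks roughly like this. The omni class names and the return_audio flag are assumptions on my part (patterned on the Qwen2.5-Omni integration), so check the model card before trusting them:

    # Rough sketch of running Qwen/Qwen3-Omni-30B-A3B-Instruct via
    # transformers. Class names and the return_audio flag are assumed
    # from memory of the model card; expect this to be very slow.
    import torch
    from transformers import (
        Qwen3OmniMoeForConditionalGeneration,  # assumed class name
        Qwen3OmniMoeProcessor,                 # assumed class name
    )

    model_id = "Qwen/Qwen3-Omni-30B-A3B-Instruct"
    processor = Qwen3OmniMoeProcessor.from_pretrained(model_id)
    model = Qwen3OmniMoeForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    conversation = [{"role": "user", "content": [
        {"type": "text", "text": "Say hello in one short sentence."},
    ]}]
    inputs = processor.apply_chat_template(
        conversation, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to(model.device)

    # return_audio=False (assumed) skips the talker + vocoder stack
    text_ids = model.generate(**inputs, max_new_tokens=64, return_audio=False)
    print(processor.batch_decode(text_ids, skip_special_tokens=True)[0])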

olafura | last Wednesday at 6:05 PM

Looks like it's not open source: https://www.alibabacloud.com/help/en/model-studio/qwen-omni#...

gardnr | last Wednesday at 5:56 PM

I can't find the weights for this new version anywhere; I checked ModelScope and Hugging Face. It looks like they may have extended the context window to 200K+ tokens, but the weights themselves don't seem to be published.

tensegrist | last Wednesday at 6:22 PM

> There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.

Last I checked (months ago), Claude used to do this.

andy_ppp | last Thursday at 9:54 AM

Haha, you could hear how its mind thinks, maybe by putting a lot of reverb on the thinking tokens or some other effect…
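For what it's worth, even a toy feedback-delay "reverb" would sell the effect; a numpy sketch, with the thinking/answer split left as a hypothetical helper:

    # Toy feedback-delay reverb: the dry signal plus a few progressively
    # quieter, progressively later echoes, then renormalized.
    import numpy as np

    def cheap_reverb(x, sr=24000, delay_s=0.08, decay=0.5, taps=4):
        dry = np.asarray(x, dtype=np.float64)
        out = dry.copy()
        d = int(sr * delay_s)
        for i in range(1, taps + 1):
            if i * d >= len(dry):
                break
            echo = np.zeros_like(dry)
            echo[i * d:] = dry[: len(dry) - i * d] * (decay ** i)
            out += echo
        return out / max(1e-9, np.max(np.abs(out)))  # keep within [-1, 1]

    # thinking_wav, answer_wav = split_on_think_tags(output)  # hypothetical
    # playback = np.concatenate([cheap_reverb(thinking_wav), answer_wav])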

plipt | last Wednesday at 7:40 PM

I don't think the Flash model discussed in the article is 30B.

Their benchmark table shows it beating Qwen3-235B-A22B

Does "Flash" in the name of a Qwen model indicate a model-as-a-service and not open weights?

show 1 reply
andy_xor_andrew | last Wednesday at 6:37 PM

> This is a 30B parameter MoE with 3B active parameters

Where are you finding that info? Not saying you're wrong; just saying that I didn't see that specified anywhere in the linked page, or on their HF.
