We actually deployed working speech-to-speech inference built on top of vLLM as the backbone. The main work was supporting the "Talker" module, which is currently not supported in the qwen3-omni branch of vLLM.
Check it out here: https://models.hathora.dev/model/qwen3-omni
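For anyone curious what the two-stage flow looks like, here is a minimal sketch, not the actual deployment: only the vLLM call uses a real public API, the model id is illustrative, and the Talker / codec classes are hypothetical placeholders for the custom code described above (in the real model the Talker also conditions on the Thinker's internal states, which is presumably part of why it needs dedicated support in vLLM rather than a simple text hand-off).

```python
from vllm import LLM, SamplingParams

class Talker:
    """Placeholder for the custom autoregressive speech-token model (not part of vLLM)."""
    def generate_codec_tokens(self, text: str) -> list[int]:
        raise NotImplementedError

class CodecDecoder:
    """Placeholder for the codec-token-to-waveform decoder stage."""
    def decode(self, tokens: list[int]) -> bytes:
        raise NotImplementedError

# Stage 1: the "Thinker" (text LLM) served with vLLM for fast batched decoding.
# Model id is illustrative; Thinker-only support is what the qwen3-omni branch covers.
thinker = LLM(model="Qwen/Qwen3-Omni-30B-A3B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=256)
talker, codec = Talker(), CodecDecoder()

def speech_to_speech(prompt: str) -> bytes:
    # Text response from the Thinker via vLLM.
    text = thinker.generate([prompt], params)[0].outputs[0].text
    # Stage 2: the Talker emits speech codec tokens and the codec decoder
    # turns them into a waveform; this is the part vLLM does not cover yet.
    return codec.decode(talker.generate_codec_tokens(text))
```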
Is your work open source?