logoalt Hacker News

zozbot234today at 5:02 PM1 replyview on HN

Qwen 3.6 is a toy compared to DeepSeek V4 Flash or Pro. These models can now run on Apple Silicon hardware with as little as 32GB RAM for the Flash (with 2-bit quant, which is still quite capable) using SSD offloading, with just-about-reasonable performance for interactive use, and far better performance on longer contexts than Qwen (due to the more efficient KV cache/attention mechanisms in DeepSeek).

Very significant improvements may be viable for unattended inference via large-scale batches, which can reuse sparse experts and thereby mask some of the latency involved - this is quite unique to DeepSeek, again due to its efficient KV cache.


Replies

greenavocadotoday at 5:16 PM

Qwen 3.6 27B still curb stomps Deepseek V4 in coding

show 1 reply