Hacker News

Aurornis, yesterday at 3:36 PM

The benchmarks are from the unquantized model they released.

This will only run on server hardware, some workstation GPUs, or some 128GB unified memory systems.

It’s a situation where if you have to ask, you can’t run the exact model they released. You have to wait for quantizations to smaller sizes, which come in a lot of varieties and have quality tradeoffs.


Replies

bityard, yesterday at 4:42 PM

This would likely run fine in just 96 GB of VRAM, by my estimation. Well within the ability of an enthusiastic hobbyist with a few thousand dollars of disposable income.

Quantizations are already out: https://huggingface.co/unsloth/Qwen3.6-27B-GGUF
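
For anyone wanting to sanity-check the memory estimates in this thread, here is a rough back-of-the-envelope sketch. It assumes the "27B" in the model name means about 27 billion parameters and that the unquantized release stores 16 bits per weight; neither is confirmed in the comments above. The function name `weight_memory_gb` is made up for illustration. It counts weights only, so KV cache, activations, and runtime overhead come on top.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: "27B" in the model name means ~27 billion parameters.
PARAMS_B = 27

# Assumption: the unquantized release uses 16 bits per weight.
for label, bits in [("16-bit (assumed unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_B, bits):.1f} GB")
```

Under those assumptions the 16-bit weights come to roughly 54 GB, which leaves headroom inside 96 GB for context and overhead. Real quantization formats typically store extra metadata per block, so actual file sizes will differ somewhat from the nominal 8-bit and 4-bit figures.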