What is the minimum VRAM this can run on, given that it is an MoE model?

tmaly • today at 5:46 PM

Replies

FWIW, with its predecessor (Qwen3.5-35B-A3B-Q6_K.gguf) on a laptop with 6 GB VRAM and 32 GB RAM, using default llama.cpp settings, I get 20 t/s generation.
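For a rough sense of why that setup works, here is a back-of-the-envelope sketch of the weight sizes involved. It assumes the nominal figures in the file name (35B total parameters, 3B active per token) and llama.cpp's Q6_K block layout of 210 bytes per 256 weights, i.e. 6.5625 bits per weight. It ignores the KV cache, activations, and any tensors stored at a different precision, so treat the numbers as estimates rather than measurements.

```python
# Back-of-the-envelope estimate; inputs are nominal figures, not measurements.
# Q6_K in llama.cpp packs 256 weights into a 210-byte block: 210 * 8 / 256 bits.
BITS_PER_WEIGHT_Q6_K = 210 * 8 / 256  # 6.5625


def weights_gb(params_billions: float,
               bits_per_weight: float = BITS_PER_WEIGHT_Q6_K) -> float:
    """Approximate weight storage in GB (10^9 bytes) for a parameter count."""
    return params_billions * bits_per_weight / 8


total_gb = weights_gb(35)   # all experts ("35B" in the file name)
active_gb = weights_gb(3)   # weights used per token ("A3B" in the file name)

VRAM_GB = 6    # from the comment above
RAM_GB = 32    # from the comment above

print(f"total weights:     ~{total_gb:.1f} GB")
print(f"active per token:  ~{active_gb:.1f} GB")
print(f"fits in VRAM alone: {total_gb <= VRAM_GB}")
print(f"fits in VRAM + RAM: {total_gb <= VRAM_GB + RAM_GB}")
```

Under these assumptions the full weights come to roughly 29 GB, far more than 6 GB of VRAM but within the combined 38 GB, while each token only touches about 2.5 GB of weights. That would be consistent with most of the model sitting in system RAM and generation still being usable.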
