Hacker News

skiing_crawling, yesterday at 6:42 PM

I used to run Qwen3.5 27B (Q4_K_M) successfully on a single 3090 with these llama-server flags: `-ngl 99 -c 262144 -fa on --cache-type-k q4_0 --cache-type-v q4_0`
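
For a rough sense of why the q4_0 cache types matter at `-c 262144`, here is a back-of-the-envelope KV-cache estimate. It is only a sketch. The model dimensions are placeholders, not the real Qwen values, and it assumes every layer caches K and V, which may not hold for a given architecture. The per-element sizes come from ggml's block layouts (32 values plus one fp16 scale per block for q4_0 and q8_0).

```python
# Rough KV-cache size estimate for different llama.cpp cache types.
# NOTE: the model dimensions used in the demo are hypothetical placeholders,
# not the actual values for any Qwen model.

# Average bytes per cached element.
# q8_0: 32 int8 values + one fp16 scale = 34 bytes per block of 32.
# q4_0: 32 4-bit values + one fp16 scale = 18 bytes per block of 32.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,
    "q4_0": 18 / 32,
}


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim,
                   type_k="f16", type_v="f16"):
    """Estimate KV-cache bytes, assuming every layer caches both K and V."""
    elements_per_side = n_ctx * n_layers * n_kv_heads * head_dim
    per_element = BYTES_PER_ELEMENT[type_k] + BYTES_PER_ELEMENT[type_v]
    return elements_per_side * per_element


if __name__ == "__main__":
    # Placeholder dimensions, chosen only to illustrate the scaling.
    dims = dict(n_ctx=262144, n_layers=16, n_kv_heads=4, head_dim=128)
    for cache_type in ("f16", "q8_0", "q4_0"):
        size = kv_cache_bytes(**dims, type_k=cache_type, type_v=cache_type)
        print(f"{cache_type}: {size / 2**30:.2f} GiB")
```

Whatever the real dimensions are, the ratio is the useful part: under these assumptions a q4_0 cache takes roughly 28% of the memory of an f16 cache at the same context length.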