Hacker News

pdyc yesterday at 2:12 PM | 2 replies

can you elaborate? you can use a quantized version, would context still be an issue with it?


Replies

abhikul0 yesterday at 2:20 PM

A usable quant, Q5_K_M imo, takes up ~26GB[0], which leaves ~6-7GB for context and for running other programs, which is not much.

[0] https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF?show_fil...
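
To put a rough number on how far ~6GB goes, here is a back-of-the-envelope sketch of KV-cache sizing. It assumes plain full attention with grouped-query KV heads in every layer and an fp16 cache. The layer count, KV head count, and head dimension below are hypothetical placeholders, not values taken from the Qwen3.5-35B-A3B model card, and the helper names are made up for illustration. Models with hybrid or sliding-window attention, or a quantized KV cache, would need less per token.

```python
GIB = 1024 ** 3


def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Bytes of KV cache one token occupies across all layers.

    The leading factor of 2 is there because both keys and values are cached.
    bytes_per_elem=2 corresponds to an fp16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def max_context_tokens(free_bytes, per_token_bytes):
    """Whole tokens that fit if ALL of free_bytes went to the KV cache."""
    return free_bytes // per_token_bytes


# Hypothetical architecture values, for illustration only.
per_token = kv_cache_bytes_per_token(n_layers=48, n_kv_heads=4, head_dim=128)

# Optimistic budget: the full 6 GiB, with nothing reserved for other programs.
budget = 6 * GIB

print(f"{per_token / 1024:.0f} KiB per token")
print(f"{max_context_tokens(budget, per_token)} tokens at most")
```

This is an upper bound: compute buffers, the OS, and anything else running all come out of the same leftover memory, so the usable context would be noticeably smaller.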

nickthegreek yesterday at 2:15 PM

context is always an issue with local models and consumer hardware.
