DeepSeek Flash v4 actually runs on 128 GB systems (about 14 tok/sec). Antirez created a fabulous 2-bit quant and a highly tuned LLM server:
https://github.com/antirez/ds4
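
For a rough sense of why the 2-bit quant is what makes this fit, here is a back-of-the-envelope sketch. It is not taken from the ds4 repo: the function names, the 16 GiB reserve for KV cache and OS, and especially the parameter count are all placeholder assumptions, so plug in the real figures from the model card. It also treats bits per weight as an effective average, ignoring the extra scale metadata real quant formats carry.

```python
# Back-of-the-envelope memory and speed estimates for a quantized model.
# All concrete numbers below are illustrative assumptions, not ds4 facts.

GIB = 2**30


def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def fits_in_ram(n_params: float, bits_per_weight: float,
                ram_gib: float, reserve_gib: float = 16.0) -> bool:
    """True if the weights fit after reserving RAM for KV cache, OS, etc.

    reserve_gib is a guessed allowance, not a measured value.
    """
    return quantized_weight_gib(n_params, bits_per_weight) <= ram_gib - reserve_gib


def generation_seconds(n_tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock time to generate n_tokens at a steady decode rate."""
    return n_tokens / tokens_per_sec


if __name__ == "__main__":
    HYPOTHETICAL_PARAMS = 300e9  # placeholder, NOT the model's real size
    for bits in (16, 8, 4, 2):
        size = quantized_weight_gib(HYPOTHETICAL_PARAMS, bits)
        ok = fits_in_ram(HYPOTHETICAL_PARAMS, bits, ram_gib=128)
        print(f"{bits:>2} bits/weight: {size:7.1f} GiB, fits in 128 GiB: {ok}")
    # 14 tok/sec is the rate quoted in the post.
    print(f"1000 tokens at 14 tok/sec: {generation_seconds(1000, 14):.0f} s")
```

With the placeholder size, only the 2-bit row lands under the 128 GiB budget, which is the general shape of the argument even if the real numbers differ.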