Because of how huge GLM 4.7 is

disiplus • yesterday at 7:37 PM • 1 reply

Because of how huge GLM 4.7 is: https://huggingface.co/zai-org/GLM-4.7

Replies

Except this is GLM 4.7 Flash, which has 32B total params and 3B active. With 4-bit weight quantization it should fit in 20GB of RAM with a decent context window of 40k or so, and you can save even more by quantizing the activations and KV cache to 8-bit.
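For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The 32B parameter count and 40k context come from the comment above; the layer count, KV head count, and head dimension are hypothetical placeholders, not the model's published config, so swap in the real values before trusting the total. It also ignores quantization scale/zero-point overhead and activation memory.

```python
def weight_bytes(n_params: int, bits: int) -> int:
    """Memory for the weights alone at a given bit width."""
    return n_params * bits // 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bits: int) -> int:
    """Memory for keys and values (hence the factor of 2) across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bits // 8


GB = 10**9  # decimal gigabytes

# From the comment: 32B total params, 4-bit weights, ~40k context.
weights = weight_bytes(32 * 10**9, bits=4)

# ASSUMPTION: hypothetical attention shape, for illustration only.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 48, 4, 128
CONTEXT = 40_000

kv_16bit = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, bits=16)
kv_8bit = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, bits=8)

print(f"weights:        {weights / GB:.2f} GB")
print(f"KV cache 16bit: {kv_16bit / GB:.2f} GB")
print(f"KV cache 8bit:  {kv_8bit / GB:.2f} GB")
print(f"total (8bit KV): {(weights + kv_8bit) / GB:.2f} GB")
```

Under these assumed numbers the weights come to 16 GB and the 8-bit KV cache to roughly 2 GB, which is consistent with the 20GB figure. Note that all 32B parameters have to be resident even though only 3B are active per token; the active count helps speed, not footprint.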
