Why not? Run it with the latest vLLM and enable 4-bit quantization with bitsandbytes; it will quantize the original safetensors on the fly and fit in your VRAM.
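A minimal sketch of what that invocation looks like, assuming a recent vLLM build with bitsandbytes installed (exact flags vary by vLLM version, and older releases also required `--load-format bitsandbytes`):

```shell
# In-flight 4-bit quantization of the original safetensors via bitsandbytes.
# Requires: pip install vllm bitsandbytes
vllm serve zai-org/GLM-4.7 \
    --quantization bitsandbytes \
    --max-model-len 8192   # trim context length to further reduce VRAM use
```

Whether the quantized weights actually fit still depends on the model's total parameter count and your GPU; for very large MoE models even 4-bit may exceed a single consumer card.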
Because of how huge GLM-4.7 is: https://huggingface.co/zai-org/GLM-4.7