woadwarrior01 • today at 3:54 PM

> at Q4_K_M, stock-style quantization is retaining ~99–99.8% of BF16 accuracy

That's a tall claim. By that measure, even NVIDIA's QAD, which AFAIK is currently SOTA for 4-bit quantization (albeit NVFP4 instead of INT4), would be worse than Q4_K_M RTN quantization. :D

https://arxiv.org/abs/2601.20088
