Hacker News
rohansood15
today at 4:38 PM
The paper is about vector quantization, which affects the KV cache, not the model weights or model size.