rohansood15, today at 4:38 PM

The paper is about vector quantization, which affects the KV cache, not the model weights or the model size.
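
To make the distinction concrete, here is a rough back-of-the-envelope sketch of the two memory budgets. Everything in it is an assumption for illustration: the function names, the model shape, and the bit widths are hypothetical and not taken from the paper, and it ignores any codebook or scale overhead a real vector quantization scheme would add.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch_size, bits_per_value):
    """Approximate KV cache size; grows with sequence length and batch size."""
    # Keys and values are both cached for every layer, hence the factor of 2.
    n_values = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size
    return n_values * bits_per_value // 8


def weight_bytes(n_params, bits_per_param):
    """Approximate weight size; independent of sequence length."""
    return n_params * bits_per_param // 8


# Hypothetical model shape, chosen only for illustration.
cfg = dict(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=8192, batch_size=1)

cache_fp16 = kv_cache_bytes(**cfg, bits_per_value=16)
cache_q4 = kv_cache_bytes(**cfg, bits_per_value=4)  # assumed 4 bits per cached value
weights = weight_bytes(n_params=8_000_000_000, bits_per_param=16)

print(f"KV cache at 16 bits: {cache_fp16 / 2**30:.2f} GiB")
print(f"KV cache at 4 bits:  {cache_q4 / 2**30:.2f} GiB")
print(f"Weights at 16 bits:  {weights / 1e9:.1f} GB (unchanged by KV cache quantization)")
```

Under these assumed numbers, shrinking the cached values changes only the first figure; the weight footprint stays the same either way.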