x_may • today at 11:48 AM

It's KV cache compression, so it reduces how much memory the model needs as its context grows. It does not affect the size of the weights.
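To make the distinction concrete, here is a rough back-of-the-envelope sketch. The model shape (32 layers, 8 KV heads, head dimension 128, 8B parameters, 16-bit values) and the 4x compression ratio are made-up numbers for illustration, not figures for any particular model or method. It uses the standard estimate for a plain attention KV cache: one key and one value vector per layer, per KV head, per token.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch=1):
    """Uncompressed KV cache size: keys + values for every layer and token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem * batch


def weight_bytes(n_params, bytes_per_param=2):
    """Weight memory: depends only on parameter count, not on context length."""
    return n_params * bytes_per_param


# Hypothetical model shape and compression ratio (assumptions, see above).
cfg = dict(n_layers=32, n_kv_heads=8, head_dim=128)
n_params = 8_000_000_000
compression_ratio = 4

GIB = 2**30
for seq_len in (8_192, 131_072):
    full = kv_cache_bytes(seq_len=seq_len, **cfg)
    compressed = full / compression_ratio
    print(f"context {seq_len:>7}: "
          f"KV cache {full / GIB:.1f} GiB -> {compressed / GIB:.1f} GiB, "
          f"weights {weight_bytes(n_params) / GIB:.1f} GiB (unchanged)")
```

The cache grows linearly with context length, so that is where the compression pays off. The weights term never sees `seq_len` or the compression ratio at all.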