Hacker News

ssijak today at 11:32 AM

For my grug brain can somebody translate this to ELIgrug terms?

Does this mean I would be able to run a 500B model on my 48 GB MacBook without losing quality?


Replies

x_may today at 11:48 AM

It's KV cache compression, so it reduces how much memory the model needs as its context gets longer. It does not affect the size of the weights.
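
To put rough numbers on that distinction, here is a back-of-the-envelope sketch. The layer count, KV head count, head dimension, context length, and bit widths are made-up illustrative values, not the configuration of any real model or of the method being discussed; the only point is that weight memory and KV cache memory are separate budgets.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory needed just to hold the model weights."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bits_per_value: float,
                   batch: int = 1) -> float:
    """Memory for the KV cache: one key and one value vector per
    layer, per KV head, per token in the context."""
    values = 2 * n_layers * n_kv_heads * head_dim * context_len * batch
    return values * bits_per_value / 8


# Hypothetical 500B-parameter model with weights quantized to 4 bits.
weights_4bit = weight_bytes(500e9, bits_per_weight=4)

# Hypothetical attention shape (assumed values, for illustration only).
shape = dict(n_layers=80, n_kv_heads=8, head_dim=128, context_len=32768)
kv_fp16 = kv_cache_bytes(**shape, bits_per_value=16)  # uncompressed cache
kv_4bit = kv_cache_bytes(**shape, bits_per_value=4)   # compressed cache

print(f"weights @ 4-bit : {weights_4bit / GIB:.1f} GiB")
print(f"KV cache @ fp16 : {kv_fp16 / GIB:.1f} GiB")
print(f"KV cache @ 4-bit: {kv_4bit / GIB:.1f} GiB")
```

With these assumed numbers the cache shrinks from 10 GiB to 2.5 GiB, but the weights alone are still about 233 GiB, so compressing the cache buys a longer context on a model that already fits and does nothing to make a 500B model fit in 48 GB.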