Better keep the KV cache in full precision

ggerganov • today at 6:53 AM • 1 reply

Wow, the GOAT himself! Thank you so much for creating llama.cpp. Will re-deploy with a full-precision KV cache once requests stop coming in.
