Hacker News

KVarN: Native vLLM backend for KV-cache quantization by Huawei

51 points by theanonymousone, today at 3:18 PM | 7 comments

Comments

throwa356262, today at 3:54 PM

Better performance than TQ and better quality than FP16?

Am I reading this right??

v3ss0n, today at 3:53 PM

Why is this not a PR for vLLM?

shockembopper, today at 5:17 PM

[dead]