Aurornis • yesterday at 8:24 PM

I should clarify that I'm referring generically to the types of quantizations used in local LLM inference, including those from Unsloth.

Nobody actually quantizes every layer to Q4 in a Q4 quant.
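
To make that concrete, here is a rough sketch of how a mixed-precision layout moves the average bits per weight away from a flat 4. The tensor groups, parameter counts, and bit widths below are all hypothetical and chosen only for illustration; they are not taken from Unsloth or any real quant recipe, and the nominal bit widths ignore the per-block scale overhead that real formats carry.

```python
# Hypothetical tensor groups: (name, parameter count, nominal bits per weight).
# Illustrative numbers only, not from any real quantization recipe.
LAYOUT = [
    ("token_embedding", 500_000_000, 6.0),
    ("attention", 2_000_000_000, 4.0),
    ("feed_forward", 4_500_000_000, 4.0),
    ("output_head", 500_000_000, 8.0),
]


def average_bits_per_weight(layout):
    """Return the parameter-weighted mean bit width across all tensor groups."""
    total_params = sum(count for _, count, _ in layout)
    total_bits = sum(count * bits for _, count, bits in layout)
    return total_bits / total_params


if __name__ == "__main__":
    print(f"average bits per weight: {average_bits_per_weight(LAYOUT):.2f}")
```

With these made-up numbers, keeping just the embedding and output tensors at higher precision already puts the average above 4 bits, which is why a "Q4" label describes the dominant format rather than every layer.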
