That's excellent for a quantized 27 GB model (the Q6_K_L GGUF quantization type keeps the embedding and output layers at 8 bits per weight, since those layers are especially sensitive to quantization).
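
If you want to verify this yourself, a minimal sketch using the `gguf` Python package (the reader that ships with llama.cpp) can list the quantization type of each tensor in a downloaded GGUF file; the file path here is a placeholder, and the exact tensor names depend on the model architecture:

```python
# Sketch: inspect per-tensor quantization types in a GGUF file.
# Assumes the `gguf` package is installed (pip install gguf) and that
# "model-Q6_K_L.gguf" is a local path to the quantized model file.
from gguf import GGUFReader

reader = GGUFReader("model-Q6_K_L.gguf")

for tensor in reader.tensors:
    # tensor_type is a GGMLQuantizationType enum, e.g. Q6_K or Q8_0.
    # In a Q6_K_L quant, the token-embedding and output tensors should
    # show Q8_0 while most other weight tensors show Q6_K.
    print(f"{tensor.name:40s} {tensor.tensor_type.name}")
```

Running this on a Q6_K_L file should show most tensors as Q6_K with the embedding and output weights reported as Q8_0, which is where the "_L" variants spend their extra bits.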