I get the right answer on the 8B model too.
Could it be the quantized version that's failing?
My models are both 4-bit. But yeah, that could be it; smaller models tolerate quantization much worse. That's why people use LoRA to recover some of the accuracy even when they don't need domain adaptation.
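For reference, a minimal sketch of that recovery setup with transformers + peft + bitsandbytes. The model id, target modules, and LoRA hyperparameters are placeholders, not the exact config anyone here is using; the point is just: load the base in 4-bit, attach LoRA adapters, and fine-tune only the adapters to claw back some of the accuracy lost to quantization.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit NF4 (placeholder model id).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",  # placeholder; swap in whatever base you're running
    quantization_config=bnb_config,
    device_map="auto",
)

# Freeze the quantized weights and attach small trainable LoRA adapters.
# Fine-tuning only these adapters (even on generic data, not domain data)
# is the usual way to recover part of the quantization loss.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,                 # placeholder rank
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder set
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

From there it's a normal Trainer/SFT loop over whatever data you have; the 4-bit base stays frozen throughout.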