
ein0p · 01/21/2025

My models are both 4-bit. But yeah, that could be it: small models tolerate quantization much worse. That's why people use LoRA to recover some of the accuracy, even when they don't need domain adaptation.
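
For context, a common way to do this in practice is the QLoRA-style recipe with Hugging Face transformers, peft, and bitsandbytes: load the base model in 4-bit, then train small LoRA adapters on top so the trainable low-rank deltas compensate for quantization error. The comment doesn't specify tooling, so this is just a sketch of one plausible setup; the model ID is a placeholder.

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model_id = "meta-llama/Llama-3.2-1B"  # placeholder small model

    # Load the base weights in 4-bit NF4 -- the lossy step under discussion
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config
    )

    # Attach trainable LoRA adapters; fine-tuning only these (on general
    # data, not a domain corpus) can claw back some of the accuracy lost
    # to quantization while the 4-bit base weights stay frozen
    model = prepare_model_for_kbit_training(model)
    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # tiny fraction of total params

From here you'd run a short fine-tuning pass with any standard trainer; the point is that the adapters are trained at higher precision, so they can absorb errors the 4-bit weights can't express.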