A 5-10% accuracy gap is like the difference between a usable model and an unusable one.

cphoover • today at 5:02 PM

Replies

Definitely could be, but in the time I spent talking to the 4-bit models, they still seemed surprisingly capable compared to the 16-bit original. I do recommend benchmarking quantized models on the specific tasks you care about.

amelius • today at 7:36 PM

Yes, I was wondering why they mentioned those numbers without mentioning their practical significance.
