redman25, yesterday at 2:45 PM

For consumers, there's little reason to run unquantized models, especially large ones, which take less of a hit from quantization. I'm running a 200B model at Q3 with very little degradation. A 1000B model would see even less change.
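
To put rough numbers on why this matters for consumer hardware, here is a back-of-the-envelope sketch of weight memory at different bit widths. The function name and the flat bits-per-weight figures are my own assumptions for illustration: real quant formats store per-block scales, so the effective bits per weight for something labeled Q3 is usually somewhat above 3, and this ignores KV cache and activations entirely.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Hypothetical helper: assumes every parameter is stored at exactly
    `bits_per_weight` bits, with no overhead for scales or metadata.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Nominal bit widths only; actual formats carry extra overhead.
for label, bits in [("FP16", 16.0), ("Q8", 8.0), ("Q4", 4.0), ("Q3", 3.0)]:
    print(f"200B at {label}: ~{weight_memory_gb(200, bits):.0f} GB")
```

Under these assumptions a 200B model drops from roughly 400 GB at FP16 to roughly 75 GB at a nominal 3 bits per weight, which is the difference between needing a multi-GPU server and fitting in a large unified-memory machine.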