Hacker News

justacatbot, today at 3:05 PM

The quality degradation at 2-bit is a real issue. For actual work tasks, a well-tuned 30B at 4-bit usually outperforms a 70B+ at 2-bit in my experience. The expert reduction on top of that compounds things - you're essentially running a fairly different model. Still interesting to see the upper bound of what consumer hardware can attempt, even if the result isn't production-ready.
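For a rough sense of why these two configurations end up being compared at all, here's a back-of-the-envelope sketch of weight memory only. It assumes an idealized bits-per-weight figure with no quantization overhead (scales, zero points), no KV cache and no activations, so real files will be larger. `weight_gb` is a made-up helper for illustration, not something from a library.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring all overhead."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


# The two configurations mentioned above; parameter counts are nominal.
for name, params, bits in [("30B @ 4-bit", 30, 4), ("70B @ 2-bit", 70, 2)]:
    print(f"{name}: ~{weight_gb(params, bits):.1f} GB")
```

Under those assumptions the two land in roughly the same memory budget (about 15 GB vs 17.5 GB of weights), so the choice comes down to which one holds up better in quality at that footprint.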