Hacker News

Aurornis today at 2:48 PM

> I did not do a very long session

This is always the problem with the 2-bit and even 3-bit quants: They look promising in short sessions but then you try to do real work and realize they’re a waste of time.

Running a smaller dense model like 27B produces better results than 2-bit quants of larger models in my experience.


Replies

amelius today at 6:35 PM

> This is always the problem with the 2-bit and even 3-bit quants: They look promising in short sessions but then you try to do real work and realize they’re a waste of time.

It would be nice to see a scientific assessment of that statement.

singpolyma3 today at 4:22 PM

Lots of people seem to use 4-bit. Do you think that's worth it vs a smaller model in some cases?
