logoalt Hacker News

dist-epochtoday at 5:10 PM0 repliesview on HN

The un-quantized MoE outperforms it.

But between same (V)RAM requirement 4 bit 26B-A3B and 8 bit 12B it's unclear which one will win, especially given one is MoE and the other dense.

All the launch benchmarks are at 16 bit.