Havoc • today at 5:05 PM

Quite a niche release. The MoE outperforms it on score and will likely be faster thanks to fewer active weights. So this really only makes sense for specific RAM-constrained applications that can’t fit a quantized MoE.

Replies

dist-epoch • today at 5:10 PM

The un-quantized MoE outperforms it.

But at the same (V)RAM requirement, between the 4-bit 26B-A3B and the 8-bit 12B, it's unclear which one will win, especially given that one is MoE and the other dense.

All the launch benchmarks are at 16-bit.
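
For reference, a rough back-of-the-envelope sketch of the "same (V)RAM" comparison. It assumes the names can be read at face value (26B total parameters for the MoE, 12B for the dense model) and counts weights only, ignoring KV cache, activations, and quantization overhead such as scales. The helper `weight_memory_gb` is hypothetical, written just for this illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes).

    Assumption: weights only; KV cache, activations, and per-block
    quantization metadata are not counted.
    """
    # params_billion * 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


# Parameter counts are assumed from the model names in the thread.
configs = {
    "26B-A3B MoE, 4-bit": (26, 4),
    "12B dense, 8-bit": (12, 8),
    "26B-A3B MoE, 16-bit": (26, 16),
    "12B dense, 16-bit": (12, 16),
}

for name, (params_billion, bits) in configs.items():
    print(f"{name}: ~{weight_memory_gb(params_billion, bits):.0f} GB")
```

Under these assumptions the 4-bit MoE and the 8-bit dense model land within about 1 GB of each other, while at 16-bit the MoE needs roughly twice the memory. Note that an MoE still has to hold all of its experts in memory; the roughly 3B active parameters reduce compute per token, not the weight footprint.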
