and can be faster if you can get an MoE model of that
"Mixture-of-experts", AKA "running several small models and activating only a few at a time". Thanks for introducing me to that concept. Fascinating.
(commentary: things are really moving too fast for the layperson to keep up)
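For the curious, here is a minimal sketch of the routing idea in PyTorch. All names and sizes are illustrative, not any production model's: a small router scores a set of expert networks per token, and only the top-k experts are actually run.

    # Minimal top-k mixture-of-experts sketch (illustrative, assumes PyTorch).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyMoE(nn.Module):
        def __init__(self, dim=64, n_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            # Each "expert" is a small feed-forward network.
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                              nn.Linear(4 * dim, dim))
                for _ in range(n_experts)
            )
            # The router scores every expert for every token.
            self.router = nn.Linear(dim, n_experts)

        def forward(self, x):  # x: (tokens, dim)
            scores = self.router(x)                         # (tokens, n_experts)
            weights, idx = scores.topk(self.top_k, dim=-1)  # keep top-k per token
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, slot] == e                # tokens routed to expert e
                    if mask.any():
                        out[mask] += weights[mask, slot, None] * expert(x[mask])
            return out

    x = torch.randn(5, 64)
    y = TinyMoE()(x)  # only 2 of the 8 experts run for each token

The point of the structure: total parameter count grows with the number of experts, but per-token compute only grows with top_k.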
All modern models are MoE already, no?
On >90% of inference hardware, an MoE model runs faster: per token you only read the weights of the few activated experts, so the memory-bandwidth bottleneck that dominates decoding shrinks.
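A back-of-envelope sketch of why, assuming decoding is memory-bandwidth bound (weights read once per token) and using purely hypothetical numbers:

    # Rough tokens/sec upper bound from memory bandwidth alone (illustrative).
    bandwidth_gb_s = 1000    # hypothetical accelerator memory bandwidth
    bytes_per_param = 2      # fp16 weights

    models = [
        ("dense 70B (all params read per token)", 70e9),
        ("MoE 70B total, ~8B active per token",    8e9),
    ]
    for name, active_params in models:
        gb_per_token = active_params * bytes_per_param / 1e9
        print(f"{name}: ~{bandwidth_gb_s / gb_per_token:.0f} tokens/s bound")

Same total capacity, but the MoE variant touches far fewer bytes per token, which is where the speedup comes from.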
"Mixture-of-experts", AKA "running several small models and activating only a few at a time". Thanks for introducing me to that concept. Fascinating.
(commentary: things are really moving too fast for the layperson to keep up)