and can be faster if you can get an MoE model of that
"Mixture-of-experts", AKA "running several small models and activating only a few at a time". Thanks for introducing me to that concept. Fascinating.
(commentary: things are really moving too fast for the layperson to keep up)
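For the curious, here is a minimal sketch of the routing idea in PyTorch. All names and sizes are illustrative, not any production model's: a small router scores a set of expert networks per token, and only the top-k experts are actually run.

    # Minimal top-k mixture-of-experts sketch (illustrative, assumes PyTorch).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyMoE(nn.Module):
        def __init__(self, dim=64, n_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            # Each "expert" is a small feed-forward network.
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                              nn.Linear(4 * dim, dim))
                for _ in range(n_experts)
            )
            # The router scores every expert for every token.
            self.router = nn.Linear(dim, n_experts)

        def forward(self, x):  # x: (tokens, dim)
            scores = self.router(x)                         # (tokens, n_experts)
            weights, idx = scores.topk(self.top_k, dim=-1)  # keep top-k per token
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, slot] == e                # tokens routed to expert e
                    if mask.any():
                        out[mask] += weights[mask, slot, None] * expert(x[mask])
            return out

    x = torch.randn(5, 64)
    y = TinyMoE()(x)  # only 2 of the 8 experts run for each token

The point of the structure: total parameter count grows with the number of experts, but per-token compute only grows with top_k.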
All modern models are MoE already, no?
On >90% of inference hardware, an MoE model runs faster: per token you only read the weights of the few activated experts, so the memory-bandwidth bottleneck that dominates decoding shrinks.
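A back-of-envelope sketch of why, assuming decoding is memory-bandwidth bound (weights read once per token) and using purely hypothetical numbers:

    # Rough tokens/sec upper bound from memory bandwidth alone (illustrative).
    bandwidth_gb_s = 1000    # hypothetical accelerator memory bandwidth
    bytes_per_param = 2      # fp16 weights

    models = [
        ("dense 70B (all params read per token)", 70e9),
        ("MoE 70B total, ~8B active per token",    8e9),
    ]
    for name, active_params in models:
        gb_per_token = active_params * bytes_per_param / 1e9
        print(f"{name}: ~{bandwidth_gb_s / gb_per_token:.0f} tokens/s bound")

Same total capacity, but the MoE variant touches far fewer bytes per token, which is where the speedup comes from.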
"Mixture-of-experts", AKA "running several small models and activating only a few at a time". Thanks for introducing me to that concept. Fascinating.
(commentary: things are really moving too fast for the layperson to keep up)