Fiveplus • today at 5:05 PM

It is interesting to see the "DeepMind" branding completely vanish from the post. This feels like the final step in consolidating the Google Brain merger. The technical report mentions a new "MoE-lite" architecture. Does anyone have details on the parameter count? If it is under 20B active parameters, their distillation techniques are light-years ahead of everyone else's.
