> In the case of gpt-oss 120B, that would mean sqrt(5 × 120) ≈ 24B.
That's actually in line with what I had (unscientifically) expected. Claude Sonnet 4 seems to agree:
> The most accurate approach for your specific 120B MoE (5.1B active) would be to test it empirically against dense models in the 10-30B range.
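For reference, a minimal sketch of that geometric-mean rule of thumb; the exact parameter counts are just the figures quoted above, not official numbers:

```python
import math

def dense_equivalent(total_params_b: float, active_params_b: float) -> float:
    """Geometric-mean rule of thumb: estimate the dense-model size an MoE
    roughly corresponds to as sqrt(total * active), in billions of params."""
    return math.sqrt(total_params_b * active_params_b)

# gpt-oss 120B with ~5.1B active parameters (figures from the quotes above)
print(f"~{dense_equivalent(120, 5.1):.1f}B")  # ~24.7B, squarely in the 10-30B range
```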