Hacker News

redrove, yesterday at 7:30 PM (4 replies)

It’s about bang for buck. That high a score for 5B params is pretty good, nigh unbelievable a short while ago.

It is my belief that smaller models will get better and better, and even cloud SOTA models will shrink.

Yet another reason the current buildout will feel like the railroads.


Replies

necubi, yesterday at 7:59 PM

It's 5B active params in MoE, not 5B total params (total is 137B).
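To put the active-versus-total distinction in numbers, here is a rough sketch. The 5B and 137B figures are taken from the comment above; the bytes-per-parameter values are assumed precisions chosen for illustration, not anything published for this model. The premise is the usual one for mixture-of-experts models: per-token compute scales roughly with the active parameters, while every expert's weights typically still have to be stored somewhere.

```python
# Illustrative arithmetic only. The 5B / 137B figures come from the comment above;
# the bytes-per-parameter values are assumed precisions, not published specs.
ACTIVE_PARAMS = 5e9    # parameters used for each token
TOTAL_PARAMS = 137e9   # parameters that must be stored across all experts


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone (no KV cache, no activations)."""
    return params * bytes_per_param / 1e9


# Share of the model that does work on any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"active fraction per token: {active_fraction:.1%}")
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    gb = weight_memory_gb(TOTAL_PARAMS, bytes_per_param)
    print(f"{label}: ~{gb:.1f} GB of weights")
```

Under these assumptions the model computes like a small one but stores like a large one, which matters for the "run it on your own hardware" argument elsewhere in the thread.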

bgirard, yesterday at 8:54 PM

> It’s about bang for buck.

Hard to know when they don't give the price per token. Presumably it will be priced comparably to a low-to-mid-range model. But their 'Ideal Zone' is meaningless without factoring in the price per token. I don't care how many tokens are being used; that's an implementation detail to me. I care about price / performance / latency.

Flere-Imsaho, yesterday at 7:37 PM

Yeah the future is probably a number of highly specialised small models you can run on your own hardware rather than massive frontier models in the cloud.

That's what I'm betting on anyway.

dist-epoch, yesterday at 7:50 PM

The SOTA models will not shrink, because the problems will get bigger, from "write me a C compiler" to "clone Stripe's business and run it".