Hacker News

MASNeo today at 7:11 AM

I wish there were more of this kind of research to speed things up, rather than just building ever larger models.


Replies

wongarsu today at 9:23 AM

Notice how all the major AI companies (at least the ones that don't do open releases) stopped telling us how many parameters their models have. Parameter count was used as a measure of how capable the proprietary models were up until GPT-3; then it suddenly stopped.

And how inference prices have come down a lot, despite increasing pressure to make money. Opus 4.6 is $25/MTok, Opus 4.1 was $75/MTok, the same as Opus 4 and Opus 3. OpenAI's o1 was $60/MTok, o1 pro $600/MTok, gpt-5.2 is $14/MTok and 5.2-pro is $168/MTok.
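A quick back-of-the-envelope check of those drops, using only the per-MTok figures quoted above (taken at face value, not verified against official price sheets):

```python
# Ratio of old to new per-million-token prices, using the figures quoted above.
drops = {
    "Opus 4.1 -> Opus 4.6": 75 / 25,
    "o1 -> gpt-5.2": 60 / 14,
    "o1 pro -> gpt-5.2-pro": 600 / 168,
}
for pair, factor in drops.items():
    print(f"{pair}: ~{factor:.1f}x cheaper")
```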

Also note how GPT-4 was rumored to be in the 1.8T-parameter realm, and now Chinese models in the 1T realm can match or surpass it. And I doubt the Chinese labs have a monopoly on those efficiency improvements.

I doubt frontier models have actually grown substantially in size over the last 1.5 years, and they may well have far fewer parameters than the frontier models of old.

nl today at 7:20 AM

Why not both?

Scaling laws are real! But they don't preclude faster processing.
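For context, "scaling laws" here usually means something like the Chinchilla-style fit, where loss falls smoothly as parameters N and training tokens D grow. A minimal sketch (constants roughly follow the published Chinchilla fit and are illustrative only; nothing in the formula constrains inference speed):

```python
# Chinchilla-style scaling law: L(N, D) = E + A / N**alpha + B / D**beta
# N = parameter count, D = training tokens. Constants roughly follow the
# published Chinchilla fit (Hoffmann et al. 2022) and are illustrative only.
def loss(N: float, D: float) -> float:
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / N**alpha + B / D**beta

# Loss keeps improving as models and data grow (compute-optimal D ~ 20 * N)...
for N in (1e9, 1e10, 1e11, 1e12):
    print(f"N={N:.0e}: predicted loss ~ {loss(N, 20 * N):.3f}")
# ...but nothing here says the *deployed* model can't be made faster to run.
```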

elif today at 11:56 AM

It's the same thing. Quantize your parameters? The "bigger" model runs faster. Distill an MoE base model? The "bigger" model runs as a smaller one.
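A minimal sketch of what weight quantization buys you, in plain NumPy (symmetric per-tensor int8; real inference stacks quantize per-channel or per-block with fused kernels, so treat this as illustrative only):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: float32 weights -> int8 plus one scale."""
    scale = np.abs(w).max() / 127.0                      # map the largest |w| to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4096, 4096)).astype(np.float32)  # one toy weight matrix

q, scale = quantize_int8(w)
print("fp32 bytes:", w.nbytes, "int8 bytes:", q.nbytes)  # 4x fewer bytes to move
print("max abs error:", float(np.abs(w - dequantize(q, scale)).max()))
```

Same parameter count, a quarter of the bytes to stream per token, which is where much of the inference speedup comes from.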

There is no gain for anyone anywhere in reducing the overall parameter count, if that's what you mean. That sounds more like you don't like transformer models than like a real performance concern.