stymaar • today at 3:31 PM • 3 replies

> Lastly, there is a massive difference in capabilities, determinism, and error handling between 5T SOTA models like Opus

What's your source for Opus being a 5T model?

> and tiny distillations from DeepSeek that perform well only in benchmarks.

I don't think you know what you're talking about. Local models aren't “distillations from DeepSeek”.

And they don't perform well “only in benchmarks”: Qwen 3.6 is a very decent model (obviously it's not Opus, but it's also much faster, and speed is a quality of its own).

Replies

Chyzwar • today at 6:56 PM

From this paper: https://arxiv.org/abs/2604.24827

gpugreg • today at 3:41 PM

> What's your source for Opus being a 5T model?

Elon Musk tweeted that Grok is 0.5T, or 1/10th the size of Opus: https://xcancel.com/elonmusk/status/2042123561666855235#m

While this source's reliability is certainly debatable, the size matches the results of this paper, in which researchers estimated the parameter count from model knowledge. https://01.me/research/ikp/

layer8 • today at 3:47 PM

> What's your source for Opus being a 5T model?

Probably Elon Musk: https://eu.36kr.com/en/p/3760679047267075
