Hacker News

re-thc today at 8:24 AM

> That's a tautology. People think Chinese models are 10x more efficient because they're 10x cheaper

They do have different infrastructure / electricity costs and they might not run on Nvidia hardware.

It's not just the models.


Replies

jychang today at 8:34 AM

Except there are providers that serve both Chinese models AND Opus. On the same hardware.

Namely, Amazon Bedrock and Google Vertex.

That means normalized infrastructure costs, normalized electricity costs, and normalized hardware performance. Normalized inference software stack, even (most likely). It's about as close to a 1-to-1 comparison as you can get.

Both Amazon and Google serve Opus at roughly 1/2 the speed of the Chinese models. Note that they are not incentivized to slow down the serving of Opus or the Chinese models! So that tells you the ratio of active params for Opus and for the Chinese models.
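
To spell out the speed-to-params step above, here's a rough sketch. It assumes decoding is memory-bandwidth bound on identical hardware and software, so time per token scales with active parameter count. The function name and the tokens/sec figures are made up for illustration, not measurements from Bedrock or Vertex, and things like batching, quantization, or speculative decoding could skew the real ratio.

```python
def active_param_ratio(tokens_per_sec_a: float, tokens_per_sec_b: float) -> float:
    """Estimate active_params(A) / active_params(B) from decode speeds.

    Assumption (not from any provider's docs): both models run on the same
    hardware and stack, and per-token decode time is proportional to the
    number of active parameters. Under that assumption the parameter ratio
    is the inverse of the speed ratio.
    """
    if tokens_per_sec_a <= 0 or tokens_per_sec_b <= 0:
        raise ValueError("throughput values must be positive")
    return tokens_per_sec_b / tokens_per_sec_a


# Hypothetical throughputs: model A served at half the speed of model B.
slow_model_tps = 40.0
fast_model_tps = 80.0

ratio = active_param_ratio(slow_model_tps, fast_model_tps)
print(f"Estimated active-param ratio (slow / fast): {ratio:.1f}x")
```

So a model served at half the speed would come out at about twice the active params, if the assumptions hold.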

fennecfoxy today at 9:55 AM

I mean GN has covered the Nvidia black market in China enough that we pretty much know that they run on Nvidia hardware still.
