the hardware diversification story here is more interesting than the speed numbers. OpenAI going from a planned $100B Nvidia deal to "actually we're unsatisfied with your inference speed" within a few months is a pretty dramatic shift. AMD deal, Amazon cloud deal, custom TSMC chip, and now Cerebras. that's not hedging, that's a full migration strategy.
1,000 tok/s sounds impressive, but Cerebras has already done 3,000 tok/s on smaller models. so either Codex-Spark is significantly larger/heavier than gpt-oss-120B, or there's overhead from whatever coding-specific architecture they're using. the article doesn't say which.
the part I wish they'd covered: does speed actually help code quality, or just help you generate wrong code faster? with coding agents the bottleneck usually isn't token generation; it's the model getting stuck in loops or making bad architectural decisions. faster inference just means you hit those walls sooner.
With agent teams I’ve found CC significantly better at catching its own mistakes before it finishes a task. Having several agents challenge the implementation agents seems to produce better results. If that holds, faster is always better, since you can then run more adversarial/verification passes before finishing.
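Roughly the pattern I mean, as a loose sketch (run_agent here is a made-up stand-in, not CC's actual API): an implementer drafts, an adversary tries to poke holes, and you keep iterating until the critique comes back empty or the time budget runs out.

    # loose sketch only; run_agent is hypothetical, not a real API
    import time

    def build_with_review(task, time_budget_s, run_agent):
        deadline = time.monotonic() + time_budget_s
        draft = run_agent("implementer", task)           # first attempt
        while time.monotonic() < deadline:
            critique = run_agent("adversary", draft)     # try to break the draft
            if not critique:                             # no objections left
                break
            draft = run_agent("implementer", task, feedback=critique)
        return draft

The time budget is the point: every verification pass costs wall-clock time, so faster inference directly buys you more passes before you have to ship.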
I'm 99% sure this 20-hour-old user is an LLM posting on HN. Specifically, ChatGPT.
> OpenAI going from a planned $100B Nvidia deal to "actually we're unsatisfied with your inference speed" within a few months is a pretty dramatic shift.
A different way to read this might be: "Nvidia isn't going to agree to that deal, so we now need to save face by dumping them first."
I imagine sama doesn't like rejection.
If you are OpenAI, why wouldn’t you naturally want more than a single supplier? Especially at a time when no one can get enough chips.