OpenAI and Broadcom unveil LLM-optimized inference chip

87 points • by meetpateltech • today at 1:14 PM • 29 comments

Comments

shellcromancer • today at 2:36 PM

Probably obvious but still omitted in the OpenAI post: chips are being made by TSMC [1]. Wasn't sure if Intel got it.

1. https://www.investing.com/news/stock-market-news/openai-unve...

Pretty huge move. Google and their TPUs are looking infinitely more prescient as I think they are on their 7th generation, along with the offshoots it inspired like the LPU and even others, perhaps like Cerebras and their Wafer Scale Engine.

However, based on first impressions, it seems like this is meant for the inference side, not training, which is also an interesting choice.

theowaway213456 • today at 4:21 PM

This seems like more competition for Cerebras? Am I understanding correctly?

kilroy123 • today at 2:32 PM

I hope to see something like this, but in a small form factor like the NVIDIA spark.

I want a super fast LLM that is Opus 4.6+, like, in ability.

v5v3 • today at 4:04 PM

>designed for initial deployment by the end of 2026 and expanding in the years ahead,

So after the IPO and will be featured heavily in the IPO sales brochure as a future promise?

I'm sceptical over any pre-IPO announcements.

satvikpendem • today at 3:58 PM

I'm assuming they used LLMs to (help humans) do custom circuit design. Even pre-LLM there were various computer optimizations that didn't require humans, like genetic algorithms. It'd be cool to see a paper on how they did it.

Legend2440 • today at 3:51 PM

The only surprising thing about this is that they didn't do it three years ago.

gravypod • today at 4:02 PM

I wonder how close OpenAI is getting to using the memory they purchased. Are they planning to stack a huge amount of HBM2 into these chips?

dadoum • today at 3:19 PM

> May we scale smoothly, exponentially and uneventfully through A[SI]

That sentence sounds weird to me. I can't really put my finger on why, maybe the combination of adverbs, or just the fact of writing the desire to scale as a company so directly. It feels (to me) like openly claiming their selfish goals. Or maybe I am just misinterpreting and they are referring to the whole of humanity as "We" (but knowing Broadcom's and, to a lesser extent, OpenAI's doings, I am not convinced).

fennecbutt • today at 4:12 PM

I mean I'd love to be able to buy something like the 17k tps taalas chip as a pcie or m.2.

Imagine when we can roar along at that speed, low power. Can just have the model reason for a while about anything and everything. It reminds me of the "race to idle" for mcus etc.

qsxfthnkp2322 • today at 3:28 PM

aw shucks nvda has some spicy competition

Make sure you all use that fancy ñ

jabedude • today at 3:58 PM

how much does this chip help with inference speed?

fibonacci112358 • today at 2:32 PM

So this is where all the memory they bought is going to.

jerojero • today at 2:56 PM

One thing I don't like about California-based companies is how cringe the names always are.

"Jalapeño" is such a bad name, having an "ñ" already makes it difficult and annoying to deal with in so many little ways. Good luck with that.

But also, there's the sort of "yes, let's use Mexican-related things because we're California" thought that I just really hate. I don't know, it's like Corporate Memphis to me. You see a product like this, you know it's an uppity California-based firm that came up with it.
