roughly • today at 1:30 AM • 8 replies

I am of the opinion that Nvidia's hit the wall with their current architecture in the same way that Intel has historically with its various architectures - their current generation's power and cooling demands require the construction of entirely new datacenters with different architectures, which is going to blow out the economics on inference (GPU + datacenter + power plant + nuclear fusion research division + lobbying for datacenter land + water rights + ...).

The story with Intel around these times was usually that AMD or Cyrix or ARM or Apple or someone else would come around with a new architecture that was a clear generation jump past Intel's, and most importantly seemed to break the thermal and power ceilings of the Intel generation (at which point Intel typically fired their chip design group, hired everyone from AMD or whoever, and came out with Core or whatever). Nvidia effectively has no competition, or hasn't had any - nobody's actually broken the CUDA moat, so neither Intel nor AMD nor anyone else is really competing for the datacenter space, so they haven't faced any actual competitive pressure against things like power draws in the multi-kilowatt range for the Blackwells.

The reason this matters is that LLMs are incredibly nifty often useful tools that are not AGI and also seem to be hitting a scaling wall, and the only way to make the economics of, eg, a Blackwell-powered datacenter make sense is to assume that the entire economy is going to be running on it, as opposed to some useful tools and some improved interfaces. Otherwise, the investment numbers just don't make sense - the gap between what we see on the ground of how LLMs are used and the real but limited value add they can provide and the actual full cost of providing that service with a brand new single-purpose "AI datacenter" is just too great.

So this is a press release, but any time I see something that looks like an actual new hardware architecture for inference, and especially one that doesn't require building a new building or solving nuclear fusion, I'll take it as a good sign. I like LLMs, I've gotten a lot of value out of them, but nothing about the industry's finances add up right now.

Replies

nl • today at 2:02 AM

> I am of the opinion that Nvidia's hit the wall with their current architecture

Based on what?

Their measured performance on things people care about keeps going up, and their software stack keeps getting better and unlocking more performance on existing hardware.

Inference tests: https://inferencemax.semianalysis.com/

Training tests: https://www.lightly.ai/blog/nvidia-b200-vs-h100

https://newsletter.semianalysis.com/p/mi300x-vs-h100-vs-h200... (only H100, but vs AMD)

> but nothing about the industry's finances add up right now

Is that based just on the HN "it is lots of money so it can't possibly make sense" wisdom? Because the released numbers seem to indicate that inference providers and Anthropic are doing pretty well, and that OpenAI is really only losing money on inference because of the free ChatGPT usage.

Further, I'm sure most people heard the mention of an unnamed enterprise paying Anthropic $5000/month per developer on inference(!!). If a company is that cost-insensitive, is there any reason why Anthropic would bother to subsidize them?

linuxftw • today at 3:00 AM

Based on conversations I've had with some people managing GPUs at scale in the datacenters, inference is an afterthought. There is a gold rush for training right now, and that's where these massive clusters are being used.

LLMs are probably a small fraction of the overall GPU compute in use right now. I suspect in the next 5 years we'll have full Hollywood movies (or at least their special effects) generated entirely by AI.

segmondy • today at 1:55 AM

> The reason this matters is that LLMs are incredibly nifty often useful tools that are not AGI and also seem to be hitting a scaling wall

I don't know who needs to hear this, but the real breakthrough in AI that we have had is not LLMs, but generative AI. LLMs are but one specific case. Furthermore, we have hit absolutely no walls. Go download a model from Jan 2024, another from Jan 2025 and one from this year and compare. The difference in how good they have gotten is exponential.

kuil009 • today at 1:32 AM

Thanks for this. It put into words a lot of the discomfort I’ve had with the current AI economics.

re-thc • today at 2:01 AM

> I am of the opinion that Nvidia's hit the wall with their current architecture

Not likely since TSMC has a new process with big gains.

> The story with Intel

Was that their fab couldn’t keep up, not their designs.

flyinglizard • today at 1:51 AM

You’re right, but Nvidia enjoys an important advantage Intel had always used to mask their sloppy design work: the supply chain. You simply can’t source HBMs at scale because Nvidia bought everything. TSMC N3 is likewise fully booked, and between Apple and Nvidia their 18A is probably already far gone. And if you want to connect your artisanal inference hardware together, then congratulations: Nvidia is the leader here too, and you WILL buy their switches.

As for the business side, I’ve yet to hear of a transformative business outcome due to LLMs (it will come, but not there yet). It’s only the guys selling the shovels that are making money.

This entire market runs on sovereign funds and cyclical investing. It’s crazy.

bigyabai • today at 1:35 AM

> but nothing about the industry's finances add up right now.

The acquisitions do. Remember Groq?

petesergeant • today at 1:43 AM

> nothing about the industry's finances add up right now

Nothing about the industry’s finances, or about Anthropic and OpenAI’s finances?

I look at the list of providers on OpenRouter for open models, and I don’t believe all of them are losing money. FWIW Anthropic claims (iirc) that they don’t lose money on inference. So I don’t think the industry or the model of selling inference is what’s in trouble there.

I am much more skeptical of Anthropic and OpenAI’s business model of spending gigantic sums on generating proprietary models. Latest Claude and GPT are very very good, but not better enough than the competition to justify the cash spend. It feels unlikely that anyone is gonna “winner takes all” the market at this point. I don’t see how Anthropic or OpenAI survive as independent entities, or how current owners don’t take a gigantic haircut, other than by Sam Altman managing to do something insane like reverse acquiring Oracle.

EDIT: also feels like Musk has shown how shallow the moat is. With enough cash and access to exceptional engineers, you can magic a frontier model out of the ether, however much of a douche you are.
