Hacker News

agoodusername63 yesterday at 6:50 PM

Conversations about the costs of inference never consider the reality that API pricing is significantly higher than the operating costs.

Nor do they ever consider that the cost of datacenter-hosted inference has to crash when the bubble pops: hardware vendors will no longer be able to fill orders at the sky-high prices that demand created, and the hyperscalers will no longer be able to keep things running near capacity at those prices.

All of which makes the ROI math for implementing AI look much different.

Has everybody forgotten how much money Nvidia, TSMC, and all the hyperscalers are making, today, in pure profit? The costs of inference are high because we're in a bubble.


Replies

bwestergard yesterday at 6:59 PM

I think many of these problems still arise even if inference is effectively free in monetary terms to the end user. In many economic processes, the time it takes to get the final, correct answer is the major driver of profitability.