Hacker News

0xbadcafebee (last Friday at 2:13 PM)

This isn't the first time the industry has dealt with scarcity; there have been four supply-chain crunches since 2000: the post-dotcom bust, the CDMA scarcity, the HDD/flash scarcity, and the pandemic scarcity.

The scarcity isn't long-term. As with all manufactured products, makers will ramp up production and flood the market with hardware, buyers will over-order, and prices will drop. Boom and bust.

We're also still in the bubble. Eventually markets will no longer bear the lack of productivity/profit (as AI isn't really that useful), and there will be divestment, with more hardware hitting the market as companies implode. Nobody is making 10x more from AI; they're just investing in it hoping for those profits, which so far I don't think anyone has seen, other than the companies selling AI to other companies.

But more importantly, models and inference keep getting more efficient, so less hardware will do more in the future. We already have multiple models good enough for on-device, small-scale work. In 5 years consumer chips and model inference will be so good you won't need a server for SOTA. When that happens, most of the billions invested in SOTA companies will disappear overnight, leaving a sizeable hole in the market.


Replies

topherhunt (last Friday at 8:03 PM)

> In 5 years consumer chips and model inference will be so good you won't need a server for SOTA.

Naw man, you crazy. If you tell me that in 5 years, consumer chips will be so good that I can run GPT-5.4-level AI on my phone, I'd find that plausible (I buy cheap phones). If you're telling me that in 5 years we won't need _servers_ because our _phones and/or desktops_ will be powerful enough to run the biggest, newest LLMs in existence, I question your judgment; that prediction shows a deep lack of imagination about how massively compute-hungry SOTA models will get.

The valuable things to do with inference will keep being a server niche, because they'll keep being 1-2 orders of magnitude (OOM) more compute-hungry than whatever consumer hardware can handle. Like gaming: my laptop can run games from 2015 at max settings no problem, but the games actually worth getting excited about in 2026 still melt a $2k GPU, because whatever headroom the hardware gains, developers immediately spend on ray tracing and Nanite and modelling individual skin cells or whatever. I don't see any plausible reason to expect the ceiling on "valuable server-side compute" or "inference capacity" to rise any more slowly than on-device capability is rising.
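That 1-2 OOM gap is easy to make concrete with back-of-envelope weight-memory arithmetic. A sketch, with the caveat that the parameter counts below are hypothetical round numbers for illustration (frontier models' real sizes aren't public):

```python
GIB = 1024**3  # bytes per GiB

def weight_memory_gib(params: float, bytes_per_param: float) -> float:
    """GiB needed just to hold the model weights at a given quantization width."""
    return params * bytes_per_param / GIB

# Hypothetical sizes, both 4-bit quantized (0.5 bytes/param):
phone_model = weight_memory_gib(8e9, 0.5)      # ~4 GiB: an 8B model fits in phone RAM
frontier_model = weight_memory_gib(2e12, 0.5)  # ~930 GiB: a 2T model needs a server rack

print(f"on-device 8B model: ~{phone_model:.1f} GiB")
print(f"frontier 2T model:  ~{frontier_model:.0f} GiB")
print(f"gap:                ~{frontier_model / phone_model:.0f}x")
```

Even ignoring KV cache and compute throughput, weights alone put the hypothetical frontier model two-plus orders of magnitude past any phone, which is the gap the gaming analogy is pointing at.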

My assumption is that in 2031, SOTA top-intelligence AI will be hosted on cloud servers just as it is now, offering dirt-cheap access to capabilities we can't even dream of today, while your Android runs some open-source GPT-5+ equivalent.
