_fat_santa • today at 5:22 PM • 12 replies

I think what everyone underestimated was the absolute bonkers amount of compute it will take and how that compute must scale in order to keep up with larger and larger models.

Replies

darth_avocado • today at 6:09 PM

More than that, I think people overestimate how much AI will progress as you throw more compute at it. It’s the “9 women can’t deliver a baby in a month” equivalent of AI. Additional compute won’t magically give you AGI.

PaulHoule • today at 6:02 PM

I was involved in three efforts to commercialize foundation models before they were ready, back in the 2010s, so I have a good picture of how progress works in this sort of thing. The pace a lot of the industry has been talking about is unrealistic: people were disappointed with the rate of development of Apple Intelligence, but it has actually progressed at about the rate I expected.

jalev • today at 5:56 PM

Is that a problem for Meta, though? They recently announced they're going to sell their excess compute, so I imagine the actual problem is that they're resorting to that because AI isn't having nearly the effect or usage it was supposed to, and now Zuck is being a sore winner about it.

ralphington • today at 6:37 PM

It will scale inefficiently until efficiency breakthroughs occur, but it's really hard to predict when those breakthroughs will happen. Plan for the worst, but be ready and able to capitalize when they happen!

0xcafefood • today at 5:53 PM

That seems like such an easy thing to estimate with a bit of basic napkin math.

isityettime • today at 5:51 PM

I thought that's exactly what everyone anticipated? "Scaling laws" are all about exponential increases in compute and all that.

MattDamonSpace • today at 6:30 PM

Altman was trying to get $1T of infra investment years ago

dofm • today at 5:51 PM

And yet this doesn't turn out to be Meta's problem at all.

https://uk.pcmag.com/ai/165970/meta-exploring-option-to-sell...

Meta bought too many GPUs, has spare GPU capacity, and is exploring renting that capacity out.

The problem is not that the models need too much to do the job. If that were the case, Meta would not have spare capacity.

The problem is that the models currently can't be made to do the job.

maccard • today at 6:05 PM

Did we? Many of us have been saying for over a year that the amount of compute going into the models is unsustainable and that the models aren’t improving enough to justify it. “The emperor has no clothes” is true yet again.

teeray • today at 6:03 PM

They also believed they would be able to build that compute without restrictions. Between hardware costs and massive public opposition, scaling as they had anticipated is in jeopardy.

skeledrew • today at 6:30 PM

Bonkers compute only in the beginning. Over time it'll reduce as models are made more efficient.

simianwords • today at 6:08 PM

No, I don't think there was any systemic underestimation of compute. I see the opposite: every company understands that compute is important and tries to get hold of it.
