Hacker News

_fat_santa today at 5:22 PM (12 replies)

I think what everyone underestimated was the absolutely bonkers amount of compute it would take, and how that compute must scale to keep up with larger and larger models.


Replies

darth_avocado today at 6:09 PM

More than that, I think people overestimate how much AI will progress as you throw more compute at it. It’s the “9 women can’t deliver a baby in a month” equivalent of AI. Additional compute won’t magically give you AGI.

PaulHoule today at 6:02 PM

I was involved in three efforts to commercialize foundation models in the 2010s, before they were ready, so I have a good picture of how progress works in this sort of thing. The pace a lot of the industry has been talking about is unrealistic. For example, people were disappointed with the rate of development of Apple Intelligence, but it has actually progressed at about the rate I expected.

jalev today at 5:56 PM

Is that a problem for Meta, though? They recently announced they're going to sell their excess compute, so I imagine the actual problem is that they're resorting to this because AI isn't having nearly the effect or usage it was supposed to, and now Zuck is being a sore winner about it.

ralphington today at 6:37 PM

It will scale inefficiently until efficiency breakthroughs occur, but it's really hard to predict when those breakthroughs will happen. Plan for the worst, but be ready and able to capitalize when they do!

0xcafefood today at 5:53 PM

That seems like such an easy thing to estimate with a bit of basic napkin math.

isityettime today at 5:51 PM

I thought that's exactly what everyone anticipated? "Scaling laws" are all about exponential increases in compute and all that.

MattDamonSpace today at 6:30 PM

Altman was trying to get $1T of infra investment years ago.

dofm today at 5:51 PM

And yet this doesn't turn out to be Meta's problem at all.

https://uk.pcmag.com/ai/165970/meta-exploring-option-to-sell...

Meta bought too many GPUs, has spare GPU capacity, and is exploring renting that capacity out.

The problem is not that the models need too much compute to do the job. If that were the case, Meta would not have spare capacity.

The problem is that the models currently can't be made to do the job.

maccard today at 6:05 PM

Did we? Many of us have been saying for over a year that the amount of compute going into the models is unsustainable and that the models aren’t improving enough to justify it. "The emperor has no clothes" is true yet again.

teeray today at 6:03 PM

They also believed they would be able to build that compute without restrictions. Between hardware costs and massive public opposition, scaling as they had anticipated is in jeopardy.

skeledrew today at 6:30 PM

Bonkers compute only in the beginning. Over time it'll come down as models are made more efficient.

simianwords today at 6:08 PM

No, I don't think there was any systemic underestimation of compute. I see the opposite: every company understands compute is important and tries to get hold of it.