Hacker News

Workaccount2 · 10/11/2024

>if your methodology is flawed you aren't gonna get any more significant returns.

The problem with this statement is that predictions made about scaling 5 years ago have held true[1]. We keep adding parameters, adding compute, and the models keep getting more capable.

The flaws of LLMs from 2024 are not what is relevant, just as the flaws of LLMs from 2021 were not relevant. What is relevant is the rate of change, and the lack of evidence that things won't continue on this steep incline. Especially if you consider that GPT4 was sort of a preview model that motivated big money to make ungodly investments to see how far we can push this. Those models will start to show up over the next 2 years.

If they break the trend and the scaling flops, then I think a lot of air is gonna blow out of the bubble.

[1] https://arxiv.org/pdf/2001.08361
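For concreteness, the parameter-count trend reported in [1] is a simple power law in model size. A minimal sketch, using the paper's fitted constants purely for illustration (the exact numbers matter less than the functional form):

```python
# Sketch of the parameter-count scaling law L(N) = (N_c / N)**alpha_N from [1]
# (Kaplan et al., 2020). Constants are the paper's fitted values, shown here
# only to illustrate the shape of the curve.
N_C = 8.8e13      # fitted "critical" parameter count
ALPHA_N = 0.076   # fitted power-law exponent

def predicted_loss(n_params: float) -> float:
    """Predicted cross-entropy test loss for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

# Each 10x increase in parameters multiplies the predicted loss by
# 10**-ALPHA_N, roughly 0.84, i.e. a constant ~16% relative improvement
# per order of magnitude.
for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> predicted loss {predicted_loss(n):.3f}")
```

This is what "predictions have held true" means in practice: the measured losses of successively larger models have kept landing on this curve.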


Replies

vrighter · 10/11/2024

we added a lot of parameters.

We added a LOT of data.

The resulting models have become only slightly better. And they still have all of their old problems.

I think this is proof that scaling doesn't work. It's not like we just doubled the sizes; they grew by orders of magnitude, yet the improvements get smaller each time. And they've already run out of useful data.
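The diminishing-returns intuition is actually consistent with the power-law fit in [1] itself: the relative improvement per 10x is constant, but the absolute loss reduction shrinks with every step. A short sketch (constants from the paper's fit, illustrative only):

```python
# Under a power law L(N) = (N_c / N)**alpha, each successive 10x in
# parameters buys a smaller *absolute* loss reduction, even though the
# relative reduction per 10x stays constant.
N_C, ALPHA = 8.8e13, 0.076  # illustrative fitted constants from [1]

def loss(n: float) -> float:
    return (N_C / n) ** ALPHA

sizes = [1e9, 1e10, 1e11, 1e12]
for small, big in zip(sizes, sizes[1:]):
    print(f"{small:.0e} -> {big:.0e}: loss drops by {loss(small) - loss(big):.3f}")
```

So whether the shrinking gains count as "scaling works" or "scaling doesn't work" depends on whether you read the curve in relative or absolute terms.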

dev1ycan · 10/11/2024

They are quite literally asking for trillions of dollars and even nuclear-powered data centers; I'm pretty sure we've gotten to the point where it's not sustainable.

yoav_hollander · 10/12/2024

Exactly. I was assuming that by now the default answer to "LLMs sort-of do this, but not very well" would be "OK, wait a few months".