Hacker News

aspenmartin · last Friday at 10:36 PM

I really wish more people skeptical of AI capabilities would read about scaling laws. Lilian is always marvelous at giving a deep overview of the technical side, but the whole point is this: there are scaling laws, and they continue to hold. They have been a huge basis for predictions about AI capabilities for the past five years or so.


Replies

an0malous · today at 2:45 AM

Why should the skeptics be reading it? The scaling laws show diminishing returns on more training data and larger models.

From the Kaplan scaling laws paper:

> We have observed consistent scalings of language model log-likelihood loss with non-embedding parameter count N, dataset size D, and optimized training computation C_min, as encapsulated in Equations (1.5) and (1.6). Conversely, we find very weak dependence on many architectural and optimization hyperparameters. Since scalings with N, D, C_min are power-laws, there are diminishing returns with increasing scale.

So the skeptics are right to be skeptical of LLMs being all you need for continued advancement in this space. It seems like the believers are the ones who need to learn about the scaling laws.
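
To make the "diminishing returns" point concrete, here is a rough sketch of a pure power law in parameter count, L(N) = (N_c / N)^alpha. The function names are made up for illustration, and the constants (alpha = 0.076, N_c = 8.8e13) are assumptions, roughly the magnitude I recall for the parameter-count fit, so treat them as placeholders rather than quoted results.

```python
import math

# Assumed, illustrative constants for L(N) = (N_c / N) ** alpha.
N_C = 8.8e13
ALPHA = 0.076

def power_law_loss(n, n_c=N_C, alpha=ALPHA):
    """Loss predicted by a pure power law in parameter count n."""
    return (n_c / n) ** alpha

def gains_per_step(sizes):
    """Absolute loss reduction between consecutive model sizes."""
    losses = [power_law_loss(n) for n in sizes]
    return [prev - cur for prev, cur in zip(losses, losses[1:])]

# Model sizes from 1e6 to 1e11 parameters, one decade apart.
sizes = [10 ** k for k in range(6, 12)]
for n, gain in zip(sizes[1:], gains_per_step(sizes)):
    print(f"N = {n:.0e}: loss {power_law_loss(n):.3f}, gain from last 10x {gain:.3f}")
```

Under these assumptions every 10x in parameters multiplies the loss by the same factor (10^-alpha), so the absolute improvement bought by each additional 10x keeps shrinking.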

FromTheFirstIn · last Friday at 11:42 PM

And sitting right next to the data and compute factors in every cross-entropy loss equation is the entropy of the language, which is just a fixed constant. There's such a hard floor on what cross-entropy loss training can reach, and I never hear it come up!
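
A quick sketch of what that constant term does, assuming a loss of the form L(N, D) = E + A / N^alpha + B / D^beta, where E stands in for the entropy of the language. The function names and every constant below are hypothetical placeholders chosen for illustration, not fitted values from any paper.

```python
# Hypothetical constants for L(N, D) = E + A / N**alpha + B / D**beta.
E = 1.7                 # irreducible term (entropy of the language), assumed
A, B = 400.0, 400.0     # assumed scale factors
ALPHA, BETA = 0.34, 0.28  # assumed exponents

def loss_with_floor(n, d):
    """Loss with an additive irreducible term E that scaling cannot remove."""
    return E + A / n ** ALPHA + B / d ** BETA

def reducible_loss(n, d):
    """The part of the loss that more parameters or data can still shrink."""
    return loss_with_floor(n, d) - E

# Scale parameters (n) and tokens (d) together and watch the loss approach E.
for k in (8, 10, 12, 14):
    n = d = 10.0 ** k
    print(f"N = D = 1e{k}: loss {loss_with_floor(n, d):.4f}, reducible {reducible_loss(n, d):.4f}")
```

With a form like this, the reducible part goes to zero as N and D grow, but the total loss never drops below E no matter how much is spent.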

openclawclub · today at 7:23 AM

[flagged]