Hacker News

behnamoh · today at 5:54 PM

The same author thought there would be no scaling walls: https://stochasm.blog/posts/scaling_post/


Replies

visarga · today at 6:59 PM

Scaling the model size, the compute, and the dataset all hit walls. A model that is too large, or that needs too much compute, becomes too expensive to use. And the dataset: we benefited in one go from multiple decades of content accumulated online, but since late 2022 only about three years have passed, so organic text does not grow exponentially past this point; it only worked up to roughly 50T tokens.
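The arithmetic behind this data-wall argument can be sketched roughly. The ~50T-token figure comes from the comment; the 25-year accumulation span and the assumption of roughly linear growth are illustrative, not measured:

```python
# Back-of-envelope sketch of the data-wall argument.
# Illustrative assumptions: ~50T tokens of usable organic web text,
# accumulated over ~25 years at a roughly linear rate.
TOTAL_TOKENS_T = 50.0      # assumed stock of organic text, in trillions
ACCUMULATION_YEARS = 25.0  # assumed span of online content accumulation

tokens_per_year = TOTAL_TOKENS_T / ACCUMULATION_YEARS  # ~2T tokens/year

# Since late 2022, only ~3 years of new organic text have been produced.
new_tokens = tokens_per_year * 3          # ~6T tokens
growth = new_tokens / TOTAL_TOKENS_T      # ~12% more data

print(f"~{new_tokens:.0f}T new tokens, only {growth:.0%} growth in the stock")
# Scaling a model generation typically wants several times more data;
# linear accumulation cannot deliver that, hence the wall.
```

Under these assumptions the stock of organic text grows only ~12% over three years, nowhere near the multiplicative increases that each new model generation would need.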