visarga, today at 6:59 PM

Scaling model size, compute, and dataset size has hit a wall. If a model is too large or needs too much compute, it becomes too expensive to use. As for the dataset, we benefited in one go from multiple decades of content accumulated online, but it has only been about 3 years since late 2022, so organic text will not keep growing exponentially past this size. Scaling on data only worked up to 50T tokens or so.