zozbot234 • yesterday at 11:09 PM

> We will run out of additional material to train on

This sounds a bit silly. More training will generally result in better modeling, even for a fixed amount of genuine original data. At current model sizes, it's essentially impossible to overfit to the training data, so there's no reason why we should just "stop".

Replies

_0ffh • today at 12:51 AM

You'd be surprised how quickly improvement of autoregressive language models levels off with epoch count (though, admittedly, one epoch is a LOT). Diffusion language models otoh indeed keep profiting for much longer, fwiw.

pvab3 • yesterday at 11:31 PM

I'm just talking about text generated by human beings. You can keep retraining with more parameters on the same corpus.

https://proceedings.mlr.press/v235/villalobos24a.html
