Hacker News

zozbot234 · yesterday at 11:09 PM

> We will run out of additional material to train on

This sounds a bit silly. More training will generally result in better modeling, even for a fixed amount of genuine original data. At current model sizes it's essentially impossible to overfit to the training data, so there's no reason why we should just "stop".
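
One way to sanity-check the "more passes over fixed data keeps helping" intuition is to train a tiny model for many epochs on a fixed corpus and watch held-out loss alongside training loss. This is purely an illustrative sketch: the corpus, model, and hyperparameters below are all made up for the example, not anyone's actual setup.

```python
# Toy illustration: repeated epochs over a fixed corpus, tracking train vs. held-out loss.
# Everything here (corpus, model, hyperparameters) is invented for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

corpus = "the quick brown fox jumps over the lazy dog " * 50  # the fixed "original data"
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in corpus])

split = int(0.9 * len(data))
train_ids, val_ids = data[:split], data[split:]

class Bigram(nn.Module):
    """Predict the next character from the current one via a logit lookup table."""
    def __init__(self, vocab_size):
        super().__init__()
        self.logits = nn.Embedding(vocab_size, vocab_size)

    def forward(self, x):
        return self.logits(x)

def loss_on(model, ids):
    x, y = ids[:-1], ids[1:]          # next-character prediction targets
    return F.cross_entropy(model(x), y)

model = Bigram(len(chars))
opt = torch.optim.AdamW(model.parameters(), lr=1e-2)

for epoch in range(1, 51):            # many full passes over the same training split
    opt.zero_grad()
    loss = loss_on(model, train_ids)
    loss.backward()
    opt.step()
    if epoch % 10 == 0:
        with torch.no_grad():
            val = loss_on(model, val_ids).item()
        print(f"epoch {epoch:3d}  train {loss.item():.3f}  val {val:.3f}")
```

On a toy like this both curves just converge; the interesting question at real scale is how quickly the held-out curve flattens as epochs accumulate.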


Replies

_0ffh · today at 12:51 AM

You'd be surprised how quickly improvement of autoregressive language models levels off with epoch count (though, admittedly, one epoch is a LOT). Diffusion language models, otoh, do keep benefiting from extra epochs for much longer, fwiw.
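
For a quantitative picture of what "levels off" can look like, one rough way to model it is that each extra pass over the same corpus contributes a rapidly shrinking amount of "effective" new data. The sketch below assumes a saturating-exponential form and a ballpark decay constant, loosely in the spirit of the data-constrained scaling fits of Muennighoff et al. (2023); both the form and the constant `R_STAR` are assumptions for illustration, not numbers from this thread.

```python
# Illustrative only: a saturating-returns model of repeated epochs.
# The functional form and R_STAR are assumptions (roughly the shape of the
# data-constrained scaling fits in Muennighoff et al. 2023), not exact values.
import math

R_STAR = 15.0          # assumed decay constant: roughly how many repeats stay useful
UNIQUE_TOKENS = 1.0    # normalize unique data to 1 so the output reads as a multiplier

def effective_data(repeats: int) -> float:
    """Effective unique-data multiplier after `repeats` extra passes over the corpus."""
    return UNIQUE_TOKENS * (1 + R_STAR * (1 - math.exp(-repeats / R_STAR)))

for r in [0, 1, 2, 4, 8, 16, 32, 64]:
    print(f"{r:3d} extra epochs -> ~{effective_data(r):.2f}x effective unique data")
```

Under these assumptions the multiplier saturates well below 16x no matter how many repeats you add, which is one way to read "levels off quickly with epoch count".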

pvab3 · yesterday at 11:31 PM

I'm just talking about text generated by human beings. You can keep retraining with more parameters on the same corpus.

https://proceedings.mlr.press/v235/villalobos24a.html
