Hacker News

zamalek, yesterday at 8:31 PM

> with AI-generated content excluded from pre-training.

Though this is largely impossible these days, unless they pre-trained only on pre-AI-era data.


Replies

stymaar, yesterday at 10:16 PM

That could be. Just use pre-training for language understanding and let the post-training on synthetic data do the heavy lifting.