Hacker News

nolist_policy, today at 7:02 PM

Is distillation or synthetic data used during pre-training? If so, how much?