Hacker News

rafaelmn, today at 9:15 AM

I think I've heard multiple times that a large % of training compute for SoTA models is inference to generate training tokens. This is bound to happen with RL training.