Hacker News

oezi · last Sunday at 8:45 PM

One question I've been wondering about regarding the open models released by big labs is how much more they could improve with additional training. GPT-OSS was trained for 2.1M hours; how much score improvement could we see at double that?
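(A rough way to think about that question: under a Chinchilla-style compute scaling law, doubling training compute shrinks the reducible loss by a constant factor. The sketch below is illustrative only; the functional form is an assumption, and the exponent and constants are placeholders, not anything published for GPT-OSS.)

```python
# Illustrative only: a Chinchilla-style power law, loss(C) = L_inf + a * C**(-alpha).
# None of these constants are published for GPT-OSS; they are placeholder values
# chosen to show the shape of diminishing returns when training compute doubles.
L_INF = 1.7   # assumed irreducible loss
A = 2.0       # assumed scale coefficient
ALPHA = 0.05  # assumed compute exponent (Chinchilla-scale fits land near 0.05)

def loss(hours: float) -> float:
    """Predicted pretraining loss at a given training-compute budget (in hours)."""
    return L_INF + A * hours ** -ALPHA

c1 = 2.1e6   # the 2.1M training hours mentioned above
c2 = 2 * c1  # the hypothetical doubled budget
print(f"loss at 2.1M hours: {loss(c1):.4f}")  # ~2.666
print(f"loss at 4.2M hours: {loss(c2):.4f}")  # ~2.633
# Doubling compute scales the reducible term by 2**-ALPHA ~= 0.966, i.e. only
# a ~3.4% reduction: the power law's diminishing returns.
```

The mapping from loss to benchmark score is nonlinear, so even a small loss delta could matter near a task's difficulty threshold, but under this assumption doubling compute buys percent-level loss gains rather than a step change.)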


Replies

ModelForge · last Sunday at 9:52 PM

I think GPT-4.5 was potentially the original GPT-5 model, one that was larger and pre-trained on more data. Too bad it was too expensive to deploy at scale, so we never saw the RL'd version.

poorman · last Sunday at 8:52 PM

As we saw with GPT-5, RL-based training doesn't scale forever.
