HeavyStorm, today at 11:57 AM

There's no "just" in RL. Fine-tuning is very important and can make a big difference.


Replies

merlindru, today at 1:33 PM

apparently GPT-5 uses the same pretrain as 4o did, hah