HeavyStorm • today at 11:57 AM

There's no "just" in RL. Fine-tuning is very important and can make a big difference.

merlindru • today at 1:33 PM

apparently GPT-5 uses the same pretrain as 4o did, hah