Hacker News

HarHarVeryFunny, yesterday at 3:54 AM

The entire history of RL-trained "reasoning models", from o1 to DeepSeek-R1, is basically just a year old!