Hacker News

justinl33 01/20/2025

> This is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.

This is a noteworthy achievement.


Replies

throwaway314155 01/21/2025

Excuse my ignorance. What does SFT refer to here?
