justinl33 • 01/20/2025

> This is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.

This is a noteworthy achievement.

throwaway314155 • 01/21/2025

Excuse my ignorance. What does SFT refer to here?
