throwaway4aday • 01/22/2025

That's essentially what R1 Zero is showing:

> Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT.