Hacker News

Sharlin 05/15/2025

Proving the correctness of the “improvements” is another thing entirely, though.


Replies

NitpickLawyer 05/15/2025

I agree. At first the problems that you try to solve need to be verifiable.

But there's progress on this on many fronts. There's been increased interest in provers (natural language to Lean, for example). There's also been progress in LLM-as-a-judge on open-ish problems. And it seems that RL can help with extracting step rewards from sparse-reward domains.