
boole1854 • 01/21/2025

In their paper, they explain that "in the case of math problems with deterministic results, the model is required to provide the final answer in a specified format (e.g., within a box), enabling reliable rule-based verification of correctness. Similarly, for LeetCode problems, a compiler can be used to generate feedback based on predefined test cases."

Basically, they rely on an external source of truth that checks whether each of the model's answers is correct.
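To make that concrete, here is a rough sketch of what such rule-based checks could look like. This is my own illustration, not code from the paper: the function names (`extract_boxed`, `math_reward`, `code_reward`), the `\boxed{...}` convention, the exact-string comparison, and the 0/1 reward values are all assumptions. The paper describes using a compiler for LeetCode problems; for brevity this sketch instead calls a Python function directly against the test cases.

```python
import re
from typing import Callable, Optional, Sequence, Tuple


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    # Assumption: answers contain no nested braces.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def math_reward(model_output: str, reference: str) -> float:
    """1.0 if the boxed answer exactly matches the reference string, else 0.0."""
    answer = extract_boxed(model_output)
    return 1.0 if answer is not None and answer == reference.strip() else 0.0


def code_reward(
    candidate: Callable[[int], int],
    test_cases: Sequence[Tuple[int, int]],
) -> float:
    """1.0 if the candidate passes every (input, expected) pair, else 0.0."""
    for arg, expected in test_cases:
        try:
            if candidate(arg) != expected:
                return 0.0
        except Exception:
            # A crash on any test case counts as a failure.
            return 0.0
    return 1.0
```

For example, `math_reward("So the answer is \\boxed{42}.", "42")` gives a reward because the answer is in the required format, while an unboxed "The answer is 42" does not. The point is that nothing here needs a learned judge: the check is a fixed rule applied to the model's output.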
