aspenmartin, today at 3:45 PM

This is just not true. Maybe it would be true if you increased the problem difficulty in concert with model performance? You don't need fine-tuning for this, and you haven't for years now. Reasoning performance may be SOMEWHAT brittle for now, but again, look at how far we have come in about 2 years. Then also consider the logical next steps:

- better context compression (already happening) + memory solutions that extend the effective context length [memory _is_ compression]

- continual learning systems (likely already prototyped)

- these domains are _verifiable_ which I think just seems to confuse people. RL in verifiable domains takes you farther and farther. Training data is a bootstrap to get to a starting point, because RL from scratch is too inefficient.

Agents can already deal with large codebases and datasets, just like any SWE, DS, or researcher.

And yes! If you throw more compute at a problem you will get better solutions! But you are missing the point: for the frontier solutions, which change with every model update, you of course need to eke out as much performance as you can, which requires a large amount of test-time compute. But what you can do _without_ this keeps improving. The pattern _already in place_ is that at first you need an extreme amount of compute, then the next model iterations need far less compute to reach that same solution, and so on. The cost and compute required to perform a particular task decrease exponentially.
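To make "decrease exponentially" concrete, here is a minimal sketch. It assumes the cost of reaching a fixed solution shrinks by a constant factor with each model generation. The function name `cost_after`, the starting cost, and the factor of 0.5 are all made up for illustration; they are not measurements of any real model.

```python
def cost_after(initial_cost: float, decay_per_generation: float, generations: int) -> float:
    """Cost of reaching the same fixed solution after some model generations.

    Assumption (hypothetical): cost shrinks by a constant factor each
    generation, i.e. cost_n = initial_cost * decay_per_generation ** n.
    """
    if not 0 < decay_per_generation < 1:
        raise ValueError("decay_per_generation must be in (0, 1)")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return initial_cost * decay_per_generation ** generations


if __name__ == "__main__":
    # Illustrative numbers only: 1000 units of compute at first,
    # halving with each subsequent model generation.
    for generation in range(5):
        print(generation, cost_after(1000.0, 0.5, generation))
```

The point of the sketch is the shape, not the numbers: under a constant per-generation factor, a task that starts out needing extreme compute becomes cheap after a handful of iterations.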