
piker · yesterday at 10:54 AM

> There are certain tasks, like improving a given program for speed, for instance, where in theory the model can continue to make progress with a very clear reward signal for a very long time.

Super skeptical of this claim. Yes, it might work if I have some toy, poorly optimized Python example or maybe a sorting algorithm in ASM, but it won't work in any non-trivial case. My intuition is that the LLM will spin its wheels at a local minimum whose performance is overdetermined by millions of black-box optimizations in the interpreter or compiler, and that signal is never fed back to the LLM.
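
As a concrete picture of the loop being debated: propose a variant, benchmark it, keep it only if it measures faster. This is just a sketch; propose_variant() stands in for a hypothetical LLM call, and the only reward is wall-clock time, so whatever the interpreter or compiler does under the hood stays invisible to the model.

    import timeit

    def benchmark(src: str, runs: int = 5) -> float:
        # Best-of-N wall-clock time for executing the candidate source.
        return min(timeit.repeat(stmt=src, number=1000, repeat=runs))

    def optimize(src: str, propose_variant, iterations: int = 50) -> str:
        best_src, best_time = src, benchmark(src)
        for _ in range(iterations):
            candidate = propose_variant(best_src)   # hypothetical LLM call
            t = benchmark(candidate)
            if t < best_time:                       # hill-climb on raw timings only
                best_src, best_time = candidate, t
        return best_src  # nothing here prevents stalling at a local minimum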


Replies

NitpickLawyer · yesterday at 5:08 PM

> but this won’t work in any non-trivial case

Earlier this year Google shared that one of their projects (I think it was AlphaEvolve) found an optimisation in their stack that sped up their real-world training runs by 1%. As we're talking about Google here, we can be pretty sure it wasn't some trivial Python trick that they missed. Anyhow, at ~$100M per training run, that's a $1M saving right there. Each and every time they run a training run!
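
Back-of-envelope check of that figure (the ~$100M per-run cost is the commenter's assumption, not a published number):

    cost_per_run = 100_000_000            # assumed ~$100M per training run
    speedup = 0.01                        # 1% faster
    savings = cost_per_run * speedup
    print(f"${savings:,.0f} saved per run")   # $1,000,000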

And in the past month Google also shared another "agentic" workflow where they had Gemini 2.5 Flash (their previous-gen "small" model) work autonomously on migrating codebases to support the aarch64 architecture. They found that ~30% of the projects worked flawlessly end-to-end. Whatever costs they save from switching to ARM will translate into real-world $ saved (at Google's scale, those add up quickly).
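
A rough sketch of what such an autonomous migration loop could look like; the make targets and the ask_model() call are assumptions for illustration, not Google's actual pipeline.

    import subprocess

    def builds_and_tests_pass(project_dir: str) -> bool:
        # Cross-build and run the test suite for aarch64; True only if both succeed.
        build = subprocess.run(["make", "ARCH=aarch64"], cwd=project_dir)
        if build.returncode != 0:
            return False
        tests = subprocess.run(["make", "test", "ARCH=aarch64"], cwd=project_dir)
        return tests.returncode == 0

    def migrate(project_dir: str, ask_model, max_attempts: int = 5) -> bool:
        # Let the model patch the project until the aarch64 build and tests pass.
        for _ in range(max_attempts):
            if builds_and_tests_pass(project_dir):
                return True                # counts toward the "worked end-to-end" bucket
            ask_model(project_dir)         # hypothetical: model edits files in place
        return builds_and_tests_pass(project_dir)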

andy99 · yesterday at 11:00 AM

There was a discussion the other day where someone asked Claude to improve a codebase 200x: https://news.ycombinator.com/item?id=46197930
