The paper is here - https://arxiv.org/pdf/2603.19461
This, IMO, is the biggest insight into where we're at and where we're going:
> Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability.
There's something I noticed early on with LLMs: once they unlock one capability, you can use that capability to compose with and improve other capabilities, related or not. For example, "reflexion" gets applied to coding: hey, this didn't work, let me try ... Then "tools". Then "reflexion" + "tools". And so on.
You can take workflows whose individual parts aren't so precise and make them better by composing them, letting one component influence the other. Like e2e coding gets better by checking with "gof" tools (linters, compilers, etc). Then it gets even better by adding a code review stage. Then it gets even better by adding a static analysis phase.
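That compose-and-check loop can be sketched in a few lines. This is a minimal illustration, not anyone's actual system: `stub_generate` is a hypothetical stand-in for an LLM call, and the "gof" checker here is just Python's own parser.

```python
import ast

def check(code: str) -> list[str]:
    """Deterministic 'ground truth' check: does the code even parse?"""
    try:
        ast.parse(code)
        return []
    except SyntaxError as e:
        return [f"line {e.lineno}: {e.msg}"]

def refine(generate, prompt: str, max_rounds: int = 3) -> str:
    """Compose an imprecise generator with a precise checker:
    feed diagnostics back into the generator until the checker passes."""
    feedback: list[str] = []
    code = generate(prompt, feedback)
    for _ in range(max_rounds):
        feedback = check(code)
        if not feedback:
            break
        code = generate(prompt, feedback)
    return code

# Hypothetical stand-in for an LLM: emits broken code at first,
# then "fixes" it once it sees checker feedback.
def stub_generate(prompt: str, feedback: list[str]) -> str:
    return "def f(:" if not feedback else "def f(x):\n    return x"

print(check(refine(stub_generate, "write f")))  # → []
```

The same skeleton extends naturally: swap `check` for a linter, a compiler, a review pass, or a static analyzer, and each added checker tightens the loop.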
Now we're seeing this all converge on "self improving" by combining "improving" components. And so on. This is really cool.
Agree. It's code all the way down. The key is to give agents a substrate where they can code up new capabilities and then compose them meaningfully and safely.
Larger composition, though, starts to run into typical software design problems, like dependency graphs, shared state, how to upgrade, etc.
I've been working on this front for over two years now too: https://github.com/smartcomputer-ai/agent-os/
I guess this paper is part of ICML, coming up this June. I hope to see a lot of cool papers.
> You can take workflows whose individual parts aren't so precise and make them better by composing them, letting one component influence the other. Like e2e coding gets better by checking with "gof" tools (linters, compilers, etc). Then it gets even better by adding a code review stage. Then it gets even better by adding a static analysis phase.
This is the exact point I make whenever people say LLMs aren't deterministic and therefore not useful.
Yes, they are "stochastic". But you can use them to write deterministic tools that produce machine-readable output the LLM can then consume. As you mention, you keep building more of these tools, tying them together, and eventually you have a deterministic "network" of "lego blocks" that you can run repeatably.
The whole theme of LLM dev to date has been "there's more in common than not" across LLM applications.
IF they are self-modifying, isn't there also a big risk that they introduce a bug, dumb themselves down, or break themselves? How do they get back? Are they able to restore a backup of themselves if a self-modification is bad?
Or are there two: one modifying the other and observing the results, before applying the change to itself?
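One answer to the backup question is a simple checkpoint-and-rollback loop. A minimal sketch, with hypothetical `propose`/`evaluate` stand-ins (a real agent's state would be code and config, not a dict):

```python
import copy

def self_modify_with_rollback(agent, propose, evaluate):
    """Keep a backup of the current agent; apply a proposed
    self-modification only if the evaluated score doesn't regress,
    otherwise restore the backup."""
    backup = copy.deepcopy(agent)
    baseline = evaluate(agent)
    candidate = propose(agent)
    if evaluate(candidate) >= baseline:
        return candidate   # improvement (or tie): keep the modification
    return backup          # regression: roll back to the backup

# Toy stand-ins: the agent is a dict of "skill", the proposal is buggy.
agent = {"skill": 5}
bad_propose = lambda a: {"skill": a["skill"] - 3}  # a self-inflicted bug
score = lambda a: a["skill"]
print(self_modify_with_rollback(agent, bad_propose, score))  # → {'skill': 5}
```

This only helps, of course, if `evaluate` itself stays intact; in practice the evaluator would need to live outside the part of the agent being modified.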
Agents need the ability to code but also to objectively and accurately evaluate whether changes resulted in real improvements. This requires skills with metrics and statistics. If they can make those reliable then self-improvement is basically assured, on a long enough timeline.
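The statistics part can be fairly lightweight. As an illustration, here is a permutation test comparing benchmark pass rates before and after a self-modification; the numbers are made up for the example, and a real agent would of course gather many more runs:

```python
import random

random.seed(0)  # fixed seed so the example is reproducible

def permutation_p_value(before, after, trials=10_000):
    """Two-sided permutation test: how often does a random reshuffling
    of the pooled scores produce a mean difference at least as large
    as the one actually observed?"""
    observed = sum(after) / len(after) - sum(before) / len(before)
    pooled = list(before) + list(after)
    hits = 0
    for _ in range(trials):
        random.shuffle(pooled)
        a, b = pooled[:len(before)], pooled[len(before):]
        if abs(sum(b) / len(b) - sum(a) / len(a)) >= abs(observed):
            hits += 1
    return hits / trials

before = [0.61, 0.58, 0.64, 0.60, 0.59]  # made-up pass rates, old agent
after  = [0.72, 0.70, 0.74, 0.69, 0.73]  # made-up pass rates, modified agent
print(permutation_p_value(before, after) < 0.05)  # → True
```

Only changes that clear a test like this get kept; noisy "improvements" that don't separate from chance get rolled back.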
Because submarine piloting is a going-under-water activity, improvements in holding one's breath can lead to faster submersibles.
I'm sorry, this just sounds like hype-speak. Can you provide examples?
> once they unlock one capability,
What does it mean to unlock? It's an LLM; nothing is locked. The output is only as good as the context, model, and environment. Nothing is hidden or locked.
I disagree that evaluation is always a coding task. Evaluation is scrutiny by the person who wants the thing. It's subjective. So unless you're evaluating something purely objective, such as an algorithm, I don't see how a self-contained, self-"improving" agent satisfies the subjectivity constraint, as by design you are leaving out the subject.