marcyb5st, today at 7:18 AM

Yeah, big lol on the Recursive Self-Improvement.

I mean, first you need an "Agent" that is actually capable of really long-horizon tasks without getting stuck in tool loops or having its context rot. Second, each trial (i.e. a model fully trained to convergence) costs millions and takes O(weeks). Even at big AI labs you can probably run only 1 or 2 of those experiments in parallel, since the hardware is scarce and the runs are costly, as mentioned. Assuming this agent needs something like a hundred tries just to show some improvement, we are looking at years.
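To put rough numbers on that, here is a back-of-the-envelope sketch. The 3 weeks per trial is purely an assumption standing in for "O(weeks)", and `wall_clock_years` is just a helper name made up for this illustration. The hundred tries and the 1 or 2 parallel runs are the figures from above.

```python
import math

def wall_clock_years(trials: int, parallel: int, weeks_per_trial: float) -> float:
    """Years needed if trials run in batches of `parallel`, each batch taking `weeks_per_trial`."""
    batches = math.ceil(trials / parallel)   # trials in a batch finish together
    return batches * weeks_per_trial / 52.0  # 52 weeks per year

# Assumed: 3 weeks per fully trained model (illustrative, not a measured figure).
two_in_parallel = wall_clock_years(trials=100, parallel=2, weeks_per_trial=3)
one_at_a_time = wall_clock_years(trials=100, parallel=1, weeks_per_trial=3)

print(f"2 in parallel: {two_in_parallel:.1f} years")
print(f"1 at a time:   {one_at_a_time:.1f} years")
```

Under those assumptions you land somewhere between roughly 3 and 6 years, and that is before counting failed runs or the agent's own thinking time.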

And you can't early-stop training for unpromising candidates, because of "emergent capabilities": at some point you might get a big drop in loss even if the model has been plateauing for a while. You can't really scale models down to run trials quickly either, for the same reason. In fact, you might end up with a model that is great at converging in small trials (fewer params, fewer tokens) but incapable of developing those unexpected traits. And this will likely happen, because you have created a sort of evolutionary pressure in that direction: if you are good at learning in the first few epochs, you "survive" and get to pass your traits down to future trials.
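A toy sketch of that selection effect, in case it helps. Both loss curves are invented for illustration (they are not real training data), and the names `loss_fast_starter`, `loss_late_bloomer` and `pick_best` are hypothetical. The only point is that picking the winner at a small budget and picking it at the full budget can disagree.

```python
def loss_fast_starter(step: int) -> float:
    # Invented curve: drops quickly, then plateaus just above 2.0.
    return 2.0 + 2.0 / (1 + step)

def loss_late_bloomer(step: int) -> float:
    # Invented curve: flat at 3.0, then a sudden drop at step 80
    # (a stand-in for an "emergent" jump).
    return 3.0 if step < 80 else 1.0

def pick_best(candidates: dict, budget: int) -> str:
    """Return the name of the candidate with the lowest loss after `budget` steps."""
    return min(candidates, key=lambda name: candidates[name](budget))

candidates = {
    "fast_starter": loss_fast_starter,
    "late_bloomer": loss_late_bloomer,
}

early_pick = pick_best(candidates, budget=10)   # what early stopping would keep
full_pick = pick_best(candidates, budget=100)   # what full training would keep

print("kept at step 10: ", early_pick)
print("kept at step 100:", full_pick)
```

With these made-up curves the early cutoff keeps the fast starter and throws away the candidate that would have won at full budget. Repeat that over many generations and you are selecting for fast starters.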

All of this to say that recursive self-improvement is waaaaaay out of our grasp as things stand right now. We need another one or two breakthroughs to get there (IMHO).