Autoresearch is nothing new; big players are already in the game with more sophisticated solutions:
- https://arxiv.org/abs/2602.02660 (MARS)
- https://arxiv.org/abs/2601.14525 (Execution-grounded automated AI research)
- https://arxiv.org/abs/2601.10402 (ML-Master 2.0)
The most widely used benchmark for automated AI engineering/research is:
https://github.com/openai/mle-bench
The thing is, autoresearch feels more accessible than the listed solutions. I can use it trivially on virtually any problem that has verifiable rewards and a feedback loop.
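
To make "verifiable rewards and a feedback loop" concrete, here is a rough sketch of the general pattern in Python. This is not autoresearch's actual code. The names `feedback_loop`, `propose`, and `score` are made up for illustration, and the toy problem (maximizing a simple quadratic) stands in for a real task where `score` would run an experiment and return a measurable result.

```python
import random
from typing import Callable, Tuple, TypeVar

C = TypeVar("C")  # type of a candidate solution


def feedback_loop(
    initial: C,
    propose: Callable[[C, random.Random], C],
    score: Callable[[C], float],
    steps: int = 200,
    seed: int = 0,
) -> Tuple[C, float]:
    """Propose a change, score it, and keep it only if the score improves."""
    rng = random.Random(seed)
    best, best_score = initial, score(initial)
    for _ in range(steps):
        candidate = propose(best, rng)
        candidate_score = score(candidate)  # the verifiable reward
        if candidate_score > best_score:    # the feedback: keep only improvements
            best, best_score = candidate, candidate_score
    return best, best_score


# Hypothetical toy task: find x that maximizes -(x - 3)^2.
def toy_score(x: float) -> float:
    return -((x - 3.0) ** 2)


def toy_propose(x: float, rng: random.Random) -> float:
    # Small random perturbation of the current best candidate.
    return x + rng.uniform(-0.5, 0.5)


best_x, best_s = feedback_loop(0.0, toy_propose, toy_score)
print(best_x, best_s)
```

In a real setup, `propose` would be the agent editing code or a config and `score` would be whatever metric can be checked automatically. The loop itself stays this simple, which is roughly why the approach carries over to so many problems.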