Hacker News

Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

47 points | by hopechong | today at 4:55 PM | 18 comments

Comments

kraddypatties · today at 5:40 PM

I feel like most of this recent Autoresearch trend boils down to reinventing hyper-parameter tuning. Is Bayesian optimization still SOTA when you're given a small cluster? It was ~3 years ago when I was doing this kind of work; I haven't kept up since then.
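For what it's worth, when you have a small cluster the usual alternative to pure Bayesian optimization is a bandit-style scheduler like successive halving (the core of Hyperband/ASHA): evaluate many configs on a cheap budget, repeatedly cull the worst, and spend the big budgets only on survivors. A toy sketch — the objective and config space here are made up purely for illustration:

```python
import random

def toy_loss(cfg, budget):
    # Hypothetical stand-in for a training run: quadratic in the learning
    # rate, with an error term that shrinks as the budget grows.
    return (cfg["lr"] - 0.1) ** 2 + 1.0 / budget

def successive_halving(configs, loss, min_budget=1, eta=3):
    """Keep the best 1/eta of configs each round, multiplying the budget by eta."""
    budget = min_budget
    while len(configs) > 1:
        ranked = sorted(configs, key=lambda c: loss(c, budget))
        configs = ranked[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]

# 27 random configs -> 9 -> 3 -> 1 survivor, with budgets 1, 3, 9.
rng = random.Random(0)
candidates = [{"lr": rng.uniform(0.001, 1.0)} for _ in range(27)]
best = successive_halving(candidates, toy_loss)
```

The appeal on a cluster is that each round is embarrassingly parallel, which pure sequential Bayesian optimization is not.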

Also, shoutout SkyPilot! It's been a huge help for going multi-cloud with our training and inference jobs (getting GPUs is still a nightmare...)!

zhwu · today at 6:00 PM

The most surprising part: the agent had access to both H100s and H200s. Without being told, it noticed H200s scored better and started screening ideas on H100s, then promoting winners to H200s for validation. That strategy emerged entirely on its own.
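That screen-then-promote pattern is easy to state in code. A minimal sketch (all names here are hypothetical, not from the article): score every idea with a cheap evaluator, keep the top fraction, and only spend the expensive evaluator on the shortlist:

```python
def screen_then_promote(ideas, cheap_eval, expensive_eval, keep_frac=0.25):
    # Stage 1: cheap screening pass over every idea (the "H100" tier).
    shortlist = sorted(ideas, key=cheap_eval, reverse=True)
    shortlist = shortlist[: max(1, int(len(shortlist) * keep_frac))]
    # Stage 2: expensive validation only for the survivors (the "H200" tier).
    return max(shortlist, key=expensive_eval)

# Toy usage: ideas are integers; the cheap score is a rough proxy
# for the expensive one, which actually peaks at 18.
ideas = list(range(20))
winner = screen_then_promote(
    ideas,
    cheap_eval=lambda x: x,
    expensive_eval=lambda x: -(x - 18) ** 2,
)
```

The interesting bit in the article is that nobody wrote this loop for the agent; it inferred the cost/quality asymmetry between the two GPU tiers and built the funnel itself.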

fabmilo · today at 7:01 PM

I am fascinated by this example of using AI to improve AI. I won a small prize using this technique on Helion kernels at a PyTorch hackathon in SF.

The next steps are:

- give the agent the whole deep learning research literature and do tree search over the various ideas that have been proposed in the past;
- have some distributed notepad that any of these agents can read and improve upon.
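The first of those steps is essentially best-first search over an idea tree: keep a priority queue of candidates, always expand the most promising one, and remember the best node seen. A minimal sketch with made-up `expand`/`score` functions standing in for "propose variants" and "evaluate idea":

```python
import heapq

def best_first_search(root, expand, score, max_expansions=50):
    # Frontier ordered by score (negated, since heapq is a min-heap);
    # the counter breaks ties so nodes never need to be comparable.
    frontier = [(-score(root), 0, root)]
    counter = 1
    best = root
    expansions = 0
    while frontier and expansions < max_expansions:
        _, _, node = heapq.heappop(frontier)
        expansions += 1
        if score(node) > score(best):
            best = node
        for child in expand(node):
            heapq.heappush(frontier, (-score(child), counter, child))
            counter += 1
    return best

# Toy usage: "ideas" are integers, each idea has one successor,
# and the score peaks at 7.
found = best_first_search(
    0,
    expand=lambda n: [n + 1] if n < 10 else [],
    score=lambda n: -abs(n - 7),
)
```

A shared notepad would then just be the artifact each agent reads before choosing which node to expand next.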

ipsum2 · today at 6:25 PM

A cluster is 2 nodes? That's technically true, but not very exciting.

covi · today at 6:00 PM

This feels like the chimpanzee with a power drill. An agent like this is honestly just brute-force search, but guided.
