Hacker News

kraddypatties · today at 5:40 PM

I feel like most of this recent Autoresearch trend boils down to reinventing hyper-parameter tuning. Is the SOTA still Bayesian optimization when given a small cluster? That was the case ~3 years ago when I was doing this kind of work; I haven't kept up since then.
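For anyone unfamiliar, the core loop of Bayesian optimization for hyperparameter tuning is: fit a cheap surrogate to past (hyperparameter, score) pairs, then pick the next trial by trading off predicted score against uncertainty. Here's a toy sketch of that idea in pure Python — the RBF-weighted mean and distance-based "uncertainty" are crude stand-ins for a real Gaussian process, and the objective/parameter names are made up for illustration:

```python
import math

def objective(x):
    # Stand-in for "train with hyperparameter x, return validation score".
    # True optimum is at x = 0.3.
    return -(x - 0.3) ** 2

def toy_bayes_opt(n_iters=15, kappa=2.0, lengthscale=0.2):
    xs = [0.0, 0.5, 1.0]                  # initial design points
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]  # candidate hyperparameter values

    for _ in range(n_iters):
        def ucb(c):
            # Surrogate mean: RBF-weighted average of observed scores.
            w = [math.exp(-((c - x) / lengthscale) ** 2) for x in xs]
            mean = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
            # "Uncertainty": distance to nearest observation
            # (a real GP would give a proper posterior variance).
            unc = min(abs(c - x) for x in xs)
            return mean + kappa * unc     # upper confidence bound

        nxt = max(grid, key=ucb)          # most promising next trial
        xs.append(nxt)
        ys.append(objective(nxt))

    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]
```

Running `toy_bayes_opt()` homes in near x = 0.3 in a handful of evaluations, versus the grid's 101. In practice you'd reach for a real library (e.g. a GP- or TPE-based tuner) rather than hand-rolling this.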

Also, shoutout SkyPilot! It's been a huge help for going multi-cloud with our training and inference jobs (getting GPUs is still a nightmare...)!


Replies

karpathy · today at 6:51 PM

Wrong and short-sighted take, given that the LLM explores serially, learning along the way, and can use tools and change code arbitrarily. It currently seems to default to something resembling hyperparameter tuning in the absence of more specific instructions. I briefly considered calling the project "autotune" at first, but I think "autoresearch" will prove to be the significantly more appropriate name.

ipsum2 · today at 6:30 PM

Hyperparam tuning, but with better intuition and the ability to incorporate architecture changes automatically. It won't invent something completely new, though.
