Hacker News

sdenton4, yesterday at 7:23 PM

For raw hyperparameter search, though, I would expect a proper Bayesian optimization framework to be much better, e.g. Vizier.


Replies

ainch, yesterday at 8:09 PM

I think it depends on whether you can leverage some prior knowledge. It's possible for a person/LLM to look at a loss curve and say "oh, that's undertraining, let's bump the lr" - whereas a Bayesian method doesn't necessarily have that deeper understanding, so it'll waste a lot of time exploring poor options in the search space.

If you're not resource-constrained, though, BO should of course do very well.
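
To make the "look at the loss curve and bump the lr" idea concrete, here is a minimal sketch of that kind of rule of thumb. It is only an illustration: the function name `suggest_lr`, the thresholds, and the three labels are all assumptions made up for this example, not something from the thread or from any library.

```python
def suggest_lr(losses, lr, window=5, min_rel_drop=0.02, bump=2.0, cut=0.5):
    """Return (diagnosis, new_lr) from the tail of a loss curve.

    Hypothetical heuristic, with made-up thresholds:
    - loss rose over the last `window` steps        -> "unstable", cut lr
    - loss fell at every step, by >= `min_rel_drop` -> "undertrained", bump lr
    - anything else                                 -> "plateau", keep lr
    """
    if len(losses) < window + 1:
        return "too_short", lr

    tail = losses[-(window + 1):]
    # Relative improvement across the tail; guard against a zero start value.
    rel_drop = (tail[0] - tail[-1]) / max(abs(tail[0]), 1e-12)

    if rel_drop < 0:
        return "unstable", lr * cut

    # "Undertrained" here means: still strictly improving at the end of the run.
    still_falling = all(b < a for a, b in zip(tail, tail[1:]))
    if still_falling and rel_drop >= min_rel_drop:
        return "undertrained", lr * bump

    return "plateau", lr


if __name__ == "__main__":
    curve = [2.0, 1.8, 1.65, 1.52, 1.41, 1.31, 1.22]
    print(suggest_lr(curve, lr=1e-3))
```

A rule like this encodes domain knowledge in one step, which is the advantage being described. A Bayesian optimizer that only sees the final score of each trial would have to discover the same thing by spending trials on it.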
