
cs702, yesterday at 11:40 PM

A superior alternative to standard Muon and AdamW optimizers for training large models.

Fantastic work, instantly valuable, immediately usable.

A big THANK YOU to the authors:

Jack Zhang, Noah Amsel, Berlin Chen, and Tri Dao