Hacker News

cubefox, today at 8:38 AM

They don't say anything about dropping training speed.


Replies

estearum, today at 11:12 AM

> a departure from Mamba-2, which optimized for training speed.

?
