estearum • today at 11:12 AM

> a departure from Mamba-2, which optimized for training speed.

Yes? Mamba-2 optimized for training speed compared to Mamba-1. Mamba-3 adds optimization for inference. These are pretty much version numbers.
