> a departure from Mamba-2, which optimized for training speed.
?
Yes? Mamba-2 optimized for training speed compared to Mamba-1. Mamba-3 adds optimization for inference. These are pretty much version numbers.
Yes? Mamba-2 optimized for training speed compared to Mamba-1. Mamba-3 adds optimization for inference. These are pretty much version numbers.