You're right: Rubin is better at NVFP4 training, not inference. Thank you for catching that!
What does it mean that it's better at NVFP4 training? What is different between training and inference that makes this true?