MI355X can perform FP6 operations with the same speed as their FP4 (unique to AMD) - people should be making MXFP6 quants which would be pretty much lossless, and much closer to FP4 performance than FP8