bcatanzaro, last Tuesday at 3:06 PM

The Nano model isn’t pretrained in FP4; only Super and Ultra are. Post-training isn’t done in FP4 either, so the post-trained weights of these models are not native FP4.