Hacker News

minimaxir | last Thursday at 8:58 PM | 2 replies

> Note: we are not releasing any post-trained / IT checkpoints.

I get not wanting to cannibalize Gemma, but that's weird. A 540M multimodal model that performs well on queries would be useful, and "just post-train it yourself" is not always an option.


Replies

jeffjeffbear | last Thursday at 9:11 PM

Isn't finetuning the point of T5-style models, since they perform better at smaller parameter counts?
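
For anyone wondering what "just post-train it yourself" roughly involves, here is a minimal sketch of supervised fine-tuning for a T5-style encoder-decoder checkpoint with Hugging Face transformers. The checkpoint name, hyperparameters, and the two toy examples are placeholders, not anything from this release:

    # Rough sketch of post-training a T5-style encoder-decoder checkpoint with
    # Hugging Face transformers. Checkpoint name and the toy data are placeholders.
    from datasets import Dataset
    from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                              DataCollatorForSeq2Seq, Seq2SeqTrainer,
                              Seq2SeqTrainingArguments)

    checkpoint = "t5-small"  # placeholder; swap in the released pre-trained checkpoint
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    # Tiny instruction-style pairs standing in for a real post-training mixture.
    pairs = [
        {"prompt": "Summarize: The cat sat on the mat all afternoon.",
         "target": "A cat rested on a mat."},
        {"prompt": "Translate to French: Good morning.",
         "target": "Bonjour."},
    ]

    def preprocess(example):
        model_inputs = tokenizer(example["prompt"], truncation=True, max_length=256)
        labels = tokenizer(text_target=example["target"], truncation=True, max_length=64)
        model_inputs["labels"] = labels["input_ids"]
        return model_inputs

    train_set = Dataset.from_list(pairs).map(preprocess, remove_columns=["prompt", "target"])

    trainer = Seq2SeqTrainer(
        model=model,
        args=Seq2SeqTrainingArguments(output_dir="ft-out", per_device_train_batch_size=2,
                                      num_train_epochs=1, learning_rate=3e-4,
                                      logging_steps=1, report_to="none"),
        train_dataset=train_set,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    model.save_pretrained("ft-out")

That's the mechanics; the hard part is curating a post-training mixture and compute, which is exactly why a released IT checkpoint would still be useful.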

sundarurfriend | yesterday at 3:37 AM

This made me compare the figures: did they accidentally switch those around, or are the Post-training Reasoning and Factuality scores actually significantly lower than the Pre-training ones?

Edit: Just noticed

> Also note pre-training and post-training benchmarks are different, so scores are not comparable across plots.

The paper gives more details about the specific benchmarks and the scores obtained in them: https://arxiv.org/html/2512.14856v1#S4