htrp, today at 12:32 PM

I think it's part of the expectation setting (with a side of "we did our distillation/eval harness on a specific model").

If they say it's 4.7-comparable, it anchors that in your head as the model to evaluate against.