NiloCK, today at 1:10 PM

I think the headline oversells this a little?

The reported variance in Sonnet 4.6's estimates is actually quite low, and in general terms it's not so bad across models. Damn paella.

This does seem like a task well suited to a for-purpose training run against a bunch of labelled data. Is there any reason they wouldn't improve at it?