NitpickLawyer • today at 4:05 PM

> I wonder why scores on TriviaQA vis-a-vis 14b model lags behind Gemma 12b so much; that one is not a formatting-heavy benchmark.

My guess is the sheer scale of Google's data. They've been hoovering up data for decades now, and have had curation pipelines (guided by real human interactions) since forever.
