gertlabs • today at 6:52 AM

The deeper you go into the filters (single models, cross-correlated by specific languages), the smaller your sample sizes get. It's a known limitation. Tbh, I doubt Mistral is better than GPT 5.5 at programming in any specific language; GPT 5.5 probably just hit a few lower-quality generations by chance. But I could be wrong! We're always adding more samples, so the data improves over time, and we always prioritize the largest sample counts for near-frontier models first.
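To make the sample-size point concrete, here's a rough sketch of how much uncertainty sits behind the same observed win rate at two different sample counts. It uses a Wilson score interval, which is just one reasonable choice and not necessarily how the site computes anything. The function name `wilson_interval` and the counts (6 of 10 vs. 600 of 1000) are made up for illustration, not real leaderboard data.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for an observed win rate wins/n."""
    if n == 0:
        # No data at all: the true rate could be anywhere in [0, 1].
        return (0.0, 1.0)
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


# Hypothetical counts: same 60% observed win rate, very different sample sizes.
small = wilson_interval(6, 10)       # heavily filtered slice, few samples
large = wilson_interval(600, 1000)   # unfiltered view, many samples

print(f"n=10:   {small[0]:.2f} to {small[1]:.2f}")
print(f"n=1000: {large[0]:.2f} to {large[1]:.2f}")
```

With only 10 samples the interval comfortably straddles 50%, so a "win" in that slice could easily be noise; with 1000 samples at the same rate it sits clearly above 50%.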
