Hacker News

phillc73, today at 6:15 AM (1 reply)

Thanks for that, I hadn't scrolled down far enough.

Just want to be sure I'm reading the results correctly... When I compare GPT-5.5 with Mistral Medium 3.5, I see in the tables:

a) Mistral beats GPT in Java and C++

b) It's close for Rust

c) GPT-5.5 easily wins for Go, JavaScript, Python and TypeScript

Model choice really does appear to be language dependent (assuming I'm reading the results correctly).


Replies

gertlabs, today at 6:52 AM

The deeper you go into the filters (single models, cross-correlated by specific languages), the smaller your sample sizes get. That's a known limitation. Tbh, I doubt Mistral is better than GPT-5.5 at programming in any specific language; GPT-5.5 probably just hit a few lower-quality generations by chance. But I could be wrong! We're always adding more samples, so the data improves over time, and we prioritize the largest sample counts for near-frontier models first.
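
To make the sample-size point concrete, here is a rough sketch of how much uncertainty sits around a head-to-head win rate at different sample counts. It uses a standard Wilson score interval, which is not necessarily how the benchmark computes anything. The function name `wilson_interval` and the counts below are hypothetical, picked only for illustration, and are not taken from the actual tables.

```python
import math


def wilson_interval(wins: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a win rate (z = 1.96)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = wins / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 60% observed win rate at two sample sizes.
small_lo, small_hi = wilson_interval(6, 10)       # e.g. one model pair, one language
large_lo, large_hi = wilson_interval(600, 1000)   # e.g. aggregated over all languages

print(f"n=10:   {small_lo:.2f} to {small_hi:.2f}")
print(f"n=1000: {large_lo:.2f} to {large_hi:.2f}")
```

With only 10 comparisons, the interval around a 60% win rate still includes 50%, so the "win" is compatible with the two models being even. At 1000 comparisons, the same 60% rate sits clearly above 50%.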