https://... | alt Hacker News

dr_dshiv • yesterday at 1:36 PM • 1 reply

LM Arena shows Claude Opus 4.5 on top

Replies

HarHarVeryFunny • yesterday at 1:57 PM

I wonder how model competence and/or user preference on web development (that leaderboard) carries over to larger, more complex projects, or more generally to anything other than web development?

In addition to whatever they are exposed to as part of pre-training, it'd be interesting to know what kinds of coding tasks these models are being RL-trained for. Are things like web development and maybe Python/ML coding overemphasized, or are they also being trained on things like Linux/Windows/embedded development, etc., in different languages?