Hacker News

slashdave, today at 4:59 AM, 2 replies

I was surprised by the ranking, until I read what the test was. Not horribly relevant for coding.

The current ranking across all tests makes more sense (well, except for how well Gemini does).

https://aicc.rayonnant.ai


Replies

SeriousM, today at 7:36 AM

The ranking by gold medals only makes sense if all models had participated in all tests.

DNP = Did not participate

In this regard, Kimi got more and better medals than Claude.

mpeg, today at 7:06 AM

If you look at the ranking breakdown, though, Kimi K2.6 has only participated in the last 5 challenges (Claude dominated before then), and if you only count those, it would be in first place.