See here https://cursor.com/blog/composer-2-5
85% of the compute for the final model comes from them, not from the base Kimi model.
That just means it cost a lot.
Does it perform meaningfully better than the Kimi model given all that extra compute? And is the improvement proportional to the amount spent?