frde_me • today at 5:16 PM

85% of the compute for the final model is from them, and not the base Kimi model.

That just means it cost a lot.

Does it perform meaningfully better than the base Kimi model given all that extra compute? And is the improvement proportional to the amount spent?
