That's only because Kimi 2.5 was trained using data stolen from Claude. It wouldn't exist without riding Claude's coattails. None of the so-called 'open source' models would.
Boo hoo. Claude was trained using data stolen from the collective works of all of humanity. If someone does it faster and cheaper by skimming the cream off the top of Claude, then surely that’s just a market efficiency in the thieves' business?
That's not true: some open-weight models, e.g. Llama, didn't distill Claude or other then-frontier models, yet achieved comparable performance (at the time, in Llama's case).
If distillation weren't a thing, they would certainly still exist; they would have been trained from scratch or from a decent base model to remain economically viable.
What's for sure is that Claude wouldn't exist if it weren't for data stolen from millions of creators, which they admittedly found themselves guilty of.