> Other companies were allegedly distilling the models by training on the reasoning output
In the case of makers of open-source models (which are also competition), there is no "allegedly": they were (and still are) openly doing that.
In the case of the closed models too... Claude would happily tell you it was deepseek-v3 if you asked in Chinese, until it caught public attention and they papered over it.