"Distillation" from APIs is not a thing, it cannot replicate a model's deep reasoning and behavior.
I'm uneducated on how distillation works at more than a basic level so forgive me if this is a stupid question.
Isn't "distillation" of another provider's model exactly how these models got training date in the first place: Massive amounts of the written word + Prompt -> Answer. Why wouldn't distillation produce similar "reasoning" in the new model? It's just inputs and outputs.
This is totally inaccurate, the APIs provide the reasoning logs. You ABSOLUTELY can distill from APIs, in fact, that's the primary way distillation is done currently.
I struggle with the practicality of the whole thing.
The amount of tokens required to properly distill a frontier model is so large that by the time you could consume the # of tokens you would either be banned for extremely obvious abuse or a new model would be released, rendering your efforts less and less valuable over time. Intelligence is not a linear thing. Being behind just a little bit can have exponential consequences.