I’m not familiar with the technical side, but I remember someone saying that models like MiniMax skip the cost of training by using distillation to basically “steal” the models from OpenAI or Anthropic, and that these companies now have various defenses against this. What happens when MiniMax has to do the full work themselves?
Why would they have to do it themselves?