Everything is still based on 4o, right? Is training a new model just too expensive? Maybe they could consult the DeepSeek team about cost-constrained new models.
Apparently they have not had a successful pretraining run in 1.5 years.
I thought that whenever the knowledge cutoff increased, it meant they'd trained a new model. I guess that's completely wrong?
The irony is that DeepSeek is still running with a distilled 4o model.
Where did you get that from? The cutoff date says August 2025, which looks like a newly pretrained model.