
sfmike yesterday at 6:21 PM (4 replies)

Everything is still based on 4/4o, right? Is training a new model just too expensive? Maybe they can consult the DeepSeek team about cost-constrained new models.


Replies

elgatolopez yesterday at 6:34 PM

Where did you get that from? The cutoff date says August 2025. Looks like a newly pretrained model.

verdverm yesterday at 6:24 PM

Apparently they have not had a successful pre-training run in 1.5 years.

Wowfunhappy yesterday at 6:28 PM

I thought that whenever the knowledge cutoff increased, it meant they’d trained a new model. I guess that’s completely wrong?

catigula yesterday at 6:35 PM

The irony is that DeepSeek is still running with a distilled 4o model.
