Wowfunhappy • yesterday at 6:28 PM • 2 replies

I thought that whenever the knowledge cutoff increased, it meant they’d trained a new model. I guess that’s completely wrong?

Replies

rockinghigh • yesterday at 7:46 PM

They add new data to the existing base model via continued pre-training. You save on pre-training (the next-token prediction task), but you still have to re-run the mid- and post-training stages: context length extension, supervised fine-tuning, reinforcement learning, safety alignment ...
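
To make the idea concrete, here is a toy sketch of starting from an existing checkpoint and training only on new data. Everything in it is an assumption for illustration: a count-based bigram model stands in for a real next-token predictor, and `pretrain`, `predict_next`, and the two tiny corpora are hypothetical names and data, not anything a lab is known to use.

```python
from collections import Counter, defaultdict


def pretrain(corpus, checkpoint=None):
    """Toy next-token 'training': count bigrams, optionally starting from a checkpoint."""
    model = defaultdict(Counter)
    if checkpoint is not None:
        # Copy the existing counts so the original checkpoint is left untouched.
        for prev, nxt in checkpoint.items():
            model[prev].update(nxt)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model


def predict_next(model, token):
    """Return the most frequent next token, or None if the token was never seen."""
    if token not in model or not model[token]:
        return None
    return model[token].most_common(1)[0][0]


# Hypothetical data: an "old" corpus and newer text past the old cutoff.
old_corpus = ["the cutoff is 2023", "the model is large"]
new_corpus = ["the release was 2024", "the release was public"]

base = pretrain(old_corpus)                        # original pre-training run
continued = pretrain(new_corpus, checkpoint=base)  # sees only the new data

print(predict_next(base, "release"))       # None: the base model never saw it
print(predict_next(continued, "release"))  # was
print(predict_next(continued, "cutoff"))   # is: old knowledge is retained
```

The second call never touches `old_corpus`, which is the saving being described. A real model would still need the later stages re-run on top of the updated base, and gradient-based training can forget old data in ways this counting toy cannot.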


brokencode • yesterday at 6:37 PM

Typically, I think, but you could also continue pre-training your previous model on new data.

I don’t think it’s publicly known for sure how different the models really are. You can improve a lot just by improving the post-training set.
