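Continual pretraining (continuing to pretrain an existing model on new data) has a known problem: as the model fits the new data, it tends to lose what it learned from the older data, an effect usually called catastrophic forgetting. There is ongoing research into other approaches.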
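To make the forgetting effect concrete, here is a minimal toy sketch, not a real pretraining setup. It assumes a one-parameter linear model trained with plain gradient descent. The two synthetic datasets (`old_data`, `new_data`) and the helpers `mse` and `train` are hypothetical names made up for illustration.

```python
# Toy illustration of forgetting under sequential training.
# Assumption: a one-parameter model y = w * x stands in for a real network.

def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w, data, lr=0.05, steps=200):
    """Full-batch gradient descent on the MSE, starting from weight w."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


xs = [0.5, 1.0, 1.5, 2.0]
old_data = [(x, 2.0 * x) for x in xs]   # hypothetical "older" data: y = 2x
new_data = [(x, -1.0 * x) for x in xs]  # hypothetical "newer" data: y = -x

# Phase 1: fit the old data.
w = train(0.0, old_data)
loss_old_before = mse(w, old_data)

# Phase 2: keep training, but on the new data only.
w = train(w, new_data)
loss_old_after = mse(w, old_data)

print(f"loss on old data after phase 1: {loss_old_before:.4f}")
print(f"loss on old data after phase 2: {loss_old_after:.4f}")
```

After the second phase the weight has moved to fit the new data, so the error on the old data goes from near zero to a large value. The two tasks here conflict directly, which exaggerates the effect. A real model has far more capacity, so it usually forgets more gradually.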