I'm trying to work toward that goal by training a model on mostly German science texts up to 1904 (before the world wars, German was the lingua franca of most sciences).
Training data for a base model isn't that hard to come by, even though you have to OCR most of it yourself because the publicly available OCRed versions are often unusably bad. But training a model large enough to be useful is the major issue. Training a 700M-parameter model at home is very doable (and is what this TimeCapsuleLLM is), but to get that kind of reasoning you need something closer to a 70B model. Also, a lot of a model's "smarts" is injected during fine-tuning and RL, and any of the available fine-tuning datasets would obviously contaminate the model with 2026 knowledge.
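On the contamination point, the practical consequence is that everything fed into pretraining or fine-tuning has to be filtered by publication date. A minimal sketch of such a cutoff filter, assuming the corpus is stored as JSON Lines with a recorded publication year (the field names here are made up for illustration, not part of any actual pipeline):

```python
# Sketch: keep only documents published at or before the cutoff year,
# so nothing post-cutoff leaks into pretraining or fine-tuning data.
# The "year" and "text" fields are hypothetical metadata keys.
import json

CUTOFF_YEAR = 1904

def iter_pre_cutoff(path):
    """Yield texts whose recorded publication year is <= the cutoff."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)  # one JSON object per line
            year = record.get("year")
            if year is not None and int(year) <= CUTOFF_YEAR:
                yield record["text"]

if __name__ == "__main__":
    kept = sum(1 for _ in iter_pre_cutoff("corpus.jsonl"))
    print(f"documents at or before {CUTOFF_YEAR}: {kept}")
```

The harder part isn't the filter itself but the metadata: scanned volumes often lack reliable publication dates, so the date field has to be established during OCR and cataloguing.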
Can we follow along with your work / results somewhere?
I am a historian and am putting together a grant application for a somewhat similar project (different era and language though). Would you be open to discussing a collaboration? My email is bebreen [at] ucsc [dot] edu.