I have no real quibble with the blog post itself, but I take issue with the title that calls it a "vintage model".
The blog post defines a "vintage model" as one that is trained only on data before a particular cutoff point:
> Vintage LMs are contamination-free by construction, enabling unique generalization experiments [...] The most important objective when training vintage language models is that no data leaks into the training corpus from after the intended knowledge cutoff
But as they acknowledge later, there are multiple major data leakage issues in their training pipeline, and their model does in fact have quite a bit of anachronistic knowledge. So it fails at what they call the most important objective. It's fair to say that they are working toward something that meets their definition of "vintage", but they're not there yet.
Yeah, the blog distinguishes "contamination," which it describes as polluting the training data with answers to benchmark questions, from "temporal leakage," which is polluting the training data with writing from after the target date, but those seem to be nearly the same problem.
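Concretely, "vintage by construction" amounts to a hard date filter over the corpus. A minimal sketch (the record format, field names, and cutoff date here are all made up for illustration; the hard part in practice is that most web text carries no reliable publication date, so a filter like this can only be as trustworthy as its metadata):

```python
from datetime import date

# Hypothetical cutoff: keep only pre-2000 writing.
CUTOFF = date(1999, 12, 31)

def is_vintage(doc):
    """Keep a document only if it was verifiably published on or before the cutoff."""
    pub = doc.get("published")
    # Conservative choice: drop anything without a verifiable date,
    # since undated text is exactly where temporal leakage sneaks in.
    return pub is not None and pub <= CUTOFF

corpus = [
    {"text": "pre-cutoff article", "published": date(1998, 5, 1)},
    {"text": "post-cutoff article", "published": date(2003, 7, 4)},
    {"text": "undated scrape", "published": None},
]
vintage = [d for d in corpus if is_vintage(d)]
```

Note that this same filter also prevents benchmark contamination for any benchmark written after the cutoff, which is why the two problems collapse into one: leakage of post-cutoff text is the superset.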