It doesn’t need to know about QM or relativity, just about the building blocks that led to them, which were very much around by 1900.
In fact, you don’t want it to know about them explicitly. It just needs enough background knowledge that you can manage the rest via context.
LLMs are models that predict tokens. They don't think, they don't build with blocks. They would never be able to synthesize knowledge about QM.
I was vague. My point is that I don't think the building blocks are in the data. It's mainly tertiary and popular sources. Maybe if you had the writings of Victorian scientists, both public work and private correspondence.