This is not a correct approximation of what happens inside an LLM. They form probabilistic logical circuits that approximate the world they have learned through training. They are not simply recalling stored facts; they are exploiting organically produced circuitry, walking a manifold, which lets them predict the next state in a staggering variety of contexts.
As an example, the mechanistic interpretability work on grokking modular addition: https://arxiv.org/abs/2301.05217
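If I'm remembering that paper right, the reverse-engineered network isn't memorizing sums at all: it embeds the operands as sines and cosines of a few key frequencies and combines them using the identity cos(w(a+b)) = cos(wa)cos(wb) - sin(wa)sin(wb) to read off the answer, which is exactly the kind of learned circuit I mean, not a lookup table.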
It's not hard to imagine that a sufficiently developed manifold could theoretically allow LLMs to interpolate or even extrapolate information that was missing from the training data, but is logically or experimentally valid.
So do you agree that an LLM cannot derive math from first principles, or no? If an LLM had only ever seen 1+1=2, and that was the only math it was ever exposed to, along with the numbers 0-10, could it figure out that 2+2=4?
I argue absolutely not. That would be a fascinating experiment.
Hell, train it on every two-number addition m+n where m and n can be any number between 1 and 100 (or 0 and 100 would be better) except 2, and have it figure out what 2+2 is.
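That experiment is cheap enough to sketch. Below is a rough toy version in PyTorch (my own throwaway setup, nothing to do with the linked paper): hold the operand 2 out of every training pair over 0-100, train a small model on the rest, then probe it with 2+2. The MLP, hyperparameters, and step count are arbitrary illustration choices; a small transformer would be closer to what the grokking paper actually studies.

    # Sketch of the held-out-operand experiment described above.
    # Train on every a+b with a, b in 0..100 EXCEPT pairs involving 2,
    # then test whether the model can answer 2+2. Model and hyperparameters
    # are arbitrary toy choices, not anything from the cited paper.
    import torch
    import torch.nn as nn

    MAX_N = 100    # operands range over 0..MAX_N
    HELD_OUT = 2   # operand the model never sees during training

    # Build the dataset: inputs are (a, b) pairs, targets are a+b.
    train_pairs = [(a, b) for a in range(MAX_N + 1) for b in range(MAX_N + 1)
                   if a != HELD_OUT and b != HELD_OUT]
    test_pairs = [(HELD_OUT, HELD_OUT)]  # the probe: 2 + 2

    def to_tensors(pairs):
        x = torch.tensor(pairs, dtype=torch.long)      # shape (N, 2)
        y = torch.tensor([a + b for a, b in pairs])    # shape (N,)
        return x, y

    x_train, y_train = to_tensors(train_pairs)
    x_test, _ = to_tensors(test_pairs)

    # Toy model: embed each operand, concatenate, predict the sum as a
    # class in 0..2*MAX_N.
    class AdderMLP(nn.Module):
        def __init__(self, vocab=MAX_N + 1, dim=64, n_out=2 * MAX_N + 1):
            super().__init__()
            self.embed = nn.Embedding(vocab, dim)
            self.mlp = nn.Sequential(
                nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, n_out))

        def forward(self, x):
            e = self.embed(x)               # (N, 2, dim)
            return self.mlp(e.flatten(1))   # (N, n_out)

    model = AdderMLP()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # Full-batch training; ~10k examples, so this is fine for a toy run.
    for step in range(2000):
        opt.zero_grad()
        loss = loss_fn(model(x_train), y_train)
        loss.backward()
        opt.step()

    # Does the model generalize to the operand it has never seen?
    with torch.no_grad():
        pred = model(x_test).argmax(dim=-1).item()
    print(f"model says {HELD_OUT} + {HELD_OUT} = {pred} "
          f"(true answer: {2 * HELD_OUT})")

One caveat with this exact setup: the embedding for the token 2 never receives a gradient, so whatever the probe returns depends on how the untrained embedding happens to land relative to the trained ones. That's arguably the point of the experiment, but it's worth keeping in mind when reading the result.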
If it could, I would probably change my opinion about “circuits”, which, by the way, really stretches the idea of a circuit. The “circuit” is just the statistically most likely series of tokens that you're drawing pretend lines between. Sure, technically connect-the-dots is a circuit, but not in the way you're implying, or in the way that paper implies.
You could find a preprint on arXiv to validate practically any belief. Why should we care about this particular piece of research? Is this established science, or are you cherry-picking low-quality papers?