Two answers: 1 - ChatGPT isn't an LLM, it's an application built on one or more LLMs plus other tools (so it can likely route a question like this to something like a string-split function rather than relying on the model alone).
2 - even for a single model 'call':
It can be explained with the following training samples:
"tree is spelled t r e e" and "tree has 2 e's in it"
The problem is, the LLM has seen something like:
8062, 382, 136824, 260, 428, 319, 319
and
19816, 853, 220, 17, 319, 885, 306, 480
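If you want to see this concretely, here's a minimal sketch using OpenAI's tiktoken library. The exact IDs depend on which encoding you load (I'm picking o200k_base as an assumption), so they may not line up with the numbers quoted above.

```python
# Sketch: encode the two example sentences and show how the text is chopped up.
# Note: different encodings produce different IDs; this is only illustrative.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # assumption: any BPE encoding works for the demo

for text in ["tree is spelled t r e e", "tree has 2 e's in it"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]   # decode each token individually
    print(ids)
    print(pieces)
```

The point to notice is that "tree" comes out as a single opaque token, while the spelled-out " e" at the end is a token that repeats.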
For a lot of words, it will have seen data that results in it saying something sensible. But it's fragile. If LLMs used character-level tokenization, you'd see the first example repeat the token for e inside "tree" itself, rather than "tree" getting its own token.
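For contrast, here's a toy character-level tokenizer (just illustrative Python, not any real model's tokenizer) where the repetition is visible in the IDs:

```python
# Toy character-level tokenizer: every character gets its own ID, so repeated
# letters show up as repeated IDs the model could in principle count.
text = "tree is spelled t r e e"
vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}

word_ids = [vocab[ch] for ch in "tree"]
print(word_ids)                    # the ID for 'e' appears twice
print(word_ids.count(vocab["e"]))  # -> 2, counting e's is now a simple tally
```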
There are all manner of tradeoffs made in a tokenization scheme. One example is that OpenAI changed how runs of spaces are tokenized so the model would handle Python code better.
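As a rough illustration, the sketch below compares two encodings tiktoken ships: r50k_base (the GPT-2-era encoding) and p50k_base (the Codex-era encoding that added dedicated tokens for runs of whitespace). The snippet is arbitrary; the token counts you see will vary with it.

```python
# Compare how two published encodings handle indentation in a Python snippet.
# p50k_base added tokens for whitespace runs, so indented code usually encodes
# to noticeably fewer tokens than it does under r50k_base.
import tiktoken

snippet = "def f(x):\n        if x > 0:\n            return x + 1\n"
for name in ["r50k_base", "p50k_base"]:
    enc = tiktoken.get_encoding(name)
    ids = enc.encode(snippet)
    print(f"{name}: {len(ids)} tokens")
    print([enc.decode([i]) for i in ids])
```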