Until I see evidence that an LLM trained at e.g. the character level _CAN_ successfully "count Rs", I don't trust this explanation over any other hypothesis. I am not familiar with the literature, so I don't know if this has been done, but I couldn't find anything with a quick search. Surely if someone had successfully done it they would have published it.
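To make the tokenization hypothesis concrete, here's a toy sketch (the subword vocabulary below is made up for illustration, not any real tokenizer's merges): with character-level tokens, counting a letter is a trivial aggregation over the sequence, whereas subword tokens collapse spans of letters into opaque IDs whose spelling the model would have to memorize.

```python
# Toy illustration: character-level tokens expose letter identities
# directly; subword tokens hide them inside opaque token IDs.

def char_tokenize(word):
    return list(word)  # one token per character

# Made-up subword vocabulary, for illustration only
SUBWORD_VOCAB = {"straw": 101, "berry": 102}

def subword_tokenize(word):
    # Greedy longest-prefix match over the toy vocabulary,
    # falling back to single characters for unknown spans.
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in SUBWORD_VOCAB:
                tokens.append(SUBWORD_VOCAB[word[i:j]])
                i = j
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens

# Over character tokens, counting "r" is a simple sum:
print(sum(t == "r" for t in char_tokenize("strawberry")))  # 3

# Over subword tokens the model only sees [101, 102]; the letter
# count is not recoverable without knowing each token's spelling.
print(subword_tokenize("strawberry"))  # [101, 102]
```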
There are various papers about this, perhaps most prominently the Byte Latent Transformer.
The math tokenization research is probably closest.
GPT-2 tokenization was a demonstrable problem: https://www.beren.io/2023-02-04-Integer-tokenization-is-insa... (Prior HN discussion: https://news.ycombinator.com/item?id=39728870 )
More recent research:
https://huggingface.co/spaces/huggingface/number-tokenizatio...
Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs: https://arxiv.org/abs/2402.14903
https://www.beren.io/2024-07-07-Right-to-Left-Integer-Tokeni...
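The right-to-left point from those last two links can be sketched like this (my own illustration of the grouping difference, not code from the linked posts): grouping digits into chunks from the right aligns token boundaries with place value, so the same chunk always means the same magnitude, while left-to-right grouping does not.

```python
def tokenize_digits(n, direction="ltr", group=3):
    """Split an integer's digits into chunks of `group` digits,
    grouping either left-to-right or right-to-left."""
    s = str(n)
    if direction == "rtl":
        # Group from the right: chunk boundaries align with
        # place value (ones, thousands, millions, ...).
        chunks = []
        while s:
            chunks.append(s[-group:])
            s = s[:-group]
        return list(reversed(chunks))
    # Group from the left: the final chunk has variable length,
    # so a chunk's meaning depends on the total digit count.
    return [s[i:i + group] for i in range(0, len(s), group)]

print(tokenize_digits(1234567, "ltr"))  # ['123', '456', '7']
print(tokenize_digits(1234567, "rtl"))  # ['1', '234', '567']
```

With right-to-left grouping, '567' is always "hundreds and below", which is the alignment the arithmetic-tokenization papers argue helps frontier LLMs add and multiply.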