Comparing the tokenizer to sensory processing is a great analogy. That's exactly what your visual cortex and the initial layers of your language centers are doing: decoding the visual representation of text into an internal neural representation.
It's a learned mapping from one representation to another, not some semantic lookup against an exogenous source.
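To make the "learned mapping" point concrete, here is a rough sketch of a toy BPE-style tokenizer in Python. It is only an illustration under simplifying assumptions (whitespace-split words, character-level start, no end-of-word marker), and the names `learn_merges`, `merge_pair`, and `encode` are made up for this example, not taken from any real tokenizer library. The point is that the merge table comes entirely from pair frequencies in the training text, with no dictionary or other outside source consulted.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_merges(corpus, num_merges):
    """Learn up to `num_merges` merges purely from pair counts in `corpus`."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent adjacent pair; ties go to the pair seen first.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged_words = Counter()
        for symbols, freq in words.items():
            merged_words[merge_pair(symbols, best)] += freq
        words = merged_words
    return merges


def encode(word, merges):
    """Apply the learned merges, in order, to a new word."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = learn_merges("low low low lower lowest", num_merges=2)
print(merges)                   # [('l', 'o'), ('lo', 'w')]
print(encode("lowly", merges))  # ['low', 'l', 'y']
```

Note that "lowly" never appears in the training text, yet it still gets split using the pieces learned from the other words. Nothing in the process knows what "low" means; it is just a frequent pattern in the input.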