The question remains: is the tokenizer going to be a fundamental limit for my task? How do I know ahead of time?
Would it limit a person getting your instructions in Chinese? Tokenisation pretty much means that the LLM is reading symbols instead of phonemes.
This makes me wonder if LLMs work better in Chinese.
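One quick way to get a feel for this is just to run both languages through a tokenizer and look at the pieces. A minimal sketch, assuming you have the tiktoken library installed and using its cl100k_base encoding as an example (not necessarily the tokenizer of whatever model you're targeting):

```python
# Compare how one BPE tokenizer splits an English instruction vs a Chinese one.
# Assumes tiktoken is installed; cl100k_base is one of its standard encodings.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

english = "Please summarise this document."
chinese = "请总结这份文件。"  # roughly the same instruction in Chinese

for label, text in [("English", english), ("Chinese", chinese)]:
    tokens = enc.encode(text)
    # decode each token id individually to see how the text was split
    pieces = [enc.decode([t]) for t in tokens]
    print(f"{label}: {len(tokens)} tokens -> {pieces}")
```

The printed pieces make it obvious that the model sees neither phonemes nor whole words, just whatever chunks the tokenizer's vocabulary happens to contain for each script.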