No, it’s an example that shows that LLMs still use a tokenizer, which is not an impediment for almost any task (even many where you would expect it to be, like searching a codebase for variants of a variable name in different cases).
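To make the case-variant point concrete, here is a toy greedy longest-match tokenizer over a made-up subword vocabulary (a sketch only; the vocabulary and `tokenize` function are hypothetical, but real BPE tokenizers behave analogously: common lowercase words stay whole while unusual casings fragment into many pieces):

```python
# Hypothetical subword vocabulary, standing in for a real BPE vocabulary.
VOCAB = {"my", "variable", "Variable", "_", "MY", "V", "A", "R", "I", "B", "L", "E"}

def tokenize(text):
    tokens = []
    i = 0
    while i < len(text):
        # Greedily take the longest vocabulary entry starting at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Fall back to a single character for anything out of vocabulary.
            tokens.append(text[i])
            i += 1
    return tokens

for name in ["myvariable", "myVariable", "my_variable", "MY_VARIABLE"]:
    print(name, "->", tokenize(name))
# "my_variable" splits into 3 tokens, but "MY_VARIABLE" shatters into 10 —
# same identifier, very different token sequences, yet models still learn
# to relate them.
```

The point is that the model never sees "characters", only these subword chunks, and the chunking changes with casing; in practice models generalize across such splits anyway.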
The question remains: is the tokenizer going to be a fundamental limit for my task? How do I know ahead of time?