Hacker News

matrix2596 yesterday at 4:25 PM

Is it possible for your tokenizer to ever give a different tokenization than the OpenAI tokenizer? I am asking because there are multiple ways to tokenize the same string. Sorry if I am mistaken.


Replies

matthewolfe yesterday at 4:26 PM

Should be the same. Both use Byte-Pair Encoding (BPE) as the underlying algo.
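To illustrate why a string with several possible segmentations still gets one tokenization, here is a minimal sketch of greedy BPE merging. The merge table `MERGE_RANKS` and the function `bpe_encode` are hypothetical toy names, not OpenAI's real vocabulary or either library's API. Real tokenizers also work on bytes and split the text with a regex first, which this sketch skips.

```python
# Toy merge table (hypothetical, NOT a real vocabulary).
# Lower rank means the pair is merged earlier.
MERGE_RANKS = {
    ("a", "b"): 0,
    ("b", "c"): 1,
}


def bpe_encode(text, ranks=MERGE_RANKS):
    """Greedy BPE: repeatedly merge the adjacent pair with the lowest rank."""
    parts = list(text)  # start from single characters
    while len(parts) > 1:
        best_rank, best_i = None, None
        for i in range(len(parts) - 1):
            rank = ranks.get((parts[i], parts[i + 1]))
            # Strict "<" means the leftmost pair wins among equal ranks.
            if rank is not None and (best_rank is None or rank < best_rank):
                best_rank, best_i = rank, i
        if best_i is None:
            break  # no mergeable pair left
        parts[best_i:best_i + 2] = [parts[best_i] + parts[best_i + 1]]
    return parts


# "abc" could be split as ["ab", "c"] or ["a", "bc"]; both use known tokens.
# The rank order picks ("a", "b") first, so only one result is ever produced.
print(bpe_encode("abc"))
```

With this table, `["a", "bc"]` is also a valid segmentation of `"abc"`, but the merge order never produces it. Two implementations should agree as long as they share the same merge ranks and the same pre-splitting rules.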