
Tiberium, yesterday at 5:11 PM

Can you also compare the performance with Hugging Face tokenizers (https://github.com/huggingface/tokenizers/)? It would be helpful, since the benchmark in the tiktoken README seems to be very outdated.


Replies

binarymax, yesterday at 5:39 PM

Anecdotally, I've always found tiktoken to be far slower than Hugging Face tokenizers. I'm not sure why, as I haven't dug into tiktoken, but I'm a heavy user of HF's Rust tokenizers.
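
For anyone who wants to check this on their own data rather than rely on anecdotes, a rough timing harness might look like the sketch below. It is only an illustration: `throughput_mb_per_s` and `whitespace_encode` are hypothetical names made up here, and the stand-in tokenizer exists just so the snippet runs without any third-party packages. The commented-out lines show how the two real libraries could be plugged in, assuming both are installed and the named encodings are available.

```python
import time
from typing import Callable, List


def throughput_mb_per_s(
    encode: Callable[[str], object], texts: List[str], repeats: int = 3
) -> float:
    """Best-of-`repeats` throughput, in MB of UTF-8 input per second."""
    total_bytes = sum(len(t.encode("utf-8")) for t in texts)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for t in texts:
            encode(t)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very small inputs.
    return total_bytes / max(best, 1e-9) / 1e6


def whitespace_encode(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer so the sketch runs with no dependencies."""
    return [len(word) for word in text.split()]


if __name__ == "__main__":
    sample = ["The quick brown fox jumps over the lazy dog. " * 50] * 200
    print(f"stand-in: {throughput_mb_per_s(whitespace_encode, sample):.1f} MB/s")

    # To compare the real libraries (assumes they are installed):
    #   import tiktoken
    #   enc = tiktoken.get_encoding("cl100k_base")
    #   print(throughput_mb_per_s(enc.encode, sample))
    #
    #   from tokenizers import Tokenizer
    #   tok = Tokenizer.from_pretrained("gpt2")
    #   print(throughput_mb_per_s(lambda s: tok.encode(s).ids, sample))
```

Note that this measures single-string, single-threaded encoding only. Both libraries also offer batch encoding, and results can differ a lot depending on the vocabulary, text length, and thread count, so numbers from a sketch like this should be read as indicative at best.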