fhub, today at 3:11 AM

From the post:

Profiling TikToken’s Python/Rust implementation showed a lot of time was spent doing regex matching. Most of my perf gains come from a) using a faster JIT-compiled regex engine; and b) simplifying the algorithm to forgo regex matching special tokens at all.
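
For anyone wondering what point (b) might look like in practice, here is a rough Python sketch of locating special tokens with plain substring search instead of a regex. This is not the post's code; the function name `split_on_special` and the example token list are my own assumptions, and the real implementation may work quite differently.

```python
def split_on_special(text, special_tokens):
    """Split text into (piece, is_special) pairs without using a regex.

    Hypothetical helper for illustration only; not taken from the post.
    """
    pieces = []
    pos = 0
    while pos < len(text):
        # Find the earliest occurrence of any special token at or after pos.
        # On a tie at the same index, prefer the longer token.
        best_idx, best_tok = -1, None
        for tok in special_tokens:
            if not tok:
                continue  # skip empty tokens, which would never advance pos
            idx = text.find(tok, pos)
            if idx == -1:
                continue
            if (best_tok is None or idx < best_idx
                    or (idx == best_idx and len(tok) > len(best_tok))):
                best_idx, best_tok = idx, tok
        if best_tok is None:
            # No special token remains: the rest is ordinary text.
            pieces.append((text[pos:], False))
            break
        if best_idx > pos:
            pieces.append((text[pos:best_idx], False))
        pieces.append((best_tok, True))
        pos = best_idx + len(best_tok)
    return pieces


if __name__ == "__main__":
    print(split_on_special("hi<|endoftext|>there", ["<|endoftext|>"]))
```

The ordinary-text pieces would still go through the usual pre-tokenization step; the only point here is that finding a handful of fixed strings does not need a regex engine at all.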