floriangoebel today at 10:35 AM

Wouldn't this increase your token usage because the tokenizer now can't process whole words, but it needs to go letter by letter?


Replies

literalAardvark today at 1:02 PM

It doesn't go letter by letter, so no, not with current tokenizers.
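A rough illustration of the "not letter by letter" point, as a toy sketch only: the vocabulary below is made up, and the greedy longest-match split is a simplification in the spirit of WordPiece, not how any particular production tokenizer (most of which use learned BPE merges) actually works. The names `VOCAB` and `tokenize` are hypothetical.

```python
import string

# Hypothetical toy vocabulary: a whole word, a few subword pieces,
# and every single letter as a last-resort fallback.
VOCAB = {"tokenizer", "token", "izer", "tok", "en"} | set(string.ascii_lowercase)


def tokenize(word, vocab=VOCAB):
    """Greedy longest-match-first split of one lowercase word."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, shrinking until one matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {word[i]!r}")
    return tokens


correct = tokenize("tokenizer")   # the correctly spelled word is one piece
typo = tokenize("tokneizer")      # the typo falls back to smaller pieces
print(correct, typo)
```

In this toy setup the misspelled word splits into a handful of subword pieces rather than one token per letter. How much the count changes for a real model depends entirely on its vocabulary.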

There will likely be some internal reasoning going "I wonder if the user meant spell check, I'm gonna go with that one".

And it'll also bias the reasoning and output toward internet speak instead of what you'd usually want, such as code or scientific jargon. That used to decrease output quality; I'm not sure if it still does.