lossyalgo • today at 10:00 AM

LLMs were not designed to count letters[0] since they work with tokens, so whatever trick they are now using behind the scenes to handle this case can probably only handle this particular case. I wonder if it's now included in the system prompt. I asked ChatGPT and it said it's now using len(str) and some other Python scripts to do the counting, but who knows what's actually happening behind the scenes.

[0] https://arxiv.org/pdf/2502.16705

Replies

ACCount37 • today at 10:34 AM

There's no "trick behind the scenes" there. You can actually see the entire trick being performed right in front of you. You're just not paying attention.

That trick? The LLM has succeeded by spelling the entire word out letter by letter first.

It's much easier for an LLM to perform "tokenized word -> letters -> letter counts" than it is to perform "tokenized word -> letter counts" in one pass. But it doesn't know that! It copies human behavior from human text, and humans never had to deal with tokenizer issues in text!
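As a rough illustration of that two-step route, here is a minimal Python sketch. The token split in TOY_TOKENS is made up for the example (real tokenizers split words differently), and spell_out and count_letters are hypothetical helper names. It only mirrors the "tokens -> letters -> letter counts" idea and says nothing about what happens inside a model.

```python
from collections import Counter

# Hypothetical token split, for illustration only; real tokenizers differ.
TOY_TOKENS = ["str", "aw", "berry"]


def spell_out(tokens):
    """Step 1: expand each token into its individual letters."""
    return [ch for tok in tokens for ch in tok]


def count_letters(letters):
    """Step 2: count letters once they are written out explicitly."""
    return Counter(letters)


letters = spell_out(TOY_TOKENS)
counts = count_letters(letters)

print("-".join(letters))  # s-t-r-a-w-b-e-r-r-y
print(counts["r"])        # 3
```

Once the letters are on the page, the count is a simple tally. Going straight from the three opaque tokens to "3" skips the step where the letters become visible at all.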

You can either teach the LLM that explicitly, or just do RLVR on diverse tasks and hope it learns tricks like this on its own.
