Hacker News

danielmarkbruce · 01/22/2025

Maybe go back and read what I said rather than making up nonsense. 'Often fail' isn't 'always fail'. And many models fail the strawberry example; that's why it's famous. I even laid out some training samples of the kind that enable current models to succeed at spelling 'games' in a fragile way.

Being problematic and fragile at spelling games, compared to using character- or byte-level 'tokenization', isn't a giant deal. These are largely "gotchas" that don't materially reduce the value of the product. Everyone in the field is aware of them. Hyperbole isn't required.

Someone linked you to one of the relevant papers above... and you still contort yourself into a pretzel. If you can't intuitively get the difficulty posed by current tokenization, and how character/byte-level 'tokenization' would make those things trivial (albeit with a tradeoff that makes it not worth it), maybe you don't have the horsepower required for the field.
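
To make the tokenization point concrete, here's a minimal sketch in plain Python (no model or tokenizer library involved; the "straw"/"berry" split is hypothetical, chosen purely for illustration) of why letter counting is trivial over a character-level view but requires memorized spelling knowledge over multi-character token IDs:

    # Minimal sketch: letter counting at character level vs. over subword tokens.
    # The "straw"/"berry" split below is hypothetical -- real BPE vocabularies
    # segment words differently.

    word = "strawberry"

    # Character-level "tokenization": every letter is directly visible.
    char_tokens = list(word)       # ['s', 't', 'r', 'a', 'w', 'b', 'e', 'r', 'r', 'y']
    print(char_tokens.count("r"))  # 3 -- counting is a direct lookup over the input

    # Subword-style tokenization: the model receives opaque IDs, not letters.
    hypothetical_vocab = {"straw": 1001, "berry": 1002}
    subword_tokens = [1001, 1002]  # what a token-based model actually sees

    # To count 'r's, a model has to have learned how each token ID spells out
    # (e.g. from training samples like the ones described above); the characters
    # are simply not present in its input.
    id_to_text = {v: k for k, v in hypothetical_vocab.items()}
    spelled_out = "".join(id_to_text[t] for t in subword_tokens)
    print(spelled_out.count("r"))  # 3, but only after reconstructing the spelling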


Replies

HarHarVeryFunny · 01/23/2025

Did you actually read the CUTE paper?!

What character-level task does it say is no problem for multi-char-token models?

What kind of tasks does it say they do poorly at?

Seems they agree with me, not you.

But hey, if you tried spelling vs. counting for yourself, you already know that.

You should swap your brain out for GPT-1. It'd be an upgrade.
