> Most human math PhDs have all kinds of shortcomings.
I know a great many people with PhDs. They're certainly not infallible by any means, but I can assure you, every single one of them can correctly count the number of occurrences of the letter 'r' in 'strawberry' if they put their mind to it.
I'll bet said PhDs can't answer the equivalent question in a language they don't understand. LLMs don't speak character-level English. LLMs are, in some stretched sense of the word, illiterate.
If LLMs used character-level tokenization, it would work just fine. But we don't do that, and we accept the trade-off. It's only folks who have absolutely no idea how LLMs work who find the strawberry thing meaningful.
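(To make the tokenization point concrete, here's a minimal sketch. It assumes the third-party tiktoken package, which isn't mentioned above, and only illustrates that the model is handed subword token IDs rather than characters.)

```python
# Minimal sketch, assuming tiktoken is installed: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several recent OpenAI models
word = "strawberry"

token_ids = enc.encode(word)
pieces = [enc.decode_single_token_bytes(t).decode("utf-8") for t in token_ids]

print(token_ids)  # a short list of integer IDs, not ten characters
print(pieces)     # the subword chunks the model actually "sees"

# Counting letters from the characters themselves is trivial:
print(word.count("r"))  # 3
# ...but the model only receives the integer IDs above, so it has to have
# memorized each token's spelling to answer the same question.
```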
> I know a great many people with PhDs. They're certainly not infallible by any means, but I can assure you, every single one of them can correctly count the number of occurrences of the letter 'r' in 'strawberry' if they put their mind to it.
So can the current models.
It's frustrating that so many people think this line of reasoning actually pays off in the long run, when talking about what AI models can and can't do. Got any other points that were right last month but wrong this month?
Humans tasked with counting how many vowels are in "Pneumonoultramicroscopicsilicovolcanoconiosis" (a real word), without seeing the word written down, just from hearing it, would struggle. Working memory has its limits. We're not that different; we fail too.
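(For what it's worth, the count is trivial once the text is in front of you, or a program; the hard part is doing it from working memory alone. A quick check:)

```python
# Count the letters and vowels in the word from the comment above.
word = "pneumonoultramicroscopicsilicovolcanoconiosis"
vowels = sum(1 for c in word if c in "aeiou")
print(len(word), vowels)  # total letters and vowel count
```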