I don't know if it demonstrates anything, but I do think it's somewhat natural for people to want to interact with tools that feel like they make sense.
If I'm going to trust a model to summarize things, go out and do research for me, etc., I'd be worried if it made what look like comprehension or math mistakes.
I get why it feels like a big deal to some people when models give wrong answers to questions like "how many r's are in strawberry" (yes, I know models get this right now, but it was a good example at the time) or "are we in the year 2026?"