> How many Rs are in the word strawberry?
That's a known flaw that the builders have decided to accept, as opposed to an intentional aspect of the design. The intent of the design is to generate text that humans find convincing. If they could tweak the design to remove tokenisation flaws, they'd do it instantly.
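For anyone unsure why tokenisation causes this, here is a rough sketch. The vocabulary and the `toy_tokenise` helper are made up for illustration; real tokenisers learn their own vocabularies and may split the word differently. The point is only that the model is handed chunks rather than individual letters.

```python
# Hypothetical subword vocabulary, invented purely for illustration.
# Real tokenisers learn their own vocabularies and split differently.
TOY_VOCAB = ["straw", "berry", "st", "raw"]


def toy_tokenise(word, vocab):
    """Greedy longest-match split; falls back to single characters."""
    tokens = []
    i = 0
    while i < len(word):
        # Pick the longest vocabulary entry that matches at position i,
        # or the single character at i if nothing in the vocabulary matches.
        match = max(
            (v for v in vocab if word.startswith(v, i)),
            key=len,
            default=word[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens


word = "strawberry"
tokens = toy_tokenise(word, TOY_VOCAB)   # what the model "sees"
letter_count = word.count("r")           # what the question asks about

print(tokens)
print(letter_count)
```

Counting letters is trivial on the raw string, but the model works with the chunks, so the individual characters are not directly in front of it.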
> he should have dumped a position article he agrees with, and a position article he disagrees with, into it, and asked it to compare, contrast...
This is like saying that pen testers shouldn't use special characters in API requests. I don't think the author's goal was to showcase an optimal use case, but rather to show how LLMs can easily and unwittingly provide incorrect information. Of course this is already known, but it sounds like he felt obliged to demonstrate it for this specific case, where the creator claims that it is robust.