Tell that to German-speakers, where the verb comes last, and the order of things in sentences is not anything like English, therefore requiring you to think of the entire sentence before you just spit it out. Even the numbers are backwards (twenty-two is two-and-twenty) which requires thinking.
Furthermore, when you ask an LLM to count how many r's are in the word strawberry, it will give you a random answer, "think" about it, and give you another random answer. And I guarantee you out of 3 attempts, including reasoning, it will flip-flop between right and wrong, but unlike a human, it will be random, because, unlike humans who, when asked "how many r's are in the word strawberry" will not be able to tell you the correct answer every. fucking. time.
edit: formatting
The part about strawberry is just not right. That problem was solved. And I do think it's a stretch to say German speakers think of the entire sentence before speaking it.
It seems models are pre-planning though:
> How does Claude write rhyming poetry? Consider this ditty:
> He saw a carrot and had to grab it,
> His hunger was like a starving rabbit
> To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.
> Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.
[https://www.anthropic.com/research/tracing-thoughts-language...]