GP means they aren't good at knowing when they are wrong and should spend more compute on the problem.
I would say the current generation of LLMs that "think harder" when you tell them their first response is wrong is a training grounds for knowing to think harder without being told, but I don't know the obstacles.
Are you suggesting that when you tell it "think harder" it does something like "pass a question to a bigger system"? I have doubts... It would be gated behind more expensive plan if so