> Putting the processes aside, if you black box a human and a language model and put us head to head on reasoning tasks, sometimes you're going to get quite similar results.
I cannot believe this is true. LLMs are awful at problems that are not present in their training data. They are very bad at planning problems, for example, because they cannot possibly memorize every single instance and cannot reason their way to a solution, whereas a black-boxed human of course can.