It matters if you’re curious about whether AGI is possible. Have we really built “thinking machines”, or are these systems just elaborate harnesses that leverage the non-deterministic nature of LLMs?
What is the difference? If the "elaborate harness" consists of a mix of "classical" code and ML model invocations, at what point is it disqualified from consideration as a "thinking machine"? Best we can tell, even our brains have parts that are "dumb", interfacing with the parts we consider "where the magic happens".
An "elaborate harness" that can break a problem down into sub-tasks, write Python scripts for the ones it can't solve itself, and then combine the results seems able to solve a wide range of cognitive tasks?
At least in theory.
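The loop described above can be sketched in a few lines, with the model-driven steps stubbed out. This is a minimal sketch, not a real agent: the decomposition and the generated scripts below are hardcoded stand-ins for what an LLM would actually produce at runtime, and the `harness` function name is hypothetical.

```python
import subprocess
import sys


def run_python(script: str) -> str:
    """Execute a generated script in a subprocess and capture its output.

    Stands in for the 'write a Python script for the sub-task' step;
    sandboxing and error handling are omitted for brevity.
    """
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout.strip()


def harness(task: str) -> str:
    # Step 1 (stubbed): a real harness would ask the model to break
    # `task` into sub-tasks and emit a script per sub-task. Here both
    # are hardcoded stand-ins.
    subtasks = [
        ("count primes below 100",
         "print(sum(all(n % d for d in range(2, n)) for n in range(2, 100)))"),
        ("sum the digits of 2**32",
         "print(sum(int(c) for c in str(2**32)))"),
    ]
    # Step 2: run each generated script outside the model.
    results = [(desc, run_python(script)) for desc, script in subtasks]
    # Step 3 (stubbed): a real harness would have the model synthesize
    # the partial results; here we just concatenate them.
    return "; ".join(f"{desc}: {out}" for desc, out in results)


print(harness("some multi-step cognitive task"))
```

The point of the sketch is the division of labor: the "dumb" classical code handles dispatch and execution, while the steps marked as stubs are exactly where the model invocations would go.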