The current LLMs are also "magic" so anything is possible. AFAIK there is no proof that the current architecture is optimal. And we have our brains as a pretty powerful local thinking machine as a counter-example to the idea that thinking has to happen in data centers.
I want to ask what makes them magic, but even those building LLMs don't really know what happens when they run inference...
I have to assume current architectures aren't optimal though, the idea that we stumbled into the one and only optimal solution seems almost impossible.