LLM models can not do spacial reasoning. I haven't tried with GPT, however, Claude can not solve a Rubik Cube no matter how much I try with prompt engineering. I got Opus 4.6 to get ~70% of the puzzle solved but it got stuck. At $20 a run it prohibitively expensive.
The point is if we can prompt an LLM to reason about 3 dimensions, we likely will be able to apply that to math problems which it isn't able to solve currently.
I should release my Rubiks Cube MCP server with the challenge to see if someone can write a prompt to solve a Rubik's Cube.
What about a model designed for robotics and vision? Seems like an LLM trained on text would inherently not be great for this.
DeepMinds other models however might do better?
*yet
> I should release my Rubiks Cube MCP server with the challenge to see if someone can write a prompt to solve a Rubik's Cube.
Do it, I'm game! You nerdsniped me immediately and my brain went "That sounds easy, I'm sure I could do that in a night" so I'm surely not alone in being almost triggered by what you wrote. I bet I could even do it with a local model!