They can figure out a fair bit of physics. It's not a "no physics" vs "physics" thing. Rather, it's a "flawed and unreliable physics" thing.
It's similar to the LLM hallucination problem. LLMs produce nonsense and untruths, but they are still useful in many domains.
It's a fairly binary thing, in the sense that "bad physics" pretty quickly decoheres into no physics.
I saw one of these models running a Minecraft-like simulation. It looked sort of okay at first, but then water started to end up in impossible places, and once it was there it kept spreading until you ended up in some Lovecraftian horror dimension. Any useful physics simulation at least needs its boundary conditions to hold, and these models have no boundary conditions because they have no clear categories for anything.
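To make the boundary-condition point concrete, here is a rough sketch of what "boundary conditions hold" means in a hand-written toy simulation. This is not how any of those generative models works, and it is not Minecraft's actual water logic. The grid layout and the names `step`, `check_invariants` and `simulate` are all made up for illustration. The idea is just that explicit categories (solid vs. open, water vs. not water) let you state invariants and check them on every tick.

```python
def step(solid, water):
    """Advance one tick: water falls one cell if the cell below is open."""
    rows, cols = len(water), len(water[0])
    new = [row[:] for row in water]
    # Sweep bottom-up so each unit of water moves at most one cell per tick.
    # The bottom row is never a source, so the floor acts as a closed boundary.
    for r in range(rows - 2, -1, -1):
        for c in range(cols):
            if new[r][c] > 0 and not solid[r + 1][c]:
                new[r + 1][c] += new[r][c]
                new[r][c] = 0
    return new


def check_invariants(solid, before, after):
    """Fail loudly if water was created, destroyed, or placed inside a solid."""
    assert sum(map(sum, after)) == sum(map(sum, before)), "water not conserved"
    for r, row in enumerate(after):
        for c, amount in enumerate(row):
            assert amount >= 0, "negative water"
            assert not (solid[r][c] and amount > 0), "water inside a solid block"


def simulate(solid, water, ticks):
    """Run the toy rule for a number of ticks, checking invariants each tick."""
    for _ in range(ticks):
        nxt = step(solid, water)
        check_invariants(solid, water, nxt)
        water = nxt
    return water


# Hypothetical 4x3 world: one solid block at row 2, column 1.
solid = [[False] * 3 for _ in range(4)]
solid[2][1] = True
water = [[2, 5, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
final = simulate(solid, water, ticks=10)
```

In this toy world the water in column 1 stops on top of the solid block and the water in column 0 settles on the floor, and the total amount never changes. A model that only predicts the next frame has no equivalent of `check_invariants`, which is roughly the failure described above.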