Interesting to see how bad the physics/object permanence is. I wonder if combining this with a Genie 2-style model (Google's new "world model") would be the next step in refining its capabilities.
Until these models can figure out physics, it seems to me they will remain an interesting toy.
This feels like computer graphics and the 'screen space' techniques that got introduced in the Xbox 360 generation: reflections, shadows, etc. all suffered from the inability to work with off-screen information and gave wildly bad answers once off-screen info was required.
The solution was simple: just maintain the information in world space, and sample from that. But simple does not mean cheap, and it led to a ton of redundant data (as in invisible in the final image) having to be kept track of.
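
To make the screen-space vs. world-space point concrete, here is a toy sketch in Python. It is not how any real engine does it: the 1D `WORLD` dict, the function names, and the colour values are all made up for illustration. The only thing it tries to show is that a lookup limited to the rendered frame has no answer for anything outside the view, while a world-space lookup does, at the cost of keeping the whole world around.

```python
from typing import List, Optional

# Hypothetical 1D "world": position -> colour of whatever sits there.
# Keeping all of this around is the "simple but not cheap" part.
WORLD = {0: "red", 1: "green", 2: "blue", 3: "yellow", 4: "white"}


def render_screen(view_start: int, view_width: int) -> List[str]:
    """Rasterise only the visible slice; everything else is discarded."""
    return [WORLD[x] for x in range(view_start, view_start + view_width)]


def sample_screen_space(screen: List[str], view_start: int, world_x: int) -> Optional[str]:
    """Look up a world position using only the rendered frame."""
    i = world_x - view_start
    if 0 <= i < len(screen):
        return screen[i]
    return None  # off screen: the information simply is not in the frame


def sample_world_space(world_x: int) -> Optional[str]:
    """Look up a world position in the full scene, visible or not."""
    return WORLD.get(world_x)


# The camera sees positions 1 and 2 only.
screen = render_screen(view_start=1, view_width=2)

on_screen = sample_screen_space(screen, 1, 2)    # inside the view
off_screen = sample_screen_space(screen, 1, 4)   # outside the view
from_world = sample_world_space(4)               # same point, world-space lookup
```

A real screen-space effect would typically fall back to something (a cubemap, a fade-out, a guess) instead of returning `None`, which is where the "wildly bad answers" come from.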