> I think veo3 proves that ai can generalize 2d and even 3d games
It doesn't. And you said it yourself:
> generating a video under prompt constraints is basically playing a game.
No. It's neither generating a game (that people can play) nor is it playing a game (it's generating a video).
Since it's not a model of the world in any sense of the word, there are issues with even the most basic object permanence. E.g. here's veo3 generating a GTA-style video. Oh look, the car spins 360 degrees and ends up on a completely different street than the one it was driving down previously: https://www.youtube.com/watch?v=ja2PVllZcsI
It still does a great job for a few frames, and you could keep it more anchored to the state of the game if you prompt it to, much like you can prompt coding agents to keep a log of all decisions previously made. Permanence is excellent; it slips often, but mostly because it is not grounded to a specific game state by the prompt or by a decision log.