A "world model" depends on the context which defines which world the problem is in. For chess, which moves are legal and needing to know where the pieces are to make legal moves are parts of the world model. For alpha blending, it being a mathematical operation and the visibility of a background given the transparency of the foreground are parts of the world model.
The examples are from all the major commercial American LLMs as listed in a sister comment.
You seem to conflate context windows with tracking chess pieces. The context window is more than large enough to remember 10 moves. The model should either track the pieces, or point out that without a board to look at it is effectively playing blindfold chess and isn't good at that, so could you please list the position after every move to make it fair; failing both, it doesn't know what it's doing. It's demonstrably the last of these.
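Producing that position listing is mechanically trivial, which is what makes the failure telling. A sketch using the python-chess library (assuming it is installed; the move list is just an example):

    import chess  # pip install python-chess

    board = chess.Board()
    for san in ["e4", "e5", "Nf3", "Nc6", "Bb5"]:
        board.push_san(san)            # raises ValueError on an illegal move
        print(san, "->", board.fen())  # full position after every move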
If you train an LLM on chess, it will learn that too. You don't need to explain the rules: just feed it chess games, and at some point it will stop making illegal moves. That is a clear example of a world model inferred from training.
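"Stops making illegal moves" is also a measurable claim. A hedged sketch of the metric, again using python-chess, where each game is a list of SAN moves sampled from the model:

    import chess  # pip install python-chess

    def illegal_move_rate(games):
        # Fraction of generated moves that are illegal; for models
        # trained on enough games this approaches zero.
        illegal = total = 0
        for moves in games:
            board = chess.Board()
            for san in moves:
                total += 1
                try:
                    board.push_san(san)
                except ValueError:  # illegal or unparseable move
                    illegal += 1
                    break           # position is undefined past this point
        return illegal / total if total else 0.0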
https://arxiv.org/abs/2501.17186
PS "Major commercial American LLM" is not very meaningful, you could be using GPT4o with that description.