Object permanence and a communications channel are enough for this. Give children (who get along with each other) a pile of sticks and leave them alone for half an hour, and there's half a chance their game will ignore the sticks. Most children wouldn't want to have their play mediated by the computer in the way you describe, because the ergonomics are so poor.
The majority of American children have an active Roblox account. Those who don't are likely to play Minecraft or Fortnite. Play mediated by the computer in this way is already one of the most popular forms of play. Kids are going to go absolutely nuts for this, and if you think otherwise, you really need to talk to some children.
I'm reminded of that guy who bought an AI-enabled toy for his daughter and got increasingly exasperated as she kept turning it off and treating it as a normal toy.
https://xcancel.com/altryne/status/1872090523420229780