Wouldn't it have been simpler (even if technically heavy) to host the game on a single machine and just stream each player's camera? That way all the physics would be computed in real time on one computer, and each player would just receive a different video stream.
Video streams are not known for their low bandwidth needs, and that's before you add a full RTT of latency to every input.
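To put rough numbers on the bandwidth point, here is a back-of-envelope sketch in Python. Every figure in it is an illustrative assumption rather than a measurement: a 5 Mbit/s compressed stream per player, versus state sync for 32 entities at 48 bytes each, sent 30 times per second. The function names are made up for this example.

```python
# Back-of-envelope comparison; every number below is an assumed, illustrative value.

def video_stream_bps(bitrate_mbps: float) -> float:
    """Per-player bandwidth for a compressed video stream, in bits per second."""
    return bitrate_mbps * 1_000_000


def state_sync_bps(entities: int, bytes_per_entity: int, updates_per_sec: int) -> int:
    """Per-player bandwidth for sending entity state snapshots, in bits per second."""
    return entities * bytes_per_entity * updates_per_sec * 8


video = video_stream_bps(5.0)  # assumed 5 Mbit/s stream per player
state = state_sync_bps(entities=32, bytes_per_entity=48, updates_per_sec=30)

print(f"video: {video / 1e6:.2f} Mbit/s per player")
print(f"state: {state / 1e6:.2f} Mbit/s per player")
print(f"ratio: {video / state:.1f}x")
```

Under these assumptions the video stream costs more than ten times as much per player, and the host also has to render and encode a separate view for each of them.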
This seems to get brought up in every HN post on video game multiplayer, and it makes me wonder: do you play video games? I don't know of any video games that do multiplayer that way, and I would think that alone suggests it's not a good idea.
Who wants to play a game with 50ms+ of keypress-to-screen delay? Sounds miserable.
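A rough sketch of where that delay comes from, in Python. The numbers are assumptions picked for illustration (30 ms RTT, 5 ms each for encode and decode, 60 fps frames), and the model is deliberately simplified: with streaming, an input has to cross the network, get simulated and rendered, then come back as encoded video, while a game that renders locally can show the player's own input on the next frame.

```python
# Simplified latency budget; all timings are assumed, illustrative values in milliseconds.

def streamed_input_delay_ms(rtt_ms: float, frame_ms: float,
                            encode_ms: float, decode_ms: float) -> float:
    """Keypress-to-screen delay when the picture is rendered remotely and streamed back."""
    # Input travels up and video travels down (one full RTT), plus one frame
    # of simulation/rendering, plus video encode and decode time.
    return rtt_ms + frame_ms + encode_ms + decode_ms


def local_input_delay_ms(frame_ms: float) -> float:
    """Keypress-to-screen delay when the client renders its own view locally."""
    return frame_ms


FRAME_MS = 16.7  # assumed 60 fps

streamed = streamed_input_delay_ms(rtt_ms=30.0, frame_ms=FRAME_MS,
                                   encode_ms=5.0, decode_ms=5.0)
local = local_input_delay_ms(FRAME_MS)

print(f"streamed: {streamed:.1f} ms")
print(f"local:    {local:.1f} ms")
```

Even with a fairly optimistic 30 ms round trip, the streamed path lands above 50 ms under these assumptions, and it only gets worse as RTT grows.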