I've actually used this fact in a related way, for wayfinding.
Old-school OpenCV could see the tracks well from an onboard monocular camera, but calibration and scale were annoying. Track width is accurate enough that I was able to use it as the scale reference, which let me feed in a bunch of head-end video to map the tracks.
It was mostly just a modified edge detect run where the tracks would approximately be. Once you had found the tracks, you could automatically calculate the camera's height, lateral location, and angle.
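
For anyone curious how the height, lateral location, and angle fall out of the rails, here is a minimal sketch of the geometry. It is not the original code, just an illustration under stated assumptions: a pinhole camera with known focal length `f` and principal point `(cx, cy)` in pixels, zero roll, flat ground, locally straight track, and a known gauge (1.435 m standard gauge is assumed as the default). The function name, the parameters, and the `u = a + b*v` line representation are all made up for the example. Each rail is taken as an already-fitted image line, for instance from the edge detection step.

```python
import math

def camera_pose_from_rails(left, right, f, cx, cy, v_ref, gauge=1.435):
    """Estimate camera height, lateral offset, pitch and yaw from two rails.

    left, right: (a, b) coefficients of the image lines u = a + b*v fitted
                 to the left and right rail (u = column, v = row, in pixels).
    v_ref:       image row (below the horizon) at which the gauge is measured.
    Assumes zero roll, flat ground and a straight track (hypothetical setup).
    Returns (height_m, lateral_m, pitch_rad, yaw_rad), where pitch > 0 means
    the camera is tilted down, yaw > 0 means the track heads to the right of
    the optical axis, and lateral > 0 means the camera sits to the right of
    the track centerline.
    """
    (a_l, b_l), (a_r, b_r) = left, right

    # The rails are parallel on the ground, so their image lines meet at
    # the vanishing point, which lies on the horizon.
    v_vp = (a_r - a_l) / (b_l - b_r)
    u_vp = a_l + b_l * v_vp

    # Camera angles relative to the track, from the vanishing point.
    pitch = math.atan2(cy - v_vp, f)
    yaw = math.atan2((u_vp - cx) * math.cos(pitch), f)

    # Pixel gap between the rails on the reference row.
    u_l = a_l + b_l * v_ref
    u_r = a_r + b_r * v_ref
    width_px = u_r - u_l

    # The known gauge fixes the metric scale, and with it the camera height.
    height = gauge * (v_ref - v_vp) * math.cos(pitch) / (width_px * math.cos(yaw))

    # Depth along the optical axis of the ground points seen on that row.
    z = f * height / ((v_ref - v_vp) * math.cos(pitch))

    # Track centerline point on that row, in camera coordinates.
    x_mid = (0.5 * (u_l + u_r) - cx) * z / f
    y = (v_ref - cy) * z / f

    # Horizontal forward distance to that point, then the camera's offset
    # measured perpendicular to the track direction.
    forward = z * math.cos(pitch) - y * math.sin(pitch)
    lateral = forward * math.sin(yaw) - x_mid * math.cos(yaw)
    return height, lateral, pitch, yaw
```

The useful property is that nothing here needs an external scale measurement: the vanishing point gives the angles, and the gauge converts the pixel gap between the rails into metres. Real footage would also need lens distortion handled before fitting the lines, and curves or a rolling vehicle break the assumptions above.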