I’m not an expert, only dabbled in photogrammetry, but it seems to me that the crux of that problem is identifying common pixels across images in order to sort of triangulate a point in the 3D space. It doesn’t sound like something an LLM would be good at.