Sure, but that only goes so far, especially when users aren't writing their shaders with knowledge that this transform is going to be applied or any tools to verify that it's able to eliminate anything.
Well, it is what is done on several tiler architectures, and it generally works just fine. Normally your computations of the position aren't really intertwined with the computation of the other outputs, so dead code elimination does a good job.
Why would it be difficult? There are explicit shader semantics to specify output position.
In fact, Qualcomm's documentation spells this out: https://docs.qualcomm.com/nav/home/overview.html?product=160...