I think one could just pick a convention where a particular GP register is zeroed at program startup and just make your own zero register that way, getting all the benefits at very small cost. The microarchitecture AIUI has a dedicated zero register so any processor-level optimizations would still apply.
That’s what was done on the CDC 6600 with two handy values, B0 (0) and B1 (1).