Note that, not only are multiple consecutive increments reduced to zero latency, but that happens even if they're interleaved with movsxd, as in the second experiment at https://uops.info/html-lat/ADL-P/INC_R64-Measurements.html. It'd be interesting to see what other instructions it can "fuse" with (if that is what is happening).
Also interesting that this only happens with 64 bit registers: https://uops.info/html-lat/ADL-P/INC_R32-Measurements.html
I don't see a reason why this should be the case, since the high bits of the result would simply be cleared, and it's a common size optimization to use 32 bit operations.
Maybe https://news.ycombinator.com/item?id=41706743 is correct, and this is mainly intended for address increments generated by microcode?