That's pretty cool.
Normally it would be either the programmer's or the compiler's job to unroll a loop and then shorten the dependency chains.
But it's nice if the renamer can do that as well.
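
For anyone who hasn't seen the manual version of that transformation, here is a minimal sketch in C. The function names are made up for illustration, and it says nothing about how the renamer actually does it; it only shows what "unroll, then shorten the dependency chain" means at the source level.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: every add depends on the result of the previous add,
   so the whole loop is one long serial dependency chain. */
uint64_t sum_chained(const uint64_t *a, size_t n) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i];
    return acc;
}

/* Unrolled by four with four independent accumulators: four shorter
   chains that an out-of-order core can advance in parallel. */
uint64_t sum_unrolled(const uint64_t *a, size_t n) {
    uint64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc0 += a[i];
        acc1 += a[i + 1];
        acc2 += a[i + 2];
        acc3 += a[i + 3];
    }
    /* Combine the partial sums, then handle the leftover elements. */
    uint64_t acc = (acc0 + acc1) + (acc2 + acc3);
    for (; i < n; i++)
        acc += a[i];
    return acc;
}
```

Both functions return the same value for integer inputs. With floating point the reordering can change the rounding, which is why compilers generally won't do this on their own without something like `-ffast-math`.
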
Presumably Intel has real-world data suggesting that significant real workloads can profit from this.
I wonder whether that points to specific software issues, like, hypothetically, "oh yeah, OpenJDK 8 HotSpot was a little too timid at loop unrolling. It won't get that JIT improvement backported, but our customers will use Java 8 forever. Better fix that in silicon".