"A lot of code can be pessimized by golfing instruction counts"
Can you explain what this phrase means?
I’ve done this: I had a hot loop and I discovered that I could reduce instruction counts by adding a branch inside the loop. It was definitely slower, which I expected, but it was still worth measuring.
An old approach to micro-optimization is to look at the generated assembly and try to achieve the same thing with fewer instructions. However, modern CPUs are able to execute multiple instructions in parallel (superscalar, out-of-order execution), and this mechanism relies on detecting data dependencies between instructions: only instructions that don't need each other's results can overlap.
This means that the shorter sequence of instructions is not necessarily faster, and can in fact make the CPU stall unnecessarily, for example when every instruction has to wait for the result of the previous one.
The fastest sequence of instructions is the one that makes the best use of the CPU’s resources.
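To make the dependency point concrete, here is a minimal sketch in C. The function names `sum_single` and `sum_unrolled` are made up for illustration. The second version executes more instructions per call, but its four accumulators form independent dependency chains, so an out-of-order CPU can typically overlap the additions. Whether it is actually faster depends on the CPU, the compiler, and the flags used (a compiler may already do this itself for integers or with relaxed floating-point options), so treat it as something to measure, not a guarantee.

```c
#include <stddef.h>

/* One accumulator: every addition depends on the result of the previous
   one, so the additions form a single serial dependency chain. */
double sum_single(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Four accumulators: more instructions in total, but the four chains are
   independent of each other, so the CPU is free to overlap them. */
double sum_unrolled(const double *a, size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Handle the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        s0 += a[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Note that the two versions add the elements in a different order, so for general floating-point input the results can differ slightly in rounding. That is also why a compiler will usually not make this transformation for you at default settings.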