Implementing rotate through carry like that was a really bad decision IMO - it's almost never by more than one bit left or right at a time, and this could be done much more efficiently than with the constant-time code which is only faster when the count is > 6.
Is the full microcode available anywhere?
> For memory operands, there's an additional twist: the bit index is a signed offset that can address bits outside the nominal operand. A bit index of 35 on a dword accesses bit 3 of the next dword in memory.
I wonder what is the use case for testing a bit outside of the memory address given.