The extra time was being spent in the identical assembly-language function.
Fixing it took wizardry.