This seems like a mathematician trying to shoehorn a closed form solution on to something which is architecture dependent and is probably better summarized with: Only operate on values which fit inside, ideally the l2 but certainly l3 cache of your target machine for performance sensitive code paths.