So is SIMD the magic piece here, or is it the interpolation search? If the data is evenly distributed, that's pretty much the best case for interpolation search.
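To show what I mean by evenly distributed data being the best case, here is a minimal textbook sketch of interpolation search in Python. It is not the code under discussion; the function name and the probe counter are mine, purely for illustration. The probe position is a linear guess from the key values, so on perfectly evenly spaced keys the first guess lands right on the target.

```python
def interpolation_search(keys, target):
    """Search a sorted list of integers.

    Returns (index, probes): index is -1 if target is absent,
    probes counts how many positions were inspected.
    Hypothetical helper for illustration only.
    """
    lo, hi = 0, len(keys) - 1
    probes = 0
    # The target can only be present while it lies within [keys[lo], keys[hi]].
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        probes += 1
        if keys[hi] == keys[lo]:
            # All remaining keys are equal; avoid dividing by zero.
            mid = lo
        else:
            # Linear guess: assume keys grow evenly between lo and hi.
            mid = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[mid] == target:
            return mid, probes
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


# Evenly spaced keys: 0, 10, 20, ..., 990
evenly_spaced = list(range(0, 1000, 10))
index, probes = interpolation_search(evenly_spaced, 420)
print(index, probes)
```

With skewed data the linear guess can be far off and the probe count grows, which is why I'm wondering how much of the win comes from the data distribution.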
In the Intel CPU + cold cache case, the quad search matters. In the other three cases, only the SIMD matters.