The crazy part about this is that (auto) vectorization in Rust looks something like this: iter.chunks(32).map(vectorized)
Where the vectorized function checks if the chunk has length 32, if yes run the algorithm, else run the algorithm.
The compiler knows that the chunk has a fixed size at compile time in the first block, which means it can now attempt to vectorize the algorithm with a SIMD size of 32. The else block handles the scalar case, where the chunk is smaller than 32.