Hacker News

usamoi, today at 2:11 PM

This code is not equivalent to the C++ version. You can compare the array directly with `*x == [0_u32; SIZE]`, and the two forms generate different code. (That said, the iterator version not producing optimal code is also an issue.)
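For readers following along, here is a minimal sketch of the two forms being compared. Only the expression `*x == [0_u32; SIZE]` comes from the comment above. The function names and the exact shape of the iterator version are assumptions, since the original snippet is not quoted in this thread.

```rust
/// Direct comparison against an all-zero array, using the expression
/// suggested in the comment. The function name is hypothetical.
pub fn is_zero_direct<const SIZE: usize>(x: &[u32; SIZE]) -> bool {
    *x == [0_u32; SIZE]
}

/// An iterator-based check. This is an assumed stand-in for the
/// "iterator version"; the original code may have looked different.
pub fn is_zero_iter<const SIZE: usize>(x: &[u32; SIZE]) -> bool {
    x.iter().all(|&v| v == 0)
}
```

Both functions return the same result for any input, so any difference shows up only in the generated assembly. To inspect it for a given size, the generic functions would need to be instantiated with a concrete `SIZE` and compiled with optimizations.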


Replies

gspr, today at 2:54 PM

Very good point! Thanks!

With the correction, it interestingly produces the good behavior at size=2 as well. It also delays SIMD until size=5. But then, bizarrely, it stops using SIMD again after size=64.

https://godbolt.org/z/P979nY4nf

The iterator version stays SIMD-y past size=64 too, but it also stops at some point. What?! I don't know enough to understand what's going on. Anyone?
