logoalt Hacker News

drob518yesterday at 4:31 PM1 replyview on HN

Yea, I get that the actual comparison instruction itself is insignificant. It's everything that goes along with it. Seems like quaternary is fetching more data, however.

For instance, if you have 8 elements, 01234567, and you're looking for 1, with binary, you'd fetch 4, 2, and then 1. With quaternary, you'd fetch 2, 4, 6, then 1. Obviously, if you only have 8 elements, you'd just delegate to the SIMD instruction, but if this was a much larger array, you'd be doing more work.

I guess on a modern processor, eliminating the data dependency is worth it because the processor's branch prediction and speculation only follows effectively a single path.

Would be interesting to see this at a machine cycle level on a real processor to understand exactly what is happening.


Replies

LoganDarkyesterday at 5:06 PM

It's not about doing more or less work; it's about doing the work faster. For instance, it's relatively common to discover that some recomputation can be faster than caching or lookup tables. Similarly, fetching more from memory also can be faster if it means you make less roundtrips.

show 1 reply