logoalt Hacker News

simonaskyesterday at 4:44 PM1 replyview on HN

You didn’t really say that, but feel free to share any reasons you might have to think so.

I don’t see any reason why it wouldn’t be perfectly fine on recent hardware, where unaligned loads are just as fast, and the cache pressure is identical for a linear search algorithm.


Replies

jstimpfleyesterday at 7:37 PM

I asked where is the part about unaligned pointers in your string processing example. Saying that you want to load multiple bytes at a time does not imply at all that you have to do unaligned loads.

Doing unaligned loads using SSE or AVX might have been possible on Intel architectures for a long time, but it is still a little bit slower afaik. But anyway when you get into sub-architecture specific details like that, you've essentially left C-land, and you're essentially doing assembler level programming.