
rob74 today at 7:54 AM

Ok, then it will be an explosion of binary size, if you have several code blocks optimized for each architecture level - I'm not very familiar with the subject, but I imagine it would have to be relatively large chunks of code, otherwise the constant branching would eat up the speed advantage.


Replies

masklinn today at 8:49 AM

These are usually pretty tight loops or constructs based on specific features.

An unspecialised popcnt is half a dozen instructions; the specialised versions are 4 implementations ranging from half a dozen to two dozen bytes.
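
To make the size and branching trade-off concrete, here is a rough sketch of runtime dispatch for a population count in C. It assumes GCC or Clang on x86-64 for the specialised path (via `__builtin_cpu_supports` and the `target` attribute) and falls back to a portable bit-twiddling version everywhere else. The names `popcount_generic`, `popcount_hw`, `select_popcount` and `count_bits` are made up for illustration and do not come from any particular library.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: classic SWAR bit counting, no special instructions. */
static unsigned popcount_generic(uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/* Specialised version: POPCNT is enabled for this one function only,
 * so the rest of the binary still targets the baseline architecture. */
__attribute__((target("popcnt")))
static unsigned popcount_hw(uint64_t x) {
    return (unsigned)__builtin_popcountll(x);
}
#endif

typedef unsigned (*popcount_fn)(uint64_t);

/* Pick an implementation based on what the running CPU supports. */
static popcount_fn select_popcount(void) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        return popcount_hw;
#endif
    return popcount_generic;
}

/* The feature check happens once, outside the hot loop, so the loop
 * itself pays only for an indirect call, not for a per-item branch. */
uint64_t count_bits(const uint64_t *words, size_t n) {
    popcount_fn popcount = select_popcount();
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += popcount(words[i]);
    return total;
}
```

The duplicated code here is two tiny functions, and the selection cost is paid once per call to `count_bits` rather than once per word. A real library would more likely cache the chosen pointer, or specialise the whole loop rather than the single operation, but the shape is the same.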