
adwn, today at 3:57 PM

> this looks optimized to me.

It's not. Why would lsl+csel or add+csel or cmp+csel ever be faster than a simple add? Or have higher throughput? Or require less energy? An integer addition is just about the lowest-latency operation you can do on mainstream CPUs, apart from register-renaming operations that never leave the front-end.


Replies

DullPointer, today at 4:09 PM

ARM is a big target; there could be CPUs where lsl takes 1 cycle and add takes 2+.

Without knowing the specific compiler target and settings, this looks reasonable.

Dumb in the majority case? Absolutely, but smart for the lowest common denominator.
