logoalt Hacker News

GianFabienlast Thursday at 9:56 AM2 repliesview on HN

Last time I looked intel CPUs had like 1700 instructions. Every generation comes with an even more expanded ISA. I doubt that compilers use even a fraction of the ISA. Especially considering that binaries are often expected to run on a wide range of older CPUs. I know that there are intrinsic functions which provide access to some of the powerful, yet special purpose instructions. It is unrealistic to expect the compiler to make effective use of all the fancy instructions you paid for with your latest hardware upgrade.


Replies

genewitchlast Thursday at 10:35 AM

> It is unrealistic to expect the compiler to make effective use of all the fancy instructions you paid for with your latest hardware upgrade.

I'd add "yet" - we runinafed that the reason new machines with similar shapes (quad core to quad core of a newer generation) doesn't immediately seem like a large a jump as it ought, in performance, is because it takes time for people other than intel to update their compilers to effectively make use of the new instructions. icc is obviously going to more quickly (in the sense of how long after the CPU is released, not `time`) generate faster executing code on new Intel hardware. But gcc will take longer to catch up.

There's a sweet spot from about 1-4 years after initial release where hardware speeds up, but toward the end of that run programs bloat and wipe all the benefits of the new instructions; leading to needing a new CPU, that isn't that much faster than the one you replaced.

Yet.

Which reminds me I need to benchmark a Linux kernel compile to see if my above supposition is correct, I have the timings from when I first bought it, as compared to a 10 year old HP 40 core machine (ryzen 5950 is 5% faster but used 1/4th the wall power.)

grishkalast Thursday at 2:16 PM

These kinds of SIMD instructions are usually used by things like media codecs and DSPs. They would include several versions of the performance-critical number-crunching code and would pick the best one at runtime depending on which SIMD instructions your CPU supports.