Hacker News

GianFabien today at 6:43 AM (2 replies)

I think the more critical question is how well compiler writers can update the heuristics which identify the instruction sequences that benefit from the architectural features. Last I looked, Intel has several thousand intrinsics which must be explicitly invoked to make use of specific features.

I suspect that heavily optimised code either uses intrinsics or carefully written assembler code.
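
To make the distinction above concrete, here is a minimal sketch of the same four-element addition written twice: once as plain C that leaves instruction selection to the compiler, and once with explicit SSE2 intrinsics. The function names `add4_scalar` and `add4_intrin` are made up for illustration, and the `__SSE2__` guard is an assumption so that the sketch still builds on non-x86 targets.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics, part of the x86-64 baseline */
#endif

/* Plain C: whether this becomes a vector add is up to the compiler's
 * heuristics. The unsigned casts give wrap-around on overflow, which
 * matches what the vector instruction does. */
void add4_scalar(const int32_t *a, const int32_t *b, int32_t *out) {
    for (int i = 0; i < 4; i++)
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
}

/* Explicit intrinsics: the programmer asks for one 128-bit integer add. */
void add4_intrin(const int32_t *a, const int32_t *b, int32_t *out) {
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);  /* unaligned load */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    add4_scalar(a, b, out);  /* fallback when SSE2 is not available */
#endif
}
```

Both versions produce the same result; the intrinsic version just removes the dependence on the compiler recognising the loop.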


Replies

fweimer today at 7:17 AM

Newer (relatively speaking) x86-64 instruction sets support many three-operand instructions, which are actually easier to use for compilers than instructions with overwritten source operands or hard register constraints. Pattern matching for instructions that do not have a direct C representation (such as NAND) is also pretty standard in compilers. Auto-vectorization is more tricky (especially when you want code to actually run faster …), but some of the new ISAs are impactful without it. And of course there are expanders for fixed-size memcpy and memset that can use wider vector instructions quite easily. Those operations are quite common.
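
As a rough illustration of the two non-vectorization cases mentioned above, the sketch below shows an and-not expression and a fixed-size copy. The and-not idiom is used here as a stand-in for the kind of operation with no single C operator; on x86-64, compilers typically match it to the BMI1 `ANDN` instruction when that extension is enabled (for example with `-mbmi`). The names `and_not`, `struct record`, and `copy_record` are hypothetical, and the exact instructions emitted depend on the compiler, flags, and target.

```c
#include <stdint.h>
#include <string.h>

/* C has no and-not operator, but the compiler can pattern-match this
 * expression to a single instruction where the target provides one. */
uint32_t and_not(uint32_t mask, uint32_t x) {
    return ~mask & x;
}

/* Hypothetical 32-byte record, used only for this example. */
struct record {
    uint8_t bytes[32];
};

/* The size is a compile-time constant, so the compiler can expand the
 * memcpy inline (often as a few wide vector loads and stores) instead
 * of calling the library routine. */
void copy_record(struct record *dst, const struct record *src) {
    memcpy(dst, src, sizeof *dst);
}
```

Comparing the assembly for `copy_record` at `-O2` with and without a wider vector ISA enabled (for example `-mavx2`) is one way to see the expander pick different instruction widths.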

wahern today at 7:06 AM

I think both AMD and Intel employ and/or fund GCC and LLVM developers to add support for each new architecture. Compiler and product release schedules are independent, so the target and tuning support in the latest compiler release may be slightly behind or even ahead of the latest microarchitecture release. GCC 16.1 has support for Zen 6, which hasn't even been released yet. (https://gcc.gnu.org/gcc-16/changes.html#x86)