Hacker News

ethan_smith, yesterday at 3:12 PM

Modern compilers (especially Clang 16+/GCC 13+) have become remarkably good at auto-vectorizing regular scalar code with -O3 -march=native, often matching hand-written SIMD without the maintenance burden.
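For illustration, here is the kind of regular scalar loop this is about. It is a minimal sketch, not a benchmark: the function name `saxpy` and its signature are made up for this example, and whether a given compiler actually vectorizes it depends on the compiler version, flags, and target.

```c
#include <stddef.h>

/* Hypothetical example: computes y[i] = a * x[i] + y[i] for i in [0, n).
   `restrict` promises the compiler that x and y do not overlap, which
   removes one common obstacle to auto-vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Plain scalar loop: no intrinsics, no pragmas. */
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

To see what the compiler did with the loop, build with `clang -O3 -march=native -Rpass=loop-vectorize` or `gcc -O3 -march=native -fopt-info-vec`. Both print optimization remarks. These are reports only and do not turn a missed vectorization into a build failure.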


Replies

vlovich123, yesterday at 4:35 PM

Depends on the specific loop, and that's where the problem is. For many cases, "good enough performance" is OK. However, in other cases, like graphics hotspots, video decoders, LLMs, etc., performance is actually a requirement, and the compiler being free to not do the right thing is a problem. Also, even if it vectorizes, it might not vectorize correctly, which still violates the contract. The fact that there's no way to establish that contract with the compiler (i.e., have it fail to compile instead of producing the wrong code) is a problem.