Hacker News

MangoToupe, last Saturday at 1:30 PM

> there's no fundamental reason why the same source code shouldn't be able to compile down to

Well, if you don't describe your code and dataflow in a way that caters to the shape of the SIMD, that seems like a ridiculous thing to expect.

But yes, as always, compilers could be magic programs that transform code into perfectly optimal programs. They could fix our bugs for us, too.



mort96, last Saturday at 1:35 PM

I'm saying that the shape of the SIMD is pretty much the same across platforms. Vector width differs between architectures, and whether the vector width is determined at compile time or at runtime differs between architectures, but you'll have to convince me that the vector width is such an essential component of the abstract description of the computation that you fundamentally can't abstract it away. (In fact, the success of RVV and ARM SVE should tell us that we can describe SIMD computation in a vector width-independent way.)

All vector instruction sets offer things like "multiply/add/subtract/divide the elements in two vector registers", so that is clearly not the part that's impossible to describe portably.
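To make the width-independence point concrete, here is a minimal sketch in C, assuming the GCC/Clang vector extensions (`__attribute__((vector_size))`). The names `LANES`, `vfloat` and `mul_add` are hypothetical and chosen for illustration. The vector width is a single compile-time constant, and the description of the computation itself does not change when it does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tuning knob: lanes per vector. Must be a power of two.
 * Changing it (e.g. -DLANES=8) does not change the code below. */
#ifndef LANES
#define LANES 4
#endif

/* GCC/Clang vector extension: a vector of LANES floats. */
typedef float vfloat __attribute__((vector_size(LANES * sizeof(float))));

static_assert(sizeof(vfloat) == LANES * sizeof(float), "unexpected vector size");

/* out[i] = a[i] * b[i] + c[i] for i in [0, n). */
static void mul_add(const float *a, const float *b, const float *c,
                    float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: one vector of LANES elements per iteration.
     * memcpy is used as an alignment-agnostic load/store. */
    for (; i + LANES <= n; i += LANES) {
        vfloat va, vb, vc;
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        memcpy(&vc, c + i, sizeof vc);
        vfloat r = va * vb + vc; /* element-wise multiply and add */
        memcpy(out + i, &r, sizeof r);
    }

    /* Scalar tail for the elements that do not fill a whole vector. */
    for (; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}
```

This only covers a width fixed at compile time. Runtime-determined widths, as in RVV or SVE, would need a different mechanism, which this sketch does not attempt to show.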

exDM69, last Saturday at 2:00 PM

> if you don't describe your code and dataflow in a way that caters to the shape of the SIMD

But when I do describe code, dataflow and memory layout in a SIMD-friendly way, it's pretty much the same for x86_64 and ARM.

Then I can just use `a + b` and `f32x4` (or its C equivalent) instead of `_mm_add_ps` and `__m128` (x86_64) or `vaddq_f32` and `float32x4_t` (ARM).

Portable SIMD means I don't need to write this code twice and memorize arcane runes for basic arithmetic operations.

For more specialized stuff you have intrinsics.
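As a rough illustration of the comparison above, here is a sketch in C, assuming the GCC/Clang vector extensions as the "C equivalent" of `f32x4`. The typedef and the helper names `add_portable` and `add_native` are hypothetical. The portable version is a plain `a + b`, while the per-platform versions spell out the same addition with SSE or NEON intrinsics.

```c
#include <assert.h>

/* Assumed "C equivalent" of f32x4: a GCC/Clang vector of four floats. */
typedef float f32x4 __attribute__((vector_size(16)));

static_assert(sizeof(f32x4) == 4 * sizeof(float), "f32x4 must hold four floats");

/* Portable: the same source compiles on x86_64, ARM and other targets. */
static f32x4 add_portable(f32x4 a, f32x4 b)
{
    return a + b;
}

/* Platform-specific: the same addition written with intrinsics. */
#if defined(__SSE__)
#include <xmmintrin.h>
static void add_native(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}
#elif defined(__ARM_NEON)
#include <arm_neon.h>
static void add_native(const float *a, const float *b, float *out)
{
    float32x4_t va = vld1q_f32(a);
    float32x4_t vb = vld1q_f32(b);
    vst1q_f32(out, vaddq_f32(va, vb));
}
#else
/* Scalar fallback so the sketch still builds on other targets. */
static void add_native(const float *a, const float *b, float *out)
{
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
}
#endif
```

Both helpers compute the same four sums. The difference is that `add_portable` needs no `#if` ladder, and the intrinsics only have to appear where an operation has no portable spelling.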
