logoalt Hacker News

camel-cdrlast Saturday at 10:16 AM2 repliesview on HN

My personal gripe with Rust's std::simd in its current form is that it makes writing portable SIMD hard while making non-portable SIMD easy. [0]

> the implementation of sin4f and sin8f are line by line equal, with the exception of types. You can work around this with templates/generics

This is true, I think most SIMD algorithms can be written in such a vector length-agnostic way, however almost all code using std::simd specifies a specific lane count instead of using the native vector length. This is because the API favors the use of fixed-size types (e.g. f32x4), which are exclusively used in all documentation and example code.

If I search github for `f32x4 language:Rust` I get 6.4k results, with `"Simd<f32," language:Rust NOT "Simd<f32, 2" NOT "Simd<f32, 4" NOT "Simd<f32, 8"` I get 209.

I'm not even aware of a way to detect the native vector length using std::simd. You have to use the target-feature or multiversion crate, as shown as the last part of the rust-simd-book [1]. Well, kind of like that, because their suggestion using "suggested_vector_width", which doesn't exist. I could only find a suggested_simd_width.

Searching for "suggested_simd_width language:Rust", we are now down to 8 results, 3 of which are from the target-feature/multiversion crates.

---

What I'm trying to say is that, while being able to specify a fixed SIMD width can be useful, the encouraged default should be "give me a SIMD vector of the specified type corresponding to the SIMD register size". If your problem can only be solved with a specific vector length, great, then hard-code the lane count, but otherwise don't.

See [0] for more examples of this.

[0] https://github.com/rust-lang/portable-simd/issues/364#issuec...

[1] https://calebzulawski.github.io/rust-simd-book/4.2-native-ve...


Replies

exDM69last Saturday at 10:46 AM

The binary portability issue is not specific to Rust or std::simd. You would have to solve the same problems even if you use intrinsics or C++. If you use 512 but vectors in the code, you will need to check if the CPU supports it or add multiversioning dispatch or you will get a SIGILL.

I have written both type generic (f32 vs f64) and width generic (f32x4 vs f32x8) SIMD code with Rust std::simd.

And I agree it's not very pretty. I had to resort to having a giant where clause for the generic functions, explicitly enumerating the required std::ops traits. C++ templates don't have this particular issue, and I've used those for the same purpose too.

But even though the implementation of the generic functions is quite ugly indeed, using the functions once implemented is not ugly at all. It's just the "primitive" code that is hairy.

I think this was a huge missed opportunity in the core language, there should've been a core SIMD type with special type checking rules (when unifying) for this.

However, I still think std::simd is miles better than intrinsics for 98% of the SIMD code I write.

The other 1% (times two for two instruction sets) is just as bad as it is in any other language with intrinsics.

The native vector width and target-feature multiversioning dispatch are quite hairy. Adding some dynamic dispatch in the middle of your hot loops can also have disastrous performance implications because they tend to kill other optimizations and make the cpu do indirect jumps.

Have you tried just using the widest possible vector size? e.g. f64x64 or something like it. The compiler can split these to the native vector width of the compiler target. This happens at compile time so it is not suitable if you want to run your code on CPUs with different native SIMD widths. I don't have this problem with the hardware I am targeting.

Rust std::simd docs aren't great and there have been some breaking changes in the few years I've used it. There is certainly more work on that front. But it would be great if at least the basic stuff would get stabilized soon.

show 2 replies
MangoToupelast Saturday at 11:05 AM

> portable SIMD

this seems like an oxymoron

show 1 reply