dzaima • last Saturday at 2:19 PM

I think LLVM should be able to handle vectors of anything under 65536 bits? At least I saw an LLVM issue noting that that's where some things break down, so a 32-kbit vector would be f64x512. But I meant it as an exaggeration as far as actually using it goes: f64x64 is hilariously overkill even on AVX-512, and utterly awful on pre-AVX-512 hardware.
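
To make the arithmetic above concrete, here is a minimal sketch (not from the thread; `lane_count` is a hypothetical helper) that divides a vector's total width in bits by the element width:

```rust
/// Number of lanes in a vector that is `vector_bits` wide in total,
/// with elements that are `elem_bits` wide each.
/// Hypothetical helper for illustration; not part of std::simd or LLVM.
fn lane_count(vector_bits: u32, elem_bits: u32) -> u32 {
    // The total width must be a whole multiple of the element width.
    assert!(elem_bits > 0 && vector_bits % elem_bits == 0);
    vector_bits / elem_bits
}
```

With these inputs, `lane_count(32 * 1024, 64)` gives 512, matching f64x512, and `lane_count(512, 64)` gives 8, the number of f64 lanes in a single AVX-512 register.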

Replies

exDM69 • last Saturday at 3:00 PM

It's actually Rust that is setting the 64-lane limit (see SupportedLaneCount), not LLVM.

I agree, f64x64 is probably a very bad idea.

But something like f32x8 would probably still be "fast enough" on old/mobile CPUs without 256-bit-wide vectors (but with a good 128-bit SIMD ALU).

I did something like this when a u16x16 bitmask fit the problem domain. Most of my target CPUs have 256-bit-wide registers, but in mobile ARM land they don't. This wasn't particularly performance-sensitive code, so I just used 256-bit-wide vectors anyway. It wasn't worth trying to optimize for the old CPUs separately.
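
As a rough illustration of why a 256-bit vector type is tolerable on 128-bit hardware, the sketch below counts how many native registers one vector occupies. `registers_needed` is a hypothetical helper, not part of std::simd, and the count ignores everything else that affects performance (register pressure, shuffles, spills):

```rust
/// Number of native SIMD registers of `register_bits` each that are
/// needed to hold one vector of `lanes` elements, `elem_bits` wide each.
/// Hypothetical helper for illustration only.
fn registers_needed(lanes: u32, elem_bits: u32, register_bits: u32) -> u32 {
    assert!(register_bits > 0);
    let total_bits = lanes * elem_bits;
    // Round up: a partially filled register still occupies a whole register.
    (total_bits + register_bits - 1) / register_bits
}
```

Under these assumptions, f32x8 and u16x16 each come out at 2 registers on a 128-bit target (`registers_needed(8, 32, 128)` and `registers_needed(16, 16, 128)`), and u16x16 fits in 1 register on a 256-bit target. By contrast, f64x64 comes out at 8 registers at 512 bits and 32 registers at 128 bits.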
