I believe these are SIMD. Tensor cores require the MMA family of instructions. Ask me how I know. :)
https://github.com/m4rs-mt/ILGPU/compare/master...lostmsu:IL...
Good article: https://alexarmbr.github.io/2024/08/10/How-To-Write-A-Fast-M...
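
For contrast with plain SIMD code, here is a minimal sketch of going through the MMA path from CUDA C++ using the `nvcuda::wmma` API in `<mma.h>`. It is not taken from either link above; the kernel name `wmma_tile`, the single 16x16x16 tile, and the all-ones inputs are my own assumptions for illustration. As far as I know `wmma::mma_sync` is what lowers to the MMA-family instructions, and it needs compute capability 7.0 or newer.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>

using namespace nvcuda;

// Hypothetical example kernel: one warp computes one 16x16 tile, C = A * B.
// A is row-major, B is column-major, both half precision; C is float.
__global__ void wmma_tile(const half *a, const half *b, float *c) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);               // start the accumulator at zero
    wmma::load_matrix_sync(a_frag, a, 16);           // leading dimension = 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // the warp-wide matrix multiply-accumulate
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}

int main() {
    half *a, *b;
    float *c;
    cudaMallocManaged(&a, 16 * 16 * sizeof(half));
    cudaMallocManaged(&b, 16 * 16 * sizeof(half));
    cudaMallocManaged(&c, 16 * 16 * sizeof(float));

    // Assumed test data: all ones, so every entry of C is a sum of 16 ones.
    for (int i = 0; i < 16 * 16; ++i) {
        a[i] = __float2half(1.0f);
        b[i] = __float2half(1.0f);
        c[i] = 0.0f;
    }

    wmma_tile<<<1, 32>>>(a, b, c);  // WMMA operations are warp-wide: launch one full warp
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("c[0] = %f\n", c[0]);

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

Build with something like `nvcc -arch=sm_70 wmma_tile.cu`. The point is that every `wmma::` call is a collective operation over the whole warp on opaque fragments, which is a different programming model from per-lane SIMD arithmetic. Writing PTX `mma.sync` directly is the other way to reach the same hardware.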