I think you're conflating GPU 'threads' and 'warps'. GPU 'threads...

zozbot234 • yesterday at 7:51 PM

I think you're conflating GPU 'threads' and 'warps'. GPU 'threads' are SIMD lanes that all run the exact same instructions and control flow (differing only in filtering/predication), whereas GPU warps are hardware-level threads that run on a single compute unit. There's no issue with adding extra "don't run this code" paths at the warp level, unlike with GPU threads.

Replies

textlapse • yesterday at 8:38 PM

My understanding of warps (https://docs.nvidia.com/cuda/cuda-programming-guide/01-intro...) is that you are essentially paying the cost of taking both branches.
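To make the "cost of both branches" point concrete, here is a toy Python model of one warp. It is only a sketch, not real GPU code: `run_warp`, `double`, and `negate` are hypothetical names made up for illustration, and the model ignores everything about real hardware scheduling. It assumes a 32-lane warp sharing one instruction stream, with a per-lane mask deciding which lanes commit results.

```python
WARP_SIZE = 32  # lanes per warp on NVIDIA hardware


def run_warp(values, predicate, then_fn, else_fn):
    """Toy SIMT model: one instruction stream, one active mask per branch.

    Returns (results, passes), where passes counts how many branch
    bodies the warp had to issue.
    """
    mask = [predicate(v) for v in values]
    results = list(values)
    passes = 0

    if any(mask):  # at least one lane takes the 'then' side
        passes += 1
        for lane, v in enumerate(values):
            if mask[lane]:  # masked-off lanes sit idle during this pass
                results[lane] = then_fn(v)

    if not all(mask):  # at least one lane takes the 'else' side
        passes += 1
        for lane, v in enumerate(values):
            if not mask[lane]:
                results[lane] = else_fn(v)

    return results, passes


def double(v):
    return v * 2


def negate(v):
    return -v


lanes = list(range(WARP_SIZE))

# Divergent: even and odd lanes disagree, so both bodies are issued.
divergent_out, divergent_passes = run_warp(lanes, lambda v: v % 2 == 0, double, negate)

# Uniform: every lane agrees, so only one body is issued.
uniform_out, uniform_passes = run_warp(lanes, lambda v: v >= 0, double, negate)

print(divergent_passes, uniform_passes)
```

In this model the double cost shows up only when lanes inside the same warp disagree on the condition. When the whole warp agrees, a single body is issued, which is why branching at warp granularity does not carry the same penalty.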

I understand that newer GPUs have clever partitioning/pipelining, so that block A takes branch A while block B takes branch B, with sync/barriers essentially relying on some smart 'oracle' to schedule these in a way that still fits the SIMT model.

It still doesn't feel Turing complete to me. Is there an NVIDIA doc you can point me to?
