Hacker News

pjmlp, last Friday at 2:05 PM

Says those that don't know CUDA.

You can program CUDA in standard C++20, with CUDA libraries hiding the language extensions.

I love how C and C++ dialects count as C and C++ when it matters, and not when it doesn't help sell the ideas being portrayed.


Replies

suuuuuuuu, last Friday at 2:18 PM

Sorry, I wasn't aware of these developments (having abandoned CUDA for hardware-agnostic solutions before 2020). In any case, it doesn't change my point if it's specific to a single vendor.

I'm extremely dubious that such an opaque abstraction can actually solve the (true) problem. "Not having to write CUDA" is not enough - how do you tune performance? Parallelization strategies, memory prefetching and arrangement in on-chip caches, when to fuse kernels vs. not... I don't doubt the compiler can do these things, but I do doubt that it can know at compile time what variants of kernel transformations will optimize performance on any given hardware. That's the real problem: achieving an abstraction that still gives one enough control to achieve peak performance.

Edit: you tell me if I'm wrong, but it seems that std::par can't even use shared memory, let alone let one control its usage? If so, then my point stands: C++ is not remotely relevant. Again, avoiding writing CUDA (etc.) doesn't solve the real problem that high-performance language abstractions aim to address.

bjourne, last Friday at 10:44 PM

If CUDA is C++ then I'd like to know how you throw and catch exceptions in CUDA kernels.
