What?!?
NVidia designs CUDA hardware specifically for the C++ memory model. They went to the trouble of reworking their original hardware over several years so that all new cards would follow this model, even though PTX was designed as a polyglot target.
Additionally, ISO C++ papers like senders/receivers are driven by NVidia employees working on CUDA.
CUDA is not C++. CUDA for GPU kernels is its own language. That's the actual problem requiring new languages or abstractions.