arpadav • today at 6:24 PM

the main 4 i see are:

1. use-after-free: buffers get freed by drop semantics instead of a manual cudaFree

2. kernel args are enforced by `cuda_launch!`, whereas in C++ the `void**` args is just an array of pointers, validating count only

3. aliased mutable writes. e.g. in C++ you can have more than one thread writing out[i] with the same i and it will compile. but DisjointSlice<T> is indexed with a ThreadIndex, which doesn't have any public constructor (see: https://github.com/NVlabs/cuda-oxide/blob/2a03dfd9d5f3ecba52...), so you can only get one through the `index_1d`, `index_2d` and `index_2d_runtime` API

4. i'm pretty sure you can cudaMemcpy a std::string, or literally any other non-POD type, and "corrupt" its state, making it unusable. here kernels ONLY accept DisjointSlice<T>, scalars, and closures (https://nvlabs.github.io/cuda-oxide/gpu-programming/memory-a...)
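
rough sketch of points 1 and 3 in plain rust, CPU only, no GPU involved. this is NOT the cuda-oxide implementation: `DeviceBuffer`, the `freed` flag, `get_mut` and the `index_1d` signature below are all made up for illustration, and only the type and function names `DisjointSlice`, `ThreadIndex` and `index_1d` are borrowed from the docs. it just shows the two language mechanisms the guarantees seem to lean on: `Drop` for freeing, and a private field so nobody outside the module can forge an index.

```rust
mod sketch {
    use std::cell::Cell;
    use std::rc::Rc;

    // point 1: hypothetical stand-in for a device allocation.
    // `freed` is only here so the drop is observable from outside.
    pub struct DeviceBuffer {
        pub data: Vec<f32>,
        freed: Rc<Cell<bool>>,
    }

    impl DeviceBuffer {
        pub fn new(len: usize, freed: Rc<Cell<bool>>) -> Self {
            DeviceBuffer { data: vec![0.0; len], freed }
        }
    }

    impl Drop for DeviceBuffer {
        // runs exactly once when the buffer goes out of scope; a real
        // wrapper would release the device memory here.
        fn drop(&mut self) {
            self.freed.set(true);
        }
    }

    // point 3: the inner field is private, so code outside this module
    // cannot write `ThreadIndex(3)` and pick an arbitrary index.
    #[derive(Clone, Copy)]
    pub struct ThreadIndex(usize);

    impl ThreadIndex {
        pub fn get(self) -> usize {
            self.0
        }
    }

    // assumed signature: the only way to obtain a ThreadIndex in this
    // sketch, derived from the block/thread coordinates.
    pub fn index_1d(block_idx: usize, block_dim: usize, thread_idx: usize) -> ThreadIndex {
        ThreadIndex(block_idx * block_dim + thread_idx)
    }

    // hypothetical slice wrapper that only accepts a ThreadIndex.
    pub struct DisjointSlice<'a, T> {
        data: &'a mut [T],
    }

    impl<'a, T> DisjointSlice<'a, T> {
        pub fn new(data: &'a mut [T]) -> Self {
            DisjointSlice { data }
        }

        // bounds-checked access; returns None when the index is out of range.
        pub fn get_mut(&mut self, idx: ThreadIndex) -> Option<&mut T> {
            self.data.get_mut(idx.0)
        }
    }
}
```

outside `mod sketch`, writing `sketch::ThreadIndex(3)` is rejected by the compiler because the field is private, so every index has to come from `index_1d`. and once a `DeviceBuffer` goes out of scope the borrow checker rejects any later use of it, which is the use-after-free half of point 1.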

but most of the nitty gritty is in these sections

* https://nvlabs.github.io/cuda-oxide/gpu-safety/the-safety-mo...

* https://nvlabs.github.io/cuda-oxide/gpu-programming/memory-a...

edit: that being said, it's not like this catches everything, it just looks to give many more guardrails against UB than raw .cu files
