Hacker News

ashvardanian 04/24/2025

Agreed! I was looking through the summation example <https://github.com/tracel-ai/cubecl/blob/main/examples/sum_t...> and it seems like the primary focus is on the more traditional, pre-2018 style of GPU programming, without explicit warp-level operations, asynchrony, atomics, barriers, or the many tensor-core operations.
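To make the contrast concrete, here is a minimal plain-CUDA sketch (not CubeCL code) of the "post-2018" style being referred to: a grid-wide sum using warp shuffles and an atomic finish instead of a classic shared-memory tree reduction. The kernel name and launch shape are arbitrary.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void warp_sum(const float* in, float* out, int n) {
    float v = 0.0f;
    // Grid-stride loop so any grid size covers the whole input.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        v += in[i];
    }
    // Warp-level reduction: each lane adds a neighbor's value,
    // halving the distance each step (CUDA 9+ sync shuffles).
    for (int offset = 16; offset > 0; offset >>= 1) {
        v += __shfl_down_sync(0xffffffffu, v, offset);
    }
    // Lane 0 of each warp publishes its partial sum atomically.
    if ((threadIdx.x & 31) == 0) {
        atomicAdd(out, v);
    }
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    *out = 0.0f;
    warp_sum<<<256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("sum = %f (expected %d)\n", *out, n);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Note how the warp shuffles keep the partial sums in registers and need no shared memory or block-wide barriers; only the final per-warp atomic touches global memory.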

The project looks very nice, and it would be great to have more notes in the README on any excluded functionality, to better scope its applicability to more advanced GPGPU scenarios.


Replies

nathanielsimard 04/24/2025

We support warp operations, barriers for CUDA, atomics on most backends, and tensor core instructions as well. It's just not well documented in the README!
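For readers unfamiliar with what "tensor core instructions" look like underneath, here is a minimal plain-CUDA sketch (again, not CubeCL's API, which wraps these in its own abstractions) of one warp computing a single 16x16 tile of C = A * B via the WMMA intrinsics from mma.h:

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

__global__ void tile_mma(const half* a, const half* b, float* c) {
    // Fragments live in registers, distributed across the 32 warp lanes.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);      // accumulator tile starts at zero
    wmma::load_matrix_sync(a_frag, a, 16);  // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // tensor-core D = A*B + C
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}
```

Launched as tile_mma<<<1, 32>>>(a, b, c), this runs on a single warp; it needs an sm_70+ GPU, with half inputs accumulating into float.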

0x7cfe 04/24/2025

CubeCL is the computation backend for Burn (https://burn.dev/), an ML framework by the same team that handles all the tensor magic like autodiff, op fusion, and dynamic graphs.