Hacker News

ashvardanian 04/24/2025

Agreed! I was looking through the summation example <https://github.com/tracel-ai/cubecl/blob/main/examples/sum_t...> and it seems like the primary focus is on the more traditional, pre-2018 style of GPU programming, without explicit warp-level operations, asynchrony, atomics, barriers, or the many tensor-core operations.
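To make the contrast concrete, here is a minimal plain-CUDA sketch (not CubeCL code) of the "post-2018" style being referred to: a grid-wide sum using warp shuffles and an atomic finish instead of a classic shared-memory tree reduction. The kernel name and launch shape are arbitrary.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void warp_sum(const float* in, float* out, int n) {
    float v = 0.0f;
    // Grid-stride loop so any grid size covers the whole input.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        v += in[i];
    }
    // Warp-level reduction: each lane adds a neighbor's value,
    // halving the distance each step (CUDA 9+ sync shuffles).
    for (int offset = 16; offset > 0; offset >>= 1) {
        v += __shfl_down_sync(0xffffffffu, v, offset);
    }
    // Lane 0 of each warp publishes its partial sum atomically.
    if ((threadIdx.x & 31) == 0) {
        atomicAdd(out, v);
    }
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    *out = 0.0f;
    warp_sum<<<256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("sum = %f (expected %d)\n", *out, n);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Note how the warp shuffles keep the partial sums in registers and need no shared memory or block-wide barriers; only the final per-warp atomic touches global memory.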

The project looks very nice, and it would be great to have more notes in the README on any excluded functionality, to better scope its applicability to more advanced GPGPU scenarios.


Replies

nathanielsimard 04/24/2025

We support warp operations, barriers for CUDA, atomics on most backends, and tensor core instructions as well. It's just not well documented in the README!
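For readers unfamiliar with what "tensor core instructions" look like underneath, here is a minimal plain-CUDA sketch (again, not CubeCL's API, which wraps these in its own abstractions) of one warp computing a single 16x16 tile of C = A * B via the WMMA intrinsics from mma.h:

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

__global__ void tile_mma(const half* a, const half* b, float* c) {
    // Fragments live in registers, distributed across the 32 warp lanes.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    wmma::fill_fragment(c_frag, 0.0f);      // accumulator tile starts at zero
    wmma::load_matrix_sync(a_frag, a, 16);  // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);  // tensor-core D = A*B + C
    wmma::store_matrix_sync(c, c_frag, 16, wmma::mem_row_major);
}
```

Launched as tile_mma<<<1, 32>>>(a, b, c), this runs on a single warp; it needs an sm_70+ GPU, with half inputs accumulating into float.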

0x7cfe 04/24/2025

CubeCL is the computation backend for Burn (https://burn.dev/), an ML framework by the same team that handles all the tensor magic like autodiff, op fusion, and dynamic graphs.