CUDA isn't really used for new code. It's used for legacy codebases.
In the LLM world, you really only see CUDA underneath Triton and/or PyTorch, used by consumers who haven't moved on to greener pastures (mainly because they only know Python and aren't actually programmers).
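For concreteness, here's roughly what that Triton path looks like: the kernel is written in Python and Triton compiles it for the GPU, so the user never touches CUDA C++ directly. A minimal vector-add sketch (kernel name, `BLOCK_SIZE`, and sizes are illustrative; assumes an Nvidia or ROCm-supported AMD GPU):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard against out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)  # one program per 1024-element block
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```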
That said, AMD can run most CUDA code through ROCm's HIP layer, and AMD officially supports Triton and PyTorch, so even the academics have a way out of Nvidia hell.
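A hedged sketch of that escape hatch: ROCm builds of PyTorch keep the `torch.cuda` API surface, so the same script runs on an AMD card unchanged (assuming a ROCm-supported GPU and a ROCm build of PyTorch):

```python
import torch

# On ROCm builds of PyTorch, torch.cuda.is_available() is True on AMD GPUs
# too; the "cuda" device name transparently maps to the HIP backend.
if torch.cuda.is_available():
    # torch.version.hip is a version string on ROCm builds, None on CUDA builds.
    backend = "ROCm/HIP" if torch.version.hip else "CUDA"
    print(f"{backend} backend, device: {torch.cuda.get_device_name(0)}")
    x = torch.rand(1024, device="cuda")  # lands on the AMD GPU under ROCm
    print((x * 2).sum().item())
else:
    print("No supported GPU found.")
```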
If you're not doing machine code by hand, you're not a programmer
> CUDA isn't really used for new code.
I don't think this is particularly correct, or at least it's worded a bit too strongly.
For Nvidia hardware, CUDA just gives the best performance, and there are many optimized libraries (cuBLAS, cuDNN, NCCL, and the like) that you'd have to replace as well.
Granted, newer ML frameworks tend to be more backend-agnostic, but saying that CUDA is no longer used for new code seems a bit odd.
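To be fair to the parent, that backend-agnostic pattern is real; newer code tends to look something like this sketch (`pick_device` is just an illustrative helper, not a framework API):

```python
import torch

def pick_device() -> torch.device:
    """Illustrative helper: use whatever accelerator is present."""
    if torch.cuda.is_available():          # Nvidia CUDA, or AMD via ROCm
        return torch.device("cuda")
    if torch.backends.mps.is_available():  # Apple Silicon
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
x = torch.rand(8, 16, device=device)
print(f"running on {device}: mean={x.mean().item():.4f}")
```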