> CUDA solved that problem.
CUDA is a proprietary Nvidia product. It solved the problem for Nvidia chips only.
On AMD GPUs, you use ROCm. On Intel, OpenVINO. On Apple silicon, MLX. All of them handle the common AI tasks you'd want to run on self-hosted hardware. CUDA was there first, so it has a more mature ecosystem, but so far I've found zero models or tasks I haven't been able to run with ROCm. llama.cpp works fine. ComfyUI works fine. The Transformers library works fine. LM Studio works fine.
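As a concrete sketch, this is roughly how a ROCm build of llama.cpp looks in practice. The flag names are from recent llama.cpp versions and the `/opt/rocm` paths are the default install location; both may differ on your system, so treat this as a sketch and check the llama.cpp build docs rather than copying it blindly:

```shell
# Build llama.cpp with the ROCm/HIP backend instead of CUDA.
# Assumes ROCm is installed under /opt/rocm (the default).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
HIPCXX=/opt/rocm/llvm/bin/clang++ cmake -B build -DGGML_HIP=ON
cmake --build build --config Release -j

# The resulting binaries are used exactly like a CUDA build;
# -ngl offloads layers to the GPU as usual:
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```

Swapping `-DGGML_HIP=ON` for `-DGGML_VULKAN=ON` gives you the Vulkan backend instead, with no changes to how you run the binaries afterward; that's the whole point of the backend abstraction.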
Unless you believe an Nvidia monopoly on AI inference and training is good for the world, you can't object to the other GPU makers having a way for their chips to be used for those purposes. CUDA is a proprietary, vendor-specific solution.
Edit: But also, Vulkan works fine on the Strix Halo. It is reliable and usually not much slower than ROCm (and occasionally faster, somehow). Here are some benchmarks: https://kyuz0.github.io/amd-strix-halo-toolboxes/
The problem with ROCm, unlike CUDA, is that it doesn't run on much of AMD's own hardware, most notably their iGPUs.
Why? What is the point of focusing on something that seems to be a memory management solution when the memory management problem theoretically just went away?
That has been one of the big themes in GPU hardware since around 2010, when AMD went all-in on its ATI acquisition. Nvidia tried to solve the memory management problem in the software layer; AMD committed to doing it in hardware. Software has been the better bet by around a trillion dollars so far, but if the hardware solution has finally come to fruition, then why the focus on ROCm?