Hacker News

warwickmcintosh · yesterday at 10:42 PM

ROCm has improved, but the reality is you're still fighting the driver stack more than the models. If you're actually doing local inference on AMD, you're spending your time on CUDA compatibility layers, not the AI part. "Two lines of Python" is marketing; the gap between a demo and a working AMD setup is still real.


Replies

ddtaylor · yesterday at 11:18 PM

Ollama works very well on Linux with my AMD hardware. I have a 6800 XT, which isn't even officially supported by the ROCm stack in some ways, and it "just works" for a ton of very nice models, especially if I seek out quantized versions of them.
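For what it's worth, the setup described above can come down to a couple of shell commands. This is a sketch, not a verified recipe: the `HSA_OVERRIDE_GFX_VERSION` value is the commonly cited workaround for RDNA2 cards the ROCm runtime doesn't recognize, and the model tag is just an example of a quantized build — adjust both for your card and VRAM.

```shell
# Commonly cited workaround: tell the ROCm runtime to treat the GPU as a
# supported gfx target. 10.3.0 corresponds to the RDNA2/gfx1030 family
# that the RX 6800 XT belongs to.
export HSA_OVERRIDE_GFX_VERSION=10.3.0

# Start the Ollama server with the override in its environment,
# then pull and run an example quantized (4-bit) model.
ollama serve &
ollama run llama3.1:8b-instruct-q4_K_M
```

If Ollama falls back to CPU, checking the server logs for which ROCm/gfx target it detected is usually the fastest way to see whether the override took effect.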