Hacker News

warwickmcintosh · yesterday at 10:42 PM

ROCm has improved, but the reality is you're still fighting the driver stack more than the models. If you're actually doing local inference on AMD, you're spending your time on CUDA compatibility layers, not the AI part. "Two lines of Python" is marketing; the gap between a demo and a working AMD setup is still real.


Replies

ddtaylor · yesterday at 11:18 PM

Ollama works very well on Linux with my AMD hardware. I have a 6800 XT, which isn't even officially supported by the ROCm stack in some ways, and it "just works" for a ton of very nice models, especially if I seek out quantized versions of them.
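For what it's worth, the setup described above can come down to a couple of shell commands. This is a sketch, not a verified recipe: the `HSA_OVERRIDE_GFX_VERSION` value is the commonly cited workaround for RDNA2 cards the ROCm runtime doesn't recognize, and the model tag is just an example of a quantized build — adjust both for your card and VRAM.

```shell
# Commonly cited workaround: tell the ROCm runtime to treat the GPU as a
# supported gfx target. 10.3.0 corresponds to the RDNA2/gfx1030 family
# that the RX 6800 XT belongs to.
export HSA_OVERRIDE_GFX_VERSION=10.3.0

# Start the Ollama server with the override in its environment,
# then pull and run an example quantized (4-bit) model.
ollama serve &
ollama run llama3.1:8b-instruct-q4_K_M
```

If Ollama falls back to CPU, checking the server logs for which ROCm/gfx target it detected is usually the fastest way to see whether the override took effect.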