Hacker News

Caum | today at 3:10 PM | 2 replies

Been running local LLMs on my 7900 XTX for months and the ROCm experience has been... rough. The fact that AMD is backing an official inference server that handles the driver/dependency maze is huge. My biggest question is NPU support - has anyone actually gotten meaningful throughput from the Ryzen AI NPU vs just using the dGPU? In my testing the NPU was mostly a bottleneck for anything beyond tiny models.
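If it helps anyone compare numbers: this is roughly how I measure throughput (a minimal sketch using llama-cpp-python; the model path and prompt are placeholders, and which backend actually runs depends on how your build was compiled):

    import time
    from llama_cpp import Llama

    # Placeholder model path; n_gpu_layers=-1 offloads all layers to the GPU
    llm = Llama(model_path="model.gguf", n_gpu_layers=-1, verbose=False)

    start = time.perf_counter()
    out = llm("Explain KV caching in one paragraph.", max_tokens=256)
    elapsed = time.perf_counter() - start

    n = out["usage"]["completion_tokens"]
    print(f"{n} tokens in {elapsed:.1f}s ({n / elapsed:.1f} tok/s)")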


Replies

lrvick | today at 5:35 PM

I have had way better perf with Vulkan than ROCm on kernel 7.0.0. They made some major improvements: 20%+ speedups for me.
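For anyone wanting to reproduce the comparison: the backend is baked in at build time, so I keep separate Vulkan and ROCm builds of llama.cpp and run llama-bench from each (rough sketch; the build paths and model file are placeholders for wherever your own builds live):

    import subprocess

    # Placeholder paths: one llama.cpp build compiled with Vulkan,
    # one with ROCm/HIP. Point these at your own build directories.
    builds = {
        "vulkan": "./build-vulkan/bin/llama-bench",
        "rocm": "./build-rocm/bin/llama-bench",
    }

    for name, bench in builds.items():
        print(f"=== {name} ===")
        # llama-bench reports prompt processing and generation tok/s
        subprocess.run([bench, "-m", "model.gguf", "-p", "512", "-n", "128"], check=True)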

cl0ckt0wer | today at 3:11 PM

The NPU is more for power efficiency when on battery. I don't think it's a replacement for the GPU.