Hacker News

metalliqaz · yesterday at 3:38 PM

Perhaps I should just google it, but I'm under the impression that ollama uses llama.cpp internally, not the other way around.

Thanks for that data point. I should experiment with ROCm.


Replies

cpburns2009 · yesterday at 4:15 PM

I meant ollama uses llama.cpp internally. Sorry for the confusion.

naasking · yesterday at 5:36 PM

From what I understand, the ROCm 7.x series is buggier and has performance regressions on many GPUs. Vulkan performance for LLMs is apparently not far behind ROCm, and at this time it is far more stable and predictable.
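For anyone who wants to compare the two backends directly, llama.cpp can be built against either with a CMake flag. A minimal sketch, assuming a recent llama.cpp checkout (the flag names are from llama.cpp's build docs and may change between versions; the model path is a placeholder):

```shell
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Vulkan backend: works on most GPUs with a Vulkan driver installed.
cmake -B build-vulkan -DGGML_VULKAN=ON
cmake --build build-vulkan --config Release

# ROCm/HIP backend: requires a ROCm install and a supported AMD GPU.
cmake -B build-rocm -DGGML_HIP=ON
cmake --build build-rocm --config Release

# Run the same model through each build (-ngl 99 offloads all layers
# to the GPU) and compare tokens/sec from the timing output.
./build-vulkan/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"
./build-rocm/bin/llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello"
```

`llama-bench` in the same build directory gives a more controlled prompt-processing vs. generation comparison than eyeballing `llama-cli` timings.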