Hacker News

nijave today at 12:31 PM

Has anyone compared this to Ollama? I had good success with the latest Ollama and ROCm 7.4 on a 9070 XT a few days ago.


Replies

martin-adams today at 3:52 PM

I just compared this on my MacBook M1 Max with 64GB RAM, using the following:

Model: qwen3.59b
Prompt: "Hey, tell me a story about going to space"

Ollama completed in about 1:44
Lemonade completed in about 1:14

So it seems faster in this very limited test.
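A comparison like the one above can be scripted so both servers get the identical prompt. This is a rough sketch, assuming both Ollama and Lemonade expose an OpenAI-compatible chat endpoint; Ollama's default port (11434) is standard, but the Lemonade base URL and the model name here are placeholders you'd need to adjust for your setup.

```python
import json
import time
import urllib.request

def mmss(seconds: float) -> str:
    """Format a duration in seconds as M:SS, e.g. 104 -> '1:44'."""
    m, s = divmod(int(round(seconds)), 60)
    return f"{m}:{s:02d}"

def time_completion(base_url: str, model: str, prompt: str) -> float:
    """POST one chat completion to an OpenAI-compatible endpoint and
    return the wall-clock time in seconds."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        resp.read()  # wait for the full (non-streamed) response
    return time.monotonic() - start

def benchmark():
    # Base URLs and model name are assumptions; substitute your own.
    prompt = "Hey, tell me a story about going to space"
    servers = [
        ("Ollama", "http://localhost:11434"),
        ("Lemonade", "http://localhost:8000"),  # placeholder port
    ]
    for name, url in servers:
        elapsed = time_completion(url, "your-model-name", prompt)
        print(f"{name} completed in about {mmss(elapsed)}")
```

With both servers running, calling `benchmark()` prints one timing per server. A single run is still a very limited test; averaging several runs and discarding the first (which includes model load time) would give a fairer picture.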

nezhar today at 6:00 PM

I'm also curious about this one; I'd also like to compare it to vLLM.

RealFloridaMan today at 2:44 PM

It is optimized for compatibility across different APIs and ships hardware-specific builds for AMD GPUs and NPUs. It's run by AMD.

Under the hood they are both running llama.cpp, but this one has specific builds for different GPUs. Not sure if the 9070 is one of them; I am running it on 370 and 395 APUs.

iugtmkbdfil834 today at 12:35 PM

Seconded. I'm currently on Ollama for local inference, but I am curious how it compares.

metalliqaz today at 2:48 PM

Better than Vulkan?
