Hacker News

redrove today at 10:25 AM

There is virtually no reason to use Ollama over LM Studio or the myriad of other alternatives.

Ollama is slower, and they started out as a shameless llama.cpp ripoff without giving credit. Now they've "ported" it to Go, which means they're just vibe-code-translating llama.cpp, bugs included.


Replies

alifeinbinary today at 10:58 AM

I really like LM Studio when I can use it under Windows, but for people like me with Intel Macs + AMD GPUs, Ollama is the only option, because it can leverage the GPU through MoltenVK (i.e. Vulkan), unofficially. We're still testing it and hoping to get the Vulkan support into the main branch soon. It works perfectly for single GPUs, but some edge cases when using multiple GPUs are unsupported until upstream support from MoltenVK comes through. But yeah, I agree, it wasn't cool to repackage Georgi's work like that.

gen6acd60af today at 11:32 AM

LM Studio is closed source.

And didn't Ollama independently ship a vision pipeline for some multimodal models months before llama.cpp supported it?

jrm4 today at 2:57 PM

Do y'all mean the backend, or the Ollama frontend, or both? I find it trivially easy to sub my local Ollama API into virtually all of the interesting frontend things. I'm quite curious about the "why not Ollama" here.
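
For context, "subbing in" Ollama usually just means pointing a frontend's OpenAI-style client at the local Ollama endpoint. A minimal sketch, assuming the `openai` Python package, Ollama running on its default port, and some model you've already pulled (`llama3` below is just a placeholder):

```python
# Minimal sketch: point an OpenAI-style client at a local Ollama server.
# Assumes Ollama is running on its default port (11434) and that a model
# has already been pulled -- "llama3" here is only a placeholder.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

resp = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```

Most frontends that speak the OpenAI API can be wired up the same way by setting their base URL to `http://localhost:11434/v1`.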

faitswulff today at 11:56 AM

Does LM Studio have an equivalent to the ollama launch command? i.e. `ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4`

meltyness today at 11:30 AM

I feel like the READMEs for these three large, popular packages already illustrate the tradeoffs better than a Hacker News argument.

iLoveOncall today at 10:35 AM

> There is virtually no reason to use Ollama over LM Studio or the myriad of other alternatives.

Hmm, the fact that Ollama is open-source, can run in Docker, etc.?

lousken today at 11:13 AM

LM Studio is not open source, and you can't use it on a server and connect clients to it?

logicallee today at 1:40 PM

>Ollama is slower

I've benchmarked this on an actual Mac Mini M4 with 24 GB of RAM and averaged 24.4 t/s on Ollama versus 19.45 t/s on LM Studio for the same ~10 GB model (gemma4:e4b), a difference that held across three runs with both models warmed up beforehand. Unless there is an error in my methodology, which is easy to repeat[1], that makes Ollama a full 25% faster. That's an enormous difference. Try it for yourself before making such claims.

[1] Script at: https://pastebin.com/EwcRqLUm. It warms up both and keeps them in memory, so you'll want to close almost all other applications first. Install both Ollama and LM Studio, download the models, and change the path to where you installed the model. Interestingly, I had to go through three different AIs to write this script: ChatGPT (where I'm a Pro subscriber) thought about doing so and then returned nothing (shenanigans since I was benchmarking a competitor?), I had run out of my weekly session limit on Pro Max 20x credits on Claude (wonder why I need a local coding agent!), and then Google rose to the challenge and wrote the benchmark for me. I didn't try writing a benchmark like this locally; I'll try that next and report back.
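
The sketch below is not the pastebin script, only an illustration of one way to measure the Ollama side of such a comparison: a non-streaming call to `/api/generate` returns `eval_count` and `eval_duration`, from which tokens/sec falls out directly. The model tag and prompt are placeholders, not the exact setup used above.

```python
# Rough sketch of measuring generation speed against a local Ollama server.
# Not the pastebin script referenced above -- just an illustration of the idea.
# Assumes Ollama is running on the default port and the model tag below (a
# placeholder) has already been pulled; one warm-up call is made so the model
# is resident in memory before timing.
import json
import urllib.request

URL = "http://localhost:11434/api/generate"
PAYLOAD = {
    "model": "gemma4:e4b",  # placeholder model tag
    "prompt": "Explain what a Mac Mini is in three sentences.",
    "stream": False,
}

def generate():
    req = urllib.request.Request(
        URL,
        data=json.dumps(PAYLOAD).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

generate()  # warm-up: loads the model into memory

speeds = []
for _ in range(3):
    out = generate()
    # eval_count = generated tokens, eval_duration = generation time in nanoseconds
    speeds.append(out["eval_count"] / (out["eval_duration"] / 1e9))

print(f"mean tokens/sec over 3 runs: {sum(speeds) / len(speeds):.2f}")
```

The LM Studio half would presumably go through its local OpenAI-compatible server and rely on wall-clock timing instead, since that API doesn't report per-request generation durations the same way.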
