Hacker News

Myrmornis today at 4:47 AM

Can anyone give any tips for getting something that runs fairly fast under ollama? It doesn't have to be very intelligent.

When I tried gpt-oss and Qwen under ollama on an M2 Mac, the main problem was that they were extremely slow. But I do have a need for a free local model.


Replies

parthsareen today at 5:14 AM

How much RAM are you running with? Qwen3 and gpt-oss:20b punch a good bit above their weight. I personally use them for small agents.
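On Apple Silicon, speed mostly comes down to whether the whole model fits in unified memory, so a smaller quantized tag is often the quickest fix. A minimal sketch with the ollama CLI (the `qwen3:4b` tag is an assumption; substitute any small model from the library), using `--verbose` to see the measured generation speed:

```shell
# Pull a small model that fits comfortably in unified memory.
# (qwen3:4b is an assumed tag; check `ollama list` or the model library.)
ollama pull qwen3:4b

# --verbose makes ollama print timing stats after each response,
# including the "eval rate" in tokens/s.
ollama run qwen3:4b --verbose
```

If the eval rate is still low, the model is likely spilling out of memory; dropping to an even smaller tag or a lower-bit quantization is the usual next step.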

am17an today at 4:56 AM

Use llama.cpp? I get 250 tok/s on gpt-oss using a 4090; not sure about Mac speeds.
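Running llama.cpp directly on a Mac is straightforward since Metal support is enabled by default. A rough sketch (the `model.gguf` path is a placeholder for whatever GGUF file you download; `-ngl 99` offloads all layers to the GPU):

```shell
# Build llama.cpp (Metal is on by default for macOS builds).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run a GGUF model with all layers offloaded to the GPU via -ngl.
# The timing summary printed at the end reports tokens per second.
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello" -n 128
```

This skips ollama's layer entirely, which can make it easier to see where time is going, since llama.cpp prints separate prompt-processing and generation rates.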