Hacker News

sleepybrett, yesterday at 7:32 PM

or you can just load up ollama, have it load a local model and point claude or opencode at it...

is this article old? It's not. I'm not sure why he went through all the bother of llama.cpp


Replies

malkosta, yesterday at 7:35 PM

That was exactly my question too. Then I finished reading the post. The reason is pretty clear and stated in the post: it is faster than ollama+mlx.
