I prefer Ollama over the suggested alternatives.
I will switch once we have good user experience on simple features.
A new model is released on HF or the Ollama registry? One `ollama pull` and it's available. It's underwhelming? `ollama rm`.
you can pull directly from huggingface with llama.cpp, and it also has a decent web chat included
> This creates a recurring pattern on r/LocalLLaMA: new model launches, people try it through Ollama, it’s broken or slow or has botched chat templates, and the model gets blamed instead of the runtime.
Seems like maybe, at least some of the time, you’re being underwhelmed my ollama not the model.
The better performance point alone seems worth switching away