like someone said above: brew install llama.cpp llama-server -hf ggml-org/gemma-4-E4B-it-GGUF...

homarp • today at 7:58 AM • 0 replies • view on HN

like someone said above: brew install llama.cpp

llama-server -hf ggml-org/gemma-4-E4B-it-GGUF --port 8000 (with MCP support and web chat interface)

and you have OpenAI API on the same 8000 port. (https://github.com/ggml-org/llama.cpp/tree/master/tools/serv... lists the endpoints)

alt Hacker News