logoalt Hacker News

boppo1yesterday at 6:57 PM1 replyview on HN

I've been way out of the local game for a while now, what's the best way to run models for a fairly technical user? I was using llama.cpp in the command line before and using bash files for prompts.


Replies

adrian_byesterday at 10:17 PM

Running llama-server (it belongs to llama.cpp) starts a HTTP server on a specified port.

You can connect to that port with any browser, for chat.

Or you can connect to that port with any application that supports the OpenAI API, e.g. a coding assistant harness.