This is a weakness of docker, a bit, I think.
I was rigging this up, myself, and conciscious of the fact that basic docker is "all or none" for container port forwarding because it's for presenting network services, had to dig around with iptables so it'd be similar to binding on localhost.
The use case https://github.com/meltyness/tax-pal
The ollama container is fairly easy to deploy, and supports GPU inference through container toolkit. I'd imagine many of these are docker containers.
e: i stand corrected, apparently -p of `docker run` can have a binding interface stipulated
e2: https://docs.docker.com/engine/containers/run/#exposed-ports which is not in some docs
e3: but it's in the man page ofc
Out of curiosity, why would you need to wrap the call to an Ollama modelfile in docker? Does the dockerized ollama client provide some benefit, when it’s shelling down to local Ollama instance anyway? (Wrt tax-pal)
Yes the binding interface can be specified, but the default for -p 11434:11434 is 0.0.0.0.
IMO the default should be 127.0.0.1 and the user should have to explicitly bind to all via -p 0.0.0.0:11434:11434.