Yes, this is not local-first; the name is bad.
It absolutely can be pointed to any standard endpoint, either cloud or local.
It’s far better for most users to be able to specify an inference server (even on localhost in some cases) because the ecosystem of specialized inference servers and models is a constantly moving target.
If you write this kind of software and try to integrate your own inference engine instead of focusing on your agentic tooling, you will not only be reinventing the wheel but probably disadvantaging your users as well. Ollama, vLLM, Hugging Face, and others are devoting their focus to the servers; there is no reason to sacrifice the front-end tooling effort to duplicate their work.
Besides that, most users will not be able to run the better models on their daily driver; they will have a separate machine for inference, run inference in a private or rented cloud, or even go over a public API.
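For a rough idea of what that flexibility looks like in practice, here's a sketch using the standard OpenAI Python client, which most of these servers speak. The ports and model name are just the usual defaults for Ollama and vLLM; the LAN address is made up, so adjust everything for whatever your setup actually exposes.

```python
from openai import OpenAI

# Local Ollama instance on the same machine (OpenAI-compatible API on port 11434).
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# vLLM on a dedicated inference box on the LAN (default port 8000).
# The IP here is a placeholder; point it at your own server.
lan = OpenAI(base_url="http://192.168.1.50:8000/v1", api_key="unused")

# Hosted/public API: drop base_url and use a real key instead.
# cloud = OpenAI(api_key="sk-...")

# The agentic front end doesn't care which backend it got handed.
resp = local.chat.completions.create(
    model="llama3.1",  # whatever model your server has loaded
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

Swapping backends is a one-line config change, which is exactly why duplicating the server side inside the tool buys you nothing.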
To be precise, it’s exactly as local-first as OpenClaw (i.e. probably not, unless you have an unusually powerful GPU).
Horrible. Just because you have code that runs outside a browser doesn't mean you have something that's local. That goes double when the code requires API calls: your net goes down and this stuff does nothing.