BoltAI also does this, but a CLI tool is nice.
It’s a nice LLM because it seems fairly decent, loads instantly, and runs on the Neural Engine. The GPU is faster, but when I run bigger LLMs on the GPU the normally very cool M-series Mac becomes a lap roaster.
It’s a small LLM, though. It seems decent, but it has also been safety-trained to a somewhat comical degree: it will balk on safety grounds at requests that are in fact quite banal.