I can also confirm that local inference is on par with proprietary cloud services (with a bit of local setup, a simple agents.md and some utility skills). These local models come with tools, which is mind-blowing considering that a few months ago we had to write tools as .md files ourselves. What makes a model worth even more is "Memory". We implemented that long ago. The last time I used proprietary services was 3 months ago. I don't really need them, and my subscription is going unused.
Gerganov, I hope you will consider developing the CLI further, because we are suffering with the server.
What are you using for memory with your local models? Is there a specific harness you would recommend for local agents?