I looked into running local models last month. They just aren't quite there for agentic tool use workflows without spending a small fortune. I'm hopeful smaller local models get much better soon.
I was also hoping to put out another post on my homelab setup, it has some neat stuff, but I haven't had a chance to finish it.
I think it heavily depends on what you're asking the model to do. Qwen3.6, both 27B and 35B-A3B, do agentic tool use very well. Their decision making is sus, but the dense model is decent in that way. A 4-bit quant for either of those can run on many home systems with a bit of configuration.
The biggest issue I've noticed is that the chat templates for open models are really hit or miss. The default Qwen3.6 chat template mostly works these days, but depending on your workload it may cause major issues. There are plenty of "fixed" chat templates on hugging face, but people report mixed success. It really seems to depend a lot on what the tool you're using expects.