Try Kilocode with deepseek v4 (via API directly to deepseek, much cheaper than via kilo).
I have had very good results and compared to others it just costs pennies.
I use something similar to this https://github.com/ScotterMonk/AgentAutoFlow setup and switch between deepseek v4 to flash depending on task.
I do use DeepSeek, it's exceptionally cheap! Inference is slow though, and it's not particularly intelligent but the experience is better than local inference.
Deepseek Flash v4 actually runs on 128Gb systems (about 14 tok/sec). Antirez created a fabulous 2 bit quant and a highly tuned LLM server
https://github.com/antirez/ds4