Hacker News

sschueller, today at 6:51 AM, 2 replies

Try Kilocode with DeepSeek V4 (calling the DeepSeek API directly, which is much cheaper than going through Kilo).

I have had very good results, and compared to the alternatives it costs just pennies.

I use a setup similar to https://github.com/ScotterMonk/AgentAutoFlow and switch between DeepSeek V4 and Flash depending on the task.


Replies

sfifs, today at 11:09 AM

DeepSeek Flash V4 actually runs on 128 GB systems (about 14 tok/sec). Antirez created a fabulous 2-bit quant and a highly tuned LLM server:

https://github.com/antirez/ds4

LoganDark, today at 7:05 AM

I do use DeepSeek; it's exceptionally cheap! Inference is slow, though, and it's not particularly intelligent, but the experience is better than local inference.