smoe • today at 8:27 PM • 4 replies • view on HN

Earlier this week I started testing Chinese models on my codebase. I haven’t really looked at interactive coding yet, but more at issue triage, bug auto-fixing, log analytics, etc.

I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as a reference, all running in the Pi harness without anything installed.

So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less far behind on typical daily tasks than people think.

They are a bit “work hard, not smart”: getting to same-ish results more slowly and using more tokens, but at a fraction of the price.

Replies

_under_scores_ • today at 10:37 PM

I switched to predominantly using MiMO this week, mostly out of curiosity to see how dependent I was on frontier models. Honestly, I can't really tell the difference. I would say I work on pretty average codebases with well-known frameworks doing pretty typical things, and my initial impression is that MiMO, Kimi, and DeepSeek can probably handle what I need more or less the same as GPT-5.5 or Claude.

c0rruptbytes • today at 8:58 PM

I personally really like DS4 Flash - it's the largest I can run locally with decent speeds and I feel like it's good enough to maintain a codebase with less effort

maxdo • today at 9:28 PM

Maybe I need to give it a second chance. Surprisingly, Kimi 2.6 consistently fails to even generate a valid JSON plan, where Gemma 4 was doing really well, but was slow.
