jdw64 • today at 1:01 PM • 4 replies

Personally, when I use open code or routers, I feel that beyond a certain level, the models don't make a huge difference to me, except for expensive and mediocre models like Gemini. In that sense, Chinese models are pretty good. I usually write code in function or method units and then design and assemble them together.

GPT-series models are more thorough and better, but I'm not sure the difference is enormous. It seems to depend on the workflow, but in my opinion, if you are thorough enough, I wonder if there really is a big difference.

Replies

sjanes • today at 2:39 PM

I've kind of given up on the routers for "free" inference. As you would expect, they tend to give you sub-par thinking because they are obviously trying to conserve as much inference as possible.

I've had some success turning my M1 MacBook Pro into a heating pad with Qwen 3.6 35B A3B MTP. Trying to use Gemini models "locally" resulted in a similar "short shrift" of effort, leading to mistakes and lots of turns. The reports of Fable being relentlessly "proactive" show you can go the other direction as well, if you have strong enough branding and effective invoicing.

onlyrealcuzzo • today at 1:06 PM

In my experience, there's little difference between frontier models and SotA ~30B param models when it comes to implementing individual functions.

Once you have a coherent design (the hard part), you can feed it to a pretty small model and get basically the same quality.

They won't one-shot it, but they're faster and cheaper, so it still works out in your favor.

Plus you can do it locally...

regularfry • today at 2:41 PM

The difference in outcome isn't that big, but yes, you need to be more rigorous. For instance, I've found that the Kimi K2.5 and K2.6 models will comment out failing tests rather than fix a problem they just caused (mistaking them for "pre-existing failures"), so you need to specifically make commented-out tests break the build. I've not personally had that problem with any of the Anthropic or OpenAI models.

dcreater • today at 1:28 PM

I really hope we stop using the term "Chinese models". It has an air of negative connotation. It's the equivalent of calling cars Japanese, which people used to do but which is now almost entirely meaningless. You just call them Toyota, Honda, Lexus, etc.
