> As a chatbot, it's the only one that seems to really relish calling you out on mistakes or nonsense, and it doesn't hesitate to be blunt with you.
My experience is that Sonnet 4.5 does this a lot as well, but more often than not it's due to a lack of full context, e.g. accusing the user of not having done X or Y when it simply wasn't told that was already done, and then apologizing.
How is Kimi K2 in this regard?
Isn’t “instruction following” the most important thing you’d want out of a model in general? And isn’t a model that pushes back more likely than not wrong?
Only if you're really, really good at constructing precise instructions, at which point you don't really need a coding agent.
> Isn’t “instruction following” the most important thing you’d want out of a model in general,
No. And for the same reason that pure "instruction following" in humans is considered a form of protest/sabotage.
https://en.wikipedia.org/wiki/Work-to-rule