I don't trust it completely, but I still use it. Trust but verify.
I've had some funny conversations. Me: "Why did you choose to do X to solve the problem?" ... It: "Oh, I should totally not have done that, I'll do Y instead."
But it's far from being so unreliable that it's not useful.
> Trust but verify.
I guess I should have used ‘completely trust’ instead of ‘trust’ in my original comment. I was referring to the subset of developers who call themselves vibe coders.
I find that if I ask an LLM to explain its reasoning, it comes up with a post-hoc justification that has nothing to do with whatever it was actually "thinking". It's a most-likely-token predictor, etc. etc.
As far as I understand, reasoning tokens from previous answers are generally not kept in the context for follow-up questions, so the model can't really introspect on its own earlier chain of thought.
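
For anyone who hasn't seen it spelled out, here's a minimal sketch of why that is, using the OpenAI Python SDK (the model name and prompts are illustrative, and this assumes a reasoning model like o1 where the chain-of-thought tokens stay server-side and are never returned to the client):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # First turn: the model does its hidden reasoning server-side, but the
    # response object only contains the final answer text.
    messages = [{"role": "user", "content": "Why is X the right fix for this bug?"}]
    first = client.chat.completions.create(model="o1", messages=messages)

    # Only the visible answer goes back into the history; the reasoning
    # tokens were never returned, so there's nothing to send back.
    messages.append({"role": "assistant",
                     "content": first.choices[0].message.content})
    messages.append({"role": "user",
                     "content": "Why did you choose to do X there?"})

    # Second turn: the model sees only the transcript above. Any
    # "explanation" it gives is reconstructed after the fact.
    second = client.chat.completions.create(model="o1", messages=messages)
    print(second.choices[0].message.content)

So when you ask "why did you do that?", the second request contains only the visible transcript, and the model has to generate a plausible-sounding justification from scratch rather than recall anything.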