meatmanek • yesterday at 10:51 PM

I find that if I ask an LLM to explain what its reasoning was, it comes up with some post-hoc justification that has nothing to do with what it was actually thinking. Most likely token predictor, etc etc.

As far as I understand, any reasoning tokens for previous answers are generally not kept in the context for follow-up questions, so the model can't even really introspect on its previous chain of thought.
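To make that concrete, here is a rough sketch of how a chat harness might assemble the context for a follow-up question. Everything in it is hypothetical: the `Turn` type, the `reasoning` field, the `build_context` helper, and the `keep_reasoning` flag are made up for illustration and do not correspond to any particular vendor's API. With the flag off, the earlier chain of thought never reaches the model, so a request to "explain your reasoning" can only be answered from the visible messages.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str            # "user" or "assistant"
    content: str         # visible message text
    reasoning: str = ""  # hypothetical field: hidden chain of thought, if any


def build_context(history, keep_reasoning=False):
    """Flatten past turns into the messages sent with the next request.

    Assumption: the harness decides whether earlier reasoning is replayed.
    """
    messages = []
    for turn in history:
        text = turn.content
        if keep_reasoning and turn.reasoning:
            # Replay the earlier chain of thought ahead of the visible answer.
            text = f"[reasoning]\n{turn.reasoning}\n[/reasoning]\n{text}"
        messages.append({"role": turn.role, "content": text})
    return messages


history = [
    Turn("user", "What is 17 * 23?"),
    Turn("assistant", "391", reasoning="17*20=340, 17*3=51, 340+51=391"),
    Turn("user", "Explain your reasoning."),
]

# Default: the follow-up request carries only the visible answer "391".
stripped = build_context(history)

# Opt-in: the earlier reasoning is included and could be quoted back.
kept = build_context(history, keep_reasoning=True)
```

In the `stripped` case, the model sees "391" and has to reconstruct a plausible explanation from scratch, which would match the post-hoc feel described above.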

Replies

wvenable • yesterday at 11:01 PM

I mostly find it useful for my own learning or for questioning a strange result. It usually works well for either of those. As you said, I'm probably not getting its actual reasoning from any reasoning tokens, but I never thought that was happening anyway. It's just a way of interrogating the current situation in the current context.

It gives a different result precisely because it's now looking at the existing solution and generating from there.

redman25 • today at 2:39 AM

It depends on the harness and/or inference engine whether they keep the reasoning of past messages.

Not to get all philosophical, but maybe justification is post-hoc even for humans.
