Right, and then look at any number of research papers showing that CoT output has limited impact on the end result. We've trained these models to pretend to reason.
> Right, and then look at any number of research papers showing that CoT output has limited impact on the end result.
Which research papers? Do I have to find them?
> We've trained these models to pretend to reason.
I have no idea why that matters. Can you tell me what the difference is between pretending to reason and actually reasoning, if the two look exactly the same and produce the same result?
If it's only pretending to reason, then how is it that the CoT output improves performance on every single benchmark/test?