Exactly. The model is exquisitely sensitive to language. The idea that you would encourage it to think like a caveman to save a few tokens is hilarious, but extremely counterproductive if you care about the quality of its reasoning.
Does this imply that if you train it on Gwern style output, the quality will improve?