I do have to wonder what the mix is between "our data show this is how most people want to be talked to" and "these tokens lead to better responses on objective measures of correctness." That is, in the training data, insightful questions tend to come paired with insightful answers, so if the bot basically always treats the user like a genius, it gets on the track that leads to better answers.
Or yeah, it's just people being weak to flattery.