>We brought GPT‑4o back after hearing clear feedback from a subset of Plus and Pro users, who told us they needed more time to transition key use cases, like creative ideation, and that they preferred GPT‑4o’s conversational style and warmth.
This does confirm that OpenAI isn't making models sycophantic as a deliberate attempt to butter up users so that they use the product more; it's because people actually want AI to talk to them like that. To me, that's insane, but I guess they have to play the market.
> it's because people actually want AI to talk to them like that
I can't find the particular article (there are a few blogs and papers pointing out this phenomenon, but I can't locate the one I enjoyed), but it was along the lines of how, in LLMArena, a lot of users tend to pick the "confidently incorrect" model over the "boring sounding but correct" model.
The average user probably prefers the sycophantic echo chamber of confirmation bias offered by a lot of large language models.
I can't help but draw parallels to the "You are not immune to propaganda" memes. Turns out most of us are not immune to confirmation bias, either.
I thought this was mostly due to the AI personality splinter groups (trying to be charitable) like /myboyfriendisai, and the wrapper apps, who vocally let OpenAI know they were still using those models the last time they were sunset.
I was one of those pesky users who complained when o3 suddenly became unavailable.
When 5.2 was first launched, o3 did a notably better job at a lot of analytical prompts (e.g. "Based on the attached weight log and data from my calorie tracking app, please calculate my TDEE using at least 3 different methodologies").
o3 frequently used tables to present information, which I liked a lot. 5.2 rarely does this - it prefers to lay out information in paragraphs / blog post style.
I'm not sure if o3 responses were better, or if it was just the format of the reply that I liked more.
If it's just a matter of how people prefer to be presented their information, that should be something LLMs are equipped to adapt to at a user-by-user level based on preferences.
I thought it was based on users' thumbs-up and thumbs-down reactions. The fact that it has evolved the way it has makes it pretty obvious that users want their asses licked.
They have added settings for this now - you can dial up and down how “warm” and “enthusiastic” you want the models to be. I haven’t done back to back tests to see how much this affects sycophancy, but adding the option as a user preference feels like the right choice.
If anyone is wondering, the setting for this is called Personalisation in user settings.
This doesn't come as too much of a surprise to me. Feels like it mirrors some of the reasons why toxic positivity occurs in the workplace.
Put on a good show, offer something novel, and people will gleefully march right off a cliff while admiring their shiny new purchase.
You're absolutely right. You're not imagining it. Here is the quiet truth:
You're not imagining it, and honestly? You're not broken for feeling this; it's perfectly natural as a human to have this sentiment.
As someone who has worked with population data, I've found there is an enormous rift between reported opinion (including HN and Reddit opinion) and revealed preferences, as measured through experimentation.