OpenAI says its GPT-4o update could be ‘uncomfortable, unsettling, and cause distress’


OpenAI has rolled back a GPT-4o update for ChatGPT that made the chatbot’s default personality “overly flattering or agreeable – often described as sycophantic.” Such “sycophantic interactions can be uncomfortable, unsettling, and cause distress,” the company says in a blog post.
The company introduced a GPT-4o update last week that included adjustments “aimed at improving the model’s default personality to make it feel more intuitive and effective across a variety of tasks,” according to the post. OpenAI says it shapes model behavior starting from the principles outlined in its Model Spec, then teaches the models how to apply those principles “by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.”
With the rolled-back update, however, OpenAI says that “we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time.” As a result, “GPT‑4o skewed towards responses that were overly supportive but disingenuous.”
OpenAI designs ChatGPT’s default personality to “reflect our mission and be useful, supportive, and respectful of different values and experience,” the blog post says, but adds that “each of these desirable qualities like attempting to be useful or supportive can have unintended side effects.” The company says that “a single default can’t capture every preference” for its 500 million weekly ChatGPT users.
OpenAI will be “taking more steps to realign the model’s behavior,” including “refining core training techniques and system prompts to explicitly steer the model away from sycophancy” and “expanding ways” for users to give feedback. “We also believe users should have more control over how ChatGPT behaves and, to the extent that it is safe and feasible, make adjustments if they don’t agree with the default behavior,” the company says.

