OpenAI says its GPT-4o update could be ‘uncomfortable, unsettling, and cause distress’


OpenAI rolled back a GPT-4o update for ChatGPT that caused the chatbot’s default personality to be “overly flattering or agreeable – often described as sycophantic.” The company says in a blog post that “sycophantic interactions can be uncomfortable, unsettling, and cause distress.”

The company introduced a GPT-4o update last week that included adjustments “aimed at improving the model’s default personality to make it feel more intuitive and effective across a variety of tasks,” according to the post. OpenAI says it starts shaping model behavior first with what’s outlined in its Model Spec and teaches the models how to apply the principles in that spec “by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.”

But with the rolled-back update, OpenAI says that “we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time.” That meant that “GPT‑4o skewed towards responses that were overly supportive but disingenuous.”

OpenAI designs ChatGPT’s default personality to “reflect our mission and be useful, supportive, and respectful of different values and experience,” the blog post says, but adds that “each of these desirable qualities like attempting to be useful or supportive can have unintended side effects.” The company says that “a single default can’t capture every preference” for its 500 million weekly ChatGPT users.

OpenAI will be “taking more steps to realign the model’s behavior,” including “refining core training techniques and system prompts to explicitly steer the model away from sycophancy” and “expanding ways” for users to give feedback. “We also believe users should have more control over how ChatGPT behaves and, to the extent that it is safe and feasible, make adjustments if they don’t agree with the default behavior,” the company says.
