Helping ChatGPT better recognize context in sensitive conversations
New safety updates improve ChatGPT’s ability to recognize and respond to subtle or evolving cues of distress or potential harm in user conversations. The system is now better equipped to provide crisis resources and de-escalate sensitive situations, thanks to enhanced contextual understanding within and across interactions.
The updates help ChatGPT recognize potential risks as they develop over time and distinguish routine interactions from the rare cases where more caution is needed. In those cases it can de-escalate, refuse to provide harmful details, or redirect users to safer alternatives and crisis resources when appropriate.
One key improvement focuses on understanding context: a seemingly ordinary request can carry a different meaning when viewed alongside earlier signs of distress. ChatGPT is now trained to identify potentially harmful intent from the surrounding context, so it can refuse such requests or guide users toward support. This capability matters most in acute scenarios such as suicide, self-harm, and harm to others.
To address safety risks that might emerge across separate conversations, ChatGPT now uses "safety summaries." These brief, factual notes capture relevant safety context from previous interactions, helping the model connect signals and escalate caution when needed. These summaries are strictly scoped, temporary, and used only for serious safety concerns, not for personalization or long-term memory.
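The article does not describe how safety summaries are implemented, so the following is only a minimal sketch of the "strictly scoped, temporary" idea. Every name here (`SafetySummary`, `SAFETY_CATEGORIES`, `active_summaries`, the `ttl` field) and the seven-day window in the usage note are hypothetical and are not taken from OpenAI's system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical scope: the acute scenarios the article names.
SAFETY_CATEGORIES = {"suicide", "self_harm", "harm_to_others"}


@dataclass(frozen=True)
class SafetySummary:
    category: str         # assumed label for the kind of concern
    note: str             # brief, factual note about earlier context
    created_at: datetime  # when the note was written
    ttl: timedelta        # assumed retention window (not from the article)

    def is_active(self, now: datetime) -> bool:
        # A note stops being considered once its window has passed.
        return now - self.created_at < self.ttl


def active_summaries(summaries, now):
    """Keep only unexpired notes that fall in a serious safety category."""
    return [
        s for s in summaries
        if s.category in SAFETY_CATEGORIES and s.is_active(now)
    ]
```

As a usage illustration, with `now = datetime(2025, 1, 10)` and a seven-day `ttl`, a `self_harm` note created on 2025-01-09 is kept, one created on 2024-12-01 is dropped as expired, and a note in any category outside the set (for example a personalization preference) is never considered.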
The development of these features involved extensive collaboration with mental health and safety experts. Their input guided decisions on when to create safety summaries, how much prior context is relevant, and how long the model should consider that context. This expertise is meant to make responses in sensitive situations more appropriate.
Internal evaluations show improvements in how often ChatGPT responds safely. In long single-conversation scenarios, safe-response performance improved by 50% for suicide and self-harm cases and by 16% for harm-to-others cases. Across multiple conversations, using GPT-5.5 Instant, safe-response performance improved by 52% in harm-to-others cases and by 39% in suicide and self-harm cases. Safety summaries also showed high relevance and factual accuracy in these evaluations.
