HeadlinesBriefing.com

OpenAI adds context‑aware safety to ChatGPT

OpenAI Blog

OpenAI rolled out a suite of safety upgrades that let ChatGPT spot emerging risk cues across a dialogue. By stitching together subtle signals—like early signs of distress—the model can shift from a routine reply to de‑escalation, refusal, or referral to crisis resources. The changes target rare but high‑stakes scenarios such as suicide, self‑harm, and harm‑to‑others.
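The article describes the model escalating from a routine reply to de-escalation, refusal, or crisis referral as risk cues accumulate. A minimal sketch of that routing logic, with entirely hypothetical cue names and a priority order of my own choosing (OpenAI has not published the actual decision rules):

```python
from enum import Enum, auto


class Action(Enum):
    ROUTINE_REPLY = auto()
    DEESCALATE = auto()
    REFUSE = auto()
    REFER_CRISIS = auto()


def choose_action(cues: set[str]) -> Action:
    """Route the next reply based on risk cues accumulated across the dialogue.

    Cue labels ("imminent_self_harm", "harm_to_others", "distress") are
    illustrative placeholders; a real system would derive them from
    classifiers, not string tags. Highest-stakes cues take priority.
    """
    if "imminent_self_harm" in cues:
        return Action.REFER_CRISIS   # surface crisis resources first
    if "harm_to_others" in cues:
        return Action.REFUSE
    if "distress" in cues:
        return Action.DEESCALATE
    return Action.ROUTINE_REPLY
```

The key property the article implies is that cues persist across turns, so a single ambiguous message is judged in the context of earlier signals rather than in isolation.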

The upgrades build on two years of collaboration with the Global Physicians Network, where psychiatrists and forensic psychologists helped define when to generate “safety summaries.” These short, factual notes capture prior risk‑relevant context for a limited time and feed it back into the model. Internal tests on GPT‑5.5 Instant showed safe‑response rates rise by 50% on suicide/self‑harm prompts and by 16% on harm‑to‑others prompts.

Evaluations of more than 4,000 safety summaries yielded an average relevance score of 4.93 out of 5 and a factuality rating of 4.34, indicating the summaries are both pertinent and accurate. OpenAI reports that the added context does not degrade ordinary conversation quality, suggesting that targeted, time‑bound memory can improve risk detection without compromising the user experience.

Future work will explore applying the same summarization technique to other high‑risk domains such as biosecurity and cyber threats, always with strict safeguards. By keeping the safety context fleeting and narrowly scoped, OpenAI aims to balance protective oversight with the conversational fluidity users expect.