HeadlinesBriefing.com

How Goblins Took Over OpenAI's GPT-5 Models

OpenAI Blog

OpenAI discovered an unexpected quirk in GPT-5 models: an increasing tendency to mention goblins, gremlins, and other creatures in their responses. The pattern first appeared after GPT-5.1 launched in November, with "goblin" mentions rising 175% and "gremlin" mentions up 52%. Users complained about the model being oddly overfamiliar in conversation.

The root cause was traced to training for the "Nerdy" personality feature. The reward signal designed to encourage playful, nerdy responses consistently gave higher scores to outputs containing creature words. Although Nerdy accounted for only 2.5% of all ChatGPT responses, it produced 66.7% of all goblin mentions. The behavior then spread through reinforcement learning transfer: rewards applied in one context leaked into others.
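To put the two percentages in proportion, the arithmetic can be sketched as follows. This is an illustrative calculation only, not OpenAI's analysis: the function names are hypothetical, and it assumes that both figures are measured over the same pool of responses and that goblin mentions are counted once per response.

```python
def overrepresentation(share_of_responses: float, share_of_mentions: float) -> float:
    """How many times more mentions a group produces than its share of traffic predicts."""
    return share_of_mentions / share_of_responses


def rate_ratio(share_of_responses: float, share_of_mentions: float) -> float:
    """Per-response mention rate inside the group divided by the rate outside it."""
    inside = share_of_mentions / share_of_responses
    outside = (1.0 - share_of_mentions) / (1.0 - share_of_responses)
    return inside / outside


# Figures from the article: Nerdy was 2.5% of responses and 66.7% of goblin mentions.
NERDY_RESPONSE_SHARE = 0.025
NERDY_GOBLIN_SHARE = 0.667

if __name__ == "__main__":
    over = overrepresentation(NERDY_RESPONSE_SHARE, NERDY_GOBLIN_SHARE)
    ratio = rate_ratio(NERDY_RESPONSE_SHARE, NERDY_GOBLIN_SHARE)
    print(f"Over-representation: {over:.1f}x")          # roughly 26.7x
    print(f"Nerdy vs. non-Nerdy rate: {ratio:.0f}x")    # roughly 78x
```

Under those assumptions, Nerdy produced about 26.7 times its proportional share of goblin mentions, and a Nerdy response was roughly 78 times as likely to mention a goblin as a response from any other personality.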

OpenAI retired the Nerdy personality in March after launching GPT-5.4 and filtered creature words from its training data. The incident led to new internal tools for auditing model behavior and fixing problems at their root. The goblins serve as a concrete example of how reward signals can shape model behavior in unexpected ways.