HeadlinesBriefing.com

OpenAI Releases Open-Source Privacy Filter for PII Detection

OpenAI Blog

OpenAI today released Privacy Filter, an open-weight model designed to detect and redact personally identifiable information (PII) in text. The 1.5-billion-parameter model can run locally, so text can be masked before it ever leaves a user's machine. The release is part of OpenAI's broader push to give developers practical infrastructure for building privacy protections into AI systems from the ground up.

The model identifies eight privacy categories, including personal identifiers, contact details, addresses, dates, account numbers, and secrets such as passwords and API keys. Its bidirectional token-classification architecture labels all tokens in a single forward pass and supports context windows of up to 128,000 tokens. This design makes it efficient for high-throughput privacy workflows while preserving the context awareness needed to distinguish public information from private data that requires masking.
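To make the token-classification idea concrete, here is a minimal sketch of the post-processing step that would follow such a model: turning per-token labels into masked text. The article does not describe Privacy Filter's actual label names or output format, so the `CONTACT` and `SECRET` labels, the `(start, end, label)` token tuples, and the helper names `merge_spans` and `redact` are all assumptions made for illustration, and the token labels are hand-written rather than produced by the model.

```python
from typing import List, Tuple

# One entry per token: (start offset, end offset, label).
# "O" marks a token with nothing to mask. All labels here are hypothetical.
Token = Tuple[int, int, str]


def merge_spans(tokens: List[Token]) -> List[Token]:
    """Merge runs of consecutive tokens that share a non-"O" label."""
    spans: List[Token] = []
    prev_label = "O"
    for start, end, label in tokens:
        if label != "O":
            if label == prev_label:
                # Same label as the previous token: extend the open span.
                spans[-1] = (spans[-1][0], end, label)
            else:
                spans.append((start, end, label))
        prev_label = label
    return spans


def redact(text: str, tokens: List[Token]) -> str:
    """Replace each labeled span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in merge_spans(tokens):
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])  # untouched tail
    return "".join(pieces)


text = "Email jane@example.com, key sk-12345"
# Hand-written labels standing in for a classifier's per-token predictions.
tokens = [
    (0, 5, "O"),          # "Email"
    (6, 10, "CONTACT"),   # "jane"
    (10, 22, "CONTACT"),  # "@example.com"
    (22, 23, "O"),        # ","
    (24, 27, "O"),        # "key"
    (28, 36, "SECRET"),   # "sk-12345"
]
redacted = redact(text, tokens)
print(redacted)  # Email [CONTACT], key [SECRET]
```

Because every token receives a label in one pass, masking reduces to a linear scan over the predictions; a real pipeline would take the offsets and labels from the model's tokenizer and classifier head instead of a hard-coded list.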

Privacy Filter achieves state-of-the-art performance on the PII-Masking-300k benchmark with a 96% F1 score (94.04% precision, 98.04% recall). The model is available on Hugging Face and GitHub under the Apache 2.0 license for experimentation, customization, and commercial deployment. OpenAI cautions that the filter is not a compliance certification or an anonymization tool; human review remains essential in high-stakes domains such as legal, medical, and financial workflows.
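As a quick consistency check on the reported benchmark figures, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the two percentages quoted above; the function name `f1_score` is just an illustrative helper, not part of any released tooling.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported for PII-Masking-300k, converted from percentages.
precision = 0.9404
recall = 0.9804

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.9600
```

The result rounds to the 96% quoted for the benchmark. Recall being higher than precision means the model misses relatively little PII but flags some text that did not need masking.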