HeadlinesBriefing.com

OpenAI Scales PostgreSQL for 800M ChatGPT Users

OpenAI News

OpenAI's PostgreSQL database now handles queries for over 800 million ChatGPT users. Load on the system has grown 10x, and the deployment relies on a single primary Azure PostgreSQL flexible server and nearly 50 global read replicas. This architecture serves read-heavy workloads well but comes under strain during periods of heavy writes.
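The single-writer, many-reader layout described above can be illustrated with a toy router. This is a minimal sketch, not OpenAI's implementation: the class name `ReadWriteRouter`, the host labels, and the table name are hypothetical, and classifying a statement by its leading `SELECT` is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Toy router: writes go to the one primary, reads rotate across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replica is configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Naive check: anything that starts with SELECT is treated as a read.
        # A real router would also handle SELECT ... FOR UPDATE, writable CTEs,
        # and reads that cannot tolerate replication lag.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


# Hypothetical host labels, for illustration only.
router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route(query)
    for query in (
        "SELECT * FROM conversations WHERE id = 1",
        "UPDATE conversations SET title = 'x' WHERE id = 1",
        "select 1",
    )
]
```

Here the two reads land on different replicas while the update goes to the primary, which is why adding replicas scales reads but does nothing for write capacity.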

The core limitation stems from PostgreSQL's MVCC implementation: an update writes a new version of the entire row rather than modifying it in place, which causes write amplification and leaves dead tuples behind as table bloat. To mitigate this, OpenAI migrated shardable, write-heavy workloads to Azure Cosmos DB and banned new tables in the primary deployment. They also optimized queries to avoid expensive multi-table joins and reduced primary load by offloading reads to replicas.
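To make the bloat mechanism concrete, here is a toy model of append-only row versioning. It is a heavily simplified sketch under stated assumptions: `ToyMvccTable` is a hypothetical class, it ignores transactions, visibility rules, and indexes, and its `vacuum` only loosely mirrors what PostgreSQL's VACUUM does.

```python
class ToyMvccTable:
    """Toy model: every update appends a new row version and kills the old one."""

    def __init__(self):
        self.versions = []  # every stored version as [key, value, is_live]
        self.live = {}      # key -> index of its live version

    def upsert(self, key, value):
        if key in self.live:
            # The previous version is not overwritten; it becomes a dead tuple.
            self.versions[self.live[key]][2] = False
        self.versions.append([key, value, True])
        self.live[key] = len(self.versions) - 1

    def dead_tuples(self):
        return sum(1 for version in self.versions if not version[2])

    def vacuum(self):
        # Drop dead versions and rebuild the lookup of live ones.
        kept = [version for version in self.versions if version[2]]
        self.versions = kept
        self.live = {version[0]: i for i, version in enumerate(kept)}


table = ToyMvccTable()
table.upsert("user:1", {"name": "a", "plan": "free"})
for i in range(3):
    # Only one field changes, yet a full new row version is stored each time.
    table.upsert("user:1", {"name": "a", "plan": f"tier-{i}"})

stored_before = len(table.versions)
dead_before = table.dead_tuples()
table.vacuum()
stored_after = len(table.versions)
```

Three small updates to one logical row leave four stored versions, three of them dead, until cleanup runs. On a write-heavy table that ratio is what shows up as bloat.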

OpenAI prioritized query optimization, fixing bugs that caused redundant writes and setting strict rate limits for backfills. They configured timeouts like `idle_in_transaction_session_timeout` to keep sessions that sit idle inside an open transaction from blocking maintenance such as autovacuum. While the single primary remains a potential point of failure, offloading critical reads to replicas lets read-only requests keep being served even if the writer fails.
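One way such timeouts might be applied per connection is through libpq's `options` connection parameter, which accepts `-c name=value` pairs. The sketch below only builds that string. The helper `build_pg_options` is hypothetical, the millisecond values are illustrative rather than OpenAI's, and `statement_timeout` is included as an assumed companion setting that the article does not name.

```python
def build_pg_options(settings):
    """Render settings as a libpq `options` string of `-c name=value` pairs."""
    parts = []
    for name, value in settings.items():
        text = str(value)
        # libpq splits this string on spaces, so reject values needing escapes.
        if " " in text or "\\" in text:
            raise ValueError(f"unsupported value for {name}: {text!r}")
        parts.append(f"-c {name}={text}")
    return " ".join(parts)


# Hypothetical values in milliseconds (the default unit for both settings).
TIMEOUTS_MS = {
    "idle_in_transaction_session_timeout": 30_000,  # end idle-in-transaction sessions
    "statement_timeout": 5_000,                     # assumed extra guard, not from the article
}

options = build_pg_options(TIMEOUTS_MS)
# The string can then be passed to a libpq-based driver,
# for example as the `options` keyword of a connection call.
```

Setting the limits at connection time keeps them scoped to one application's sessions. The same settings can instead be applied server-wide or per role if every client should be bound by them.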