HeadlinesBriefing.com

Snapchat's Billion Predictions Per Second Engine

ByteByteGo

Snapchat's Bento platform serves 1 billion predictions per second across 474 million daily active users. The system must make four kinds of decisions within milliseconds: content recommendations, ad rankings, friend suggestions, and AR lens selection. Each one directly affects revenue and engagement.
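To put those two headline figures in proportion, the arithmetic below is a rough back-of-envelope sketch. It assumes the 1 billion per second rate is sustained around the clock and spread evenly over all daily active users. Neither assumption comes from the source, and the quoted rate may well be a peak.

```python
# Back-of-envelope scale estimate from the two figures quoted above.
# Assumption (not from the source): the rate is sustained all day and
# spread evenly across users.
PREDICTIONS_PER_SECOND = 1_000_000_000
DAILY_ACTIVE_USERS = 474_000_000
SECONDS_PER_DAY = 24 * 60 * 60

predictions_per_day = PREDICTIONS_PER_SECOND * SECONDS_PER_DAY
per_user_per_day = predictions_per_day / DAILY_ACTIVE_USERS
per_user_per_second = PREDICTIONS_PER_SECOND / DAILY_ACTIVE_USERS

print(f"predictions per day:       {predictions_per_day:.3e}")
print(f"per user per day (avg):    {per_user_per_day:,.0f}")
print(f"per user per second (avg): {per_user_per_second:.2f}")
```

Under those assumptions the average works out to roughly two predictions per user per second, or on the order of 180,000 per user per day.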

The architecture handles this asymmetric workload in two stages. First, cheap retrieval models narrow millions of candidates down to hundreds. Then expensive ranking models score only those remaining candidates. Snap reports processing 1 TB of feature reads and 10 trillion events daily, which puts pressure on latency, scale, freshness, and iteration speed.
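The retrieve-then-rank funnel can be illustrated with a minimal sketch. Everything here is hypothetical: the function names, the dot-product retrieval score, the stand-in "expensive" scorer, and the cutoffs of 200 and 10 are illustrative assumptions, not Bento's actual models or API.

```python
import heapq

def cheap_retrieval_score(user_vec, item_vec):
    # Assumed cheap scorer: a plain dot product, run on every candidate.
    return sum(u * i for u, i in zip(user_vec, item_vec))

def expensive_ranking_score(user_vec, item_vec):
    # Stand-in for a heavy ranking model; it only sees the survivors.
    dot = sum(u * i for u, i in zip(user_vec, item_vec))
    distance = sum((u - i) ** 2 for u, i in zip(user_vec, item_vec))
    return dot - 0.1 * distance

def recommend(user_vec, catalog, retrieve_k=200, final_k=10):
    """Two-stage funnel: cheap filter over all items, costly rank over few."""
    # Stage 1: keep the top retrieve_k items by the cheap score.
    candidates = heapq.nlargest(
        retrieve_k,
        catalog.items(),
        key=lambda kv: cheap_retrieval_score(user_vec, kv[1]),
    )
    # Stage 2: apply the expensive score to the survivors only.
    ranked = sorted(
        candidates,
        key=lambda kv: expensive_ranking_score(user_vec, kv[1]),
        reverse=True,
    )
    return [item_id for item_id, _ in ranked[:final_k]]
```

The point of the shape is cost: the expensive scorer runs at most retrieve_k times per request, regardless of how large the catalog grows.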

Bento's design splits cleanly into training and serving components. The training side is layered: a Core framework, user code, and configuration files. The serving side splits compute graphs between GPU and CPU so that each kind of hardware is used efficiently. This design allows hundreds of experiments per month while maintaining real-time responsiveness across a global user base.
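One way to picture a GPU/CPU graph split is the sketch below. The source does not say which operations run where, so the placement rule (sparse lookups on CPU, dense layers on GPU), the `Op` type, and the operation names are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    name: str
    kind: str  # "sparse" (e.g. feature lookup) or "dense" (e.g. matmul)

# Assumed placement rule, not taken from the source.
PLACEMENT = {"sparse": "cpu", "dense": "gpu"}

def split_graph(ops):
    """Group a topologically ordered op list into same-device segments."""
    segments = []
    for op in ops:
        device = PLACEMENT[op.kind]
        if segments and segments[-1][0] == device:
            # Same device as the previous op: extend the current segment.
            segments[-1][1].append(op.name)
        else:
            # Device changes here, so a new segment (and a hand-off) begins.
            segments.append((device, [op.name]))
    return segments

graph = [
    Op("lookup_user_features", "sparse"),
    Op("lookup_item_features", "sparse"),
    Op("mlp_layer_1", "dense"),
    Op("mlp_layer_2", "dense"),
]
print(split_graph(graph))
```

Each segment boundary is a device hand-off, so a split like this is cheapest when operations of the same kind sit next to each other in the graph.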