
LinkedIn Rewrites Feed with a Single LLM Retrieval Engine

ByteByteGo

LinkedIn's feed engineering team ripped out five disparate ranking subsystems and replaced them with a single LLM‑powered retrieval model that serves 1.3 billion users. The new dual‑encoder turns both member profiles and post content into vectors, then runs a nearest‑neighbor search to surface candidates in under 50 ms. Semantic embeddings let the system infer interests for brand‑new members from just a headline, eliminating the cold‑start lag. This consolidation removed cross‑system interference that previously made holistic improvements impossible.
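The dual-encoder idea can be sketched in a few lines. This is a toy illustration, not LinkedIn's implementation: a bag-of-words encoder stands in for the trained LLM towers, and all texts and names are hypothetical. The key property survives, though: members and posts land in the same vector space, and ranking candidates reduces to a nearest-neighbor search over dot products.

```python
import numpy as np

def build_vocab(texts):
    """Assign each distinct token an index (toy stand-in for a tokenizer)."""
    vocab = {}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def embed(text, vocab):
    """Toy encoder: bag-of-words counts, L2-normalized so the dot product
    of two embeddings equals their cosine similarity. The real system
    would run the text through a trained LLM tower instead."""
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

member_text = "machine learning engineer interested in recommender systems"
posts = [
    "new paper on recommender systems at scale",
    "top 10 travel destinations this summer",
    "contrastive learning for retrieval models",
]
vocab = build_vocab([member_text] + posts)

member_vec = embed(member_text, vocab)                    # member tower
post_matrix = np.stack([embed(p, vocab) for p in posts])  # post tower

# Nearest-neighbor retrieval: one matrix-vector product scores all candidates.
scores = post_matrix @ member_vec
ranking = np.argsort(-scores)
for i in ranking:
    print(f"{scores[i]:.3f}  {posts[i]}")
```

In production the post embeddings would be precomputed and held in an approximate nearest-neighbor index, so a query touches only a sliver of the corpus, which is what makes the sub-50 ms budget plausible at this scale.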

To feed structured profile data into the LLM, engineers built a prompt library that converts fields such as skills, work history, and engagement counts into templated text. Raw numeric values proved useless: a literal "views:12345" showed near-zero correlation with relevance. Bucketing counts into percentile tokens instead let the model learn magnitude, boosting the popularity-signal correlation thirty-fold and lifting Recall@10 by 15%. These changes also improved relevance for niche industries.
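The bucketing step can be sketched as follows. The bucket count, token format, and example distribution here are assumptions for illustration, not LinkedIn's actual scheme: bucket edges are fit from a corpus-wide count distribution, and each raw count is rendered as a coarse percentile token inside the templated prompt.

```python
import numpy as np

def fit_percentile_edges(counts, n_buckets=10):
    """Bucket edges from the empirical distribution of a count feature."""
    interior = np.linspace(0, 100, n_buckets + 1)[1:-1]  # 10th..90th percentiles
    return np.percentile(counts, interior)

def to_percentile_token(count, edges, name="views"):
    """Map a raw count to a coarse templated token like 'views:p80'."""
    bucket = int(np.searchsorted(edges, count, side="right"))
    pct = bucket * (100 // (len(edges) + 1))
    return f"{name}:p{pct}"

# Hypothetical corpus-wide view counts used to fit the buckets.
corpus_views = np.array([3, 10, 25, 60, 140, 300, 800, 2_000, 12_345, 50_000])
edges = fit_percentile_edges(corpus_views)

# Templated text the model actually sees, instead of a raw number.
profile = {"skills": "machine learning, search", "views": 12_345}
prompt = (f"Skills: {profile['skills']}. "
          f"Popularity: {to_percentile_token(profile['views'], edges)}.")
print(prompt)  # popularity renders as a percentile token, not "views:12345"
```

A percentile token collapses "views:12345" and "views:13000" into the same symbol, which is exactly what lets the model attach a stable notion of magnitude to it.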

Training data also shifted to include only positively engaged posts in each member's history, cutting sequence length and slashing GPU memory use by 37%. The smaller footprint fit 40% more examples per step, yielding a 2.6× speedup in iteration time. The team paired easy negatives (random unseen posts) with hard negatives (posts shown but ignored) to sharpen contrastive learning, delivering a cleaner, more efficient recommendation pipeline. Overall latency stayed within the strict service-level agreement.
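The easy/hard negative pairing can be illustrated with a softmax contrastive (InfoNCE-style) loss. The exact objective is an assumption, since the article does not spell it out, and the embeddings below are synthetic: the positive is built close to the member vector, hard negatives moderately close (shown but ignored), and easy negatives random.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

def contrastive_loss(member, positive, easy_negs, hard_negs, temperature=0.1):
    """Negative log-probability of the positive under a softmax over all
    candidates: the engaged post must outscore every sampled negative."""
    candidates = np.vstack([positive[None, :], easy_negs, hard_negs])
    logits = candidates @ member / temperature
    logits -= logits.max()                     # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

member = rng.normal(size=DIM)
positive = member + 0.1 * rng.normal(size=DIM)        # engaged post: very similar
easy_negs = rng.normal(size=(4, DIM))                 # random unseen posts
hard_negs = member + 0.8 * rng.normal(size=(2, DIM))  # shown-but-ignored: closer

loss = contrastive_loss(member, positive, easy_negs, hard_negs)
print(f"loss = {loss:.3f}")
```

Because hard negatives sit closer to the member embedding than random posts, they contribute larger terms to the softmax denominator, forcing the model to learn finer distinctions than easy negatives alone would.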