HeadlinesBriefing.com

Latent Agents Condense Multi-Agent Debate into One Model

Hacker News

Researchers from the University of Cambridge introduced Latent Agents, a technique that condenses multi‑agent debate into a single large language model. The method replaces lengthy debate transcripts with a two‑stage fine‑tuning process that learns debate structure and internalizes it through reward scheduling and length clipping. This approach cuts token usage by up to 93%.
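The summary does not say how reward scheduling and length clipping are defined, so the following is only a minimal sketch of one plausible reading: a token budget that shrinks over training, and a reward that is zeroed when a response exceeds the current budget. The function names, the linear schedule, and the all-or-nothing clipping are assumptions for illustration, not the authors' method.

```python
def token_budget(step: int, total_steps: int, start_budget: int, end_budget: int) -> int:
    """Hypothetical schedule: linearly shrink the allowed response length over training."""
    if total_steps <= 1:
        return end_budget
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return round(start_budget + frac * (end_budget - start_budget))


def clipped_reward(correct: bool, n_tokens: int, budget: int) -> float:
    """Hypothetical reward: 1.0 for a correct answer within budget, otherwise 0.0."""
    if n_tokens > budget:
        # Length clipping (assumed form): over-budget responses earn nothing.
        return 0.0
    return 1.0 if correct else 0.0
```

Under these assumptions, a budget shrinking from 2,000 to 200 tokens over 11 steps allows 1,100 tokens at the midpoint, so a correct 1,500-token answer that earned full reward early in training would earn none there. The intended effect is pressure toward shorter responses as training proceeds.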

Across several benchmarks, the distilled models matched or outperformed explicit multi‑agent debate while using far fewer tokens. Analysis of activation patterns revealed agent‑specific subspaces: interpretable directions that mirror distinct debate perspectives. These subspaces suggest that the model internally simulates separate agents without generating separate transcripts.
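The summary does not describe how these directions were found. One common way to estimate such a direction, shown here purely as an illustrative sketch, is a difference of means: average the hidden activations collected under two perspectives and normalize the gap between them. The helper names and the toy two-dimensional activations are assumptions.

```python
def mean_vector(rows: list[list[float]]) -> list[float]:
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit_direction(acts_a: list[list[float]], acts_b: list[list[float]]) -> list[float]:
    """Unit vector pointing from the mean of acts_b to the mean of acts_a."""
    mu_a, mu_b = mean_vector(acts_a), mean_vector(acts_b)
    diff = [a - b for a, b in zip(mu_a, mu_b)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("the two activation sets have identical means")
    return [d / norm for d in diff]


# Toy activations (hypothetical): perspective A differs from B only along the first axis.
direction = unit_direction([[2.0, 0.0], [4.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]])
```

In a real model the vectors would have thousands of dimensions and come from a chosen layer; the toy data only shows the mechanics.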

The team also tested malicious agent injection by embedding harmful personas into the LLM via internalized debate, then applying negative steering to suppress them. Results showed that the distilled model localizes harmful behavior more cleanly, so suppressing it costs less performance than steering a base model does. This demonstrates a practical path to safer reasoning in large language models.
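Activation steering generally means adding a scaled direction vector to a hidden state, and negative steering uses a negative coefficient to push the state away from that direction. The sketch below shows that arithmetic on a toy vector; the function name, the coefficient, and the example direction are assumptions, and the summary does not say which layers or coefficients the team used.

```python
def steer(hidden: list[float], direction: list[float], alpha: float) -> list[float]:
    """Return hidden + alpha * direction; a negative alpha suppresses the direction."""
    if len(hidden) != len(direction):
        raise ValueError("hidden state and direction must have the same length")
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Hypothetical example: the persona direction is the first axis.
persona_direction = [1.0, 0.0]
steered = steer([1.0, 2.0], persona_direction, alpha=-0.5)
```

Only the component along the persona direction changes here, which is the intuition behind the localization claim: the more cleanly a behavior sits along one direction, the less the rest of the representation is disturbed by suppressing it.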

By distilling debate into a single model, Latent Agents offers a scalable alternative to compute‑heavy multi‑agent setups. It preserves reasoning quality, reduces token costs, and provides a clearer window into internal decision spaces, which is valuable for both researchers and practitioners seeking efficient, controllable AI.