HeadlinesBriefing.com

How LLM Reasoning Unlocks Hidden Factual Knowledge

Google AI Blog

Large language models that use chain-of-thought reasoning perform better on complex tasks, but Google AI researchers found that the same approach also helps with simple factual questions. Asked for the year Mary Engle Pennington was inducted into the National Inventors Hall of Fame, for example, a model that reasons first can retrieve an answer that would otherwise stay hidden in its parametric memory.

The team tested Gemini-2.5 (Flash and Pro) and Qwen3-32B on single-hop questions and found that reasoning dramatically improved recall rates. They identified two mechanisms: a computational buffer effect, in which extra tokens give the model more processing time, and factual priming, in which related facts surface and act as semantic bridges to the answer. Even meaningless repeated text such as 'Let me think' boosted performance by giving models more computational runway.
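To make the comparison concrete, here is a minimal sketch of how the three conditions (direct answer, filler-only padding, and free reasoning) might be laid out as prompts. The function name `build_prompts`, the prompt wording, and the `repeats` parameter are assumptions for illustration; the brief does not describe the researchers' exact setup.

```python
def build_prompts(question: str,
                  filler: str = "Let me think.",
                  repeats: int = 20) -> dict[str, str]:
    """Build three hypothetical prompt variants for one factual question."""
    # Direct: no room for intermediate tokens before the answer.
    direct = f"{question}\nAnswer with only the final answer."
    # Filler: extra tokens that carry no facts, only "computational runway".
    padding = " ".join([filler] * repeats)
    padded = f"{question}\n{padding}\nAnswer:"
    # Reasoning: the model may surface related facts before answering.
    reasoned = (f"{question}\n"
                "Recall related facts step by step, then give the final answer.")
    return {"direct": direct, "filler": padded, "reasoning": reasoned}


prompts = build_prompts(
    "In what year was Mary Engle Pennington inducted into the "
    "National Inventors Hall of Fame?",
    repeats=3,
)
```

Comparing recall across the `direct` and `filler` variants would isolate the buffer effect, while the gap between `filler` and `reasoning` would point to factual priming.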

However, this self-retrieval mechanism has a critical weakness. When intermediate facts in reasoning traces contain hallucinations, models become significantly less likely to reach correct final answers. The researchers built an auditing pipeline to verify hundreds of thousands of reasoning steps, confirming that factual accuracy in intermediate stages directly impacts final performance.
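The relationship between intermediate hallucinations and final accuracy can be illustrated with a small calculation. This is a sketch only: the `Trace` structure and the toy verdicts below are invented for illustration and are not the researchers' pipeline or data.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    # One verdict per intermediate fact: True = verified, False = hallucinated.
    step_verdicts: list[bool]
    final_correct: bool


def accuracy_by_trace_quality(traces: list[Trace]) -> tuple[float, float]:
    """Return (accuracy of fully verified traces, accuracy of traces with
    at least one hallucinated step). A trace with no steps counts as verified."""
    clean = [t.final_correct for t in traces if all(t.step_verdicts)]
    flawed = [t.final_correct for t in traces if not all(t.step_verdicts)]

    def rate(xs: list[bool]) -> float:
        return sum(xs) / len(xs) if xs else float("nan")

    return rate(clean), rate(flawed)


# Toy input, made up purely to exercise the function.
traces = [
    Trace([True, True], True),
    Trace([True], True),
    Trace([True, False], False),
    Trace([False], True),
]
clean_acc, flawed_acc = accuracy_by_trace_quality(traces)
```

A gap between the two rates on real audited traces is the kind of signal the paragraph above describes.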

These findings suggest practical improvements for model reliability. Generating multiple reasoning trajectories and selecting only those with verifiable intermediate facts yields better accuracy. Training with process rewards for factually supported steps could produce inherently more reliable models with reduced hallucination rates.
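The selection idea can be sketched as a filter followed by a majority vote. The function `select_answer`, the `verify_fact` callback, and the placeholder facts are hypothetical; a real system would need an actual fact verifier, and voting is one of several possible ways to pick among the surviving trajectories.

```python
from collections import Counter
from typing import Callable, Optional


def select_answer(
    trajectories: list[tuple[list[str], str]],
    verify_fact: Callable[[str], bool],
) -> Optional[str]:
    """Keep trajectories whose intermediate facts all verify, then
    return the most common final answer among them (None if none survive)."""
    kept = [answer for facts, answer in trajectories
            if all(verify_fact(f) for f in facts)]
    if not kept:
        return None
    return Counter(kept).most_common(1)[0][0]


# Placeholder verifier: a fact "verifies" if it is in a known set.
known = {"fact A", "fact B"}
trajectories = [
    (["fact A", "fact B"], "X"),
    (["fact A", "fact Z"], "Y"),  # "fact Z" cannot be verified
    (["fact B"], "X"),
    (["fact Z"], "Y"),
    (["fact Q"], "Y"),
]
chosen = select_answer(trajectories, lambda f: f in known)
```

In this toy case a plain majority vote over all five trajectories would pick "Y", while filtering on verifiable facts leaves only the two trajectories that answer "X".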