HeadlinesBriefing favicon HeadlinesBriefing.com

Context Engineering for RAG: How Four Typed Inputs Power Every LLM Answer

Towards Data Science •
×

The practice of context engineering for RAG systems now has a name, thanks to Tobi Lütke and Andrej Karpathy who coined the term in mid-2025. What teams were already building in production systems—structured pipelines feeding LLMs with carefully assembled context—finally received proper terminology. The shift reframes prompt engineering as just one component within broader context assembly.

For single-document RAG workflows, four architectural bricks handle the work: document parsing, question parsing, retrieval, and generation. Each emits typed outputs rather than raw text. Document parsing produces relational tables with metadata. Question parsing generates structured queries with intent labels and shape expectations. Retrieval filters results with audit trails. Generation consumes these typed pieces to produce Pydantic answers with citations.

These typed channels converge into one LLM call with a fixed system prompt and assembled user content. The approach treats context assembly as software architecture—complete with typed objects, component contracts, and caching strategies. Code examples in the article demonstrate this pattern using Pydantic models and DataFrames, showing how production teams actually structure their RAG implementations.

The Enterprise Document Intelligence series builds this architecture from the ground up. While corpus-level and conversational extensions remain future work, the core insight stands: naming the practice validates what successful teams already do, making it easier to discuss and improve upon real-world implementations.