Demand-Driven Context (DDC) is a TDD-inspired methodology for building enterprise knowledge bases: give an AI agent a real problem, let it fail, then curate only the minimum knowledge needed for it to succeed. Nine cycles produced 46 reusable entities for a retail SRE agent.
Reconstructed from Figure 5, Navakoti & Navakoti (2026). New entity creation declines and reuse rises as the knowledge base approaches convergence.
DDC occupies the human-curated × domain knowledge quadrant — the only approach combining all five core components (Figure 1, Navakoti & Navakoti, 2026).
Give a large language model a problem in software architecture, medicine, or law, and it performs remarkably well. Give it a problem specific to your enterprise — one involving your internal systems, your domain-specific terminology, your operational workflows — and it performs like a new employee on their first day: expert reasoning, zero institutional memory. This is not a model capability problem. It is a knowledge architecture problem.
Two existing strategies try to fix this. Top-down knowledge engineering attempts to document everything before agents use it — but in large enterprises, knowing what knowledge exists and where it resides is itself a hard problem. These efforts produce bloated, untested knowledge bases full of information nobody ever needed. Bottom-up automation (systems like Reflexion, ExpeL, and ACE) lets agents learn from their own failures — but these approaches optimize execution strategy, not domain knowledge. An agent with excellent reasoning strategies still cannot answer "Why does the nightly batch job retry three times before escalating to the on-call team?" without someone providing that context.
Demand-Driven Context (DDC) inverts the knowledge engineering process. The core insight is borrowed from Test-Driven Development: just as TDD writes a failing test before writing code, DDC gives an agent a failing problem before curating context. The agent's failure tells you precisely what domain knowledge is missing. A human domain expert then curates only the minimum context needed for the agent to succeed. Nothing more.
Each DDC cycle runs through nine steps: (1) a real problem arrives; (2) the agent attempts it with its current, possibly empty, knowledge base and fails; (3) the agent generates an information checklist, a structured list of what it needs; (4) a domain expert provides targeted answers for each checklist item; (5) the agent re-attempts with the new context; (6) a human validates the output; (7) validated knowledge is graduated to the permanent knowledge base as typed entities with YAML frontmatter, explicit relationships, and defined metadata; (8) if the human rejects the output, the correction loop repeats; (9) everything is logged for convergence analysis.
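The loop is simple enough to sketch in a few lines. Here is a minimal Python rendering of one cycle; the agent, expert, validator, kb, and log objects are hypothetical interfaces invented for illustration, not an API defined in the paper.

```python
# A minimal sketch of one DDC cycle. All interfaces (agent, expert,
# validator, kb, log) are hypothetical stand-ins for the paper's workflow.
def run_ddc_cycle(problem, agent, expert, validator, kb, log):
    context = []
    attempt = agent.attempt(problem, kb, context)       # steps 1-2: attempt with current KB
    while not validator.accepts(attempt):               # steps 6 and 8: validation gate / correction loop
        checklist = agent.information_checklist(problem, attempt)  # step 3
        context += expert.answer(checklist)             # step 4: targeted expert answers
        attempt = agent.attempt(problem, kb, context)   # step 5: re-attempt with new context
        log.record(problem, attempt, checklist)         # step 9: logged for convergence analysis
    kb.graduate(context)                                # step 7: validated knowledge becomes typed entities
    return attempt
```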
The methodology defines a typed entity meta-model covering systems, capabilities, processes, data models, business jargon, technical jargon, constraints, and failure modes. Entities declare their relationships to each other — the resulting graph is traversable at runtime by future agents, compressing onboarding and eliminating the correction cascades that currently burden senior engineers.
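To make the meta-model concrete, the sketch below parses a markdown entity with YAML frontmatter and assembles a relationship graph an agent could traverse at runtime. The entity content, field names, and relationship kinds are invented for the example; the paper defines the meta-model, not this exact schema. (Requires PyYAML.)

```python
import yaml  # PyYAML
from dataclasses import dataclass, field

# Hypothetical example of a graduated entity: YAML frontmatter + markdown body.
SAMPLE_ENTITY = """\
---
id: order-assignment-queue
type: system
created_in_cycle: 1
relationships:
  - {kind: feeds, target: fulfillment-service}
  - {kind: explained_by, target: ready-to-assign}
---
Holds service orders awaiting assignment. Contention here shows up as
orders stuck in "Ready to Assign".
"""

@dataclass
class Entity:
    id: str
    type: str
    relationships: list = field(default_factory=list)
    body: str = ""

def parse_entity(text: str) -> Entity:
    # Split the YAML frontmatter (between the first two '---' fences) from the body.
    _, frontmatter, body = text.split("---", 2)
    meta = yaml.safe_load(frontmatter)
    return Entity(meta["id"], meta["type"], meta.get("relationships", []), body.strip())

def build_graph(entities: list) -> dict:
    # Adjacency list keyed by entity id; future agents traverse this at runtime.
    return {e.id: [(r["kind"], r["target"]) for r in e.relationships] for e in entities}

print(build_graph([parse_entity(SAMPLE_ENTITY)]))
# {'order-assignment-queue': [('feeds', 'fulfillment-service'),
#                             ('explained_by', 'ready-to-assign')]}
```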
The authors demonstrate DDC in a retail order fulfillment domain, targeting an SRE incident management agent. Nine cycles, each triggered by a realistic incident scenario, produced 46 knowledge entities covering the core systems, processes, jargon, and failure modes of the domain. Cycle 1, where the agent had zero context, created 8 entities just by failing on a single service-order queue contention incident. By Cycle 6, the agent required multiple rejected attempts before succeeding on a cross-region deployment error, because the root cause was corrupted configuration data rather than an application logic failure. Each rejected attempt was logged and contributed to a richer understanding of what the agent had gotten wrong.
The convergence pattern is visible in Figure 5 of the paper: new entity creation drops across cycles while entity reuse rises. The convergence hypothesis suggests that after 20–30 cycles for a given domain role, the knowledge base stabilizes enough that each new problem reuses more entities than it creates. At that point, the KB has achieved sufficient coverage for the role's core scenarios.
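That hypothesis can be read straight off the cycle logs: flag the KB as converging once recent cycles reuse more entities than they create. The counts below are invented for illustration, not data from the paper.

```python
# Toy per-cycle log: entities newly created vs. reused (illustrative numbers only).
cycle_log = [
    {"cycle": 1, "new": 8, "reused": 0},
    {"cycle": 3, "new": 6, "reused": 3},
    {"cycle": 6, "new": 4, "reused": 9},
    {"cycle": 9, "new": 2, "reused": 12},
]

def is_converging(log, window=2):
    """True when every cycle in the trailing window reuses more entities than it creates."""
    return all(c["reused"] > c["new"] for c in log[-window:])

print(is_converging(cycle_log))  # True: the last two logged cycles reuse more than they create
```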
Crucially, DDC and automated approaches like ACE are complementary, not competing. ACE optimizes how an agent behaves (execution strategies, tool-use patterns). DDC curates what an enterprise knows (system descriptions, business rules, architectural decisions). An agent could use ACE-style playbooks for its reasoning strategy while drawing on a DDC-built knowledge base for domain context.
Write a failing test first. In DDC: give the agent a real problem it cannot yet solve. Failure is a feature — it reveals exactly what's missing.
Make the test pass. In DDC: curate only the minimum knowledge needed to get the agent to a validated answer. No premature generalization.
Clean up and commit. In DDC: graduate validated entities to the permanent knowledge base as typed, versioned markdown files with YAML metadata.
Cycle 1: Zero context. Thousands of service orders stuck in "Ready to Assign." The agent produced only generic advice. Outcome: 8 new entities covering systems, message flow, and domain jargon.
Cycle 6: Three attempts before acceptance. The agent initially fabricated a non-existent "data replication pipeline"; the actual root cause was a deployment script run against the wrong regional environment. Outcome: 7 new entities, including a four-eyes-principle entity.
Later cycles: New entity counts drop as existing entities from prior cycles are reused rather than re-created. The knowledge base is approaching the coverage threshold for the SRE incident management role.
Navakoti, R., & Navakoti, S. (2026). Demand-Driven Context: A Methodology for Building Enterprise Knowledge Bases Through Agent Failure. arXiv:2603.14057v1 [cs.AI]. https://arxiv.org/abs/2603.14057