---
id: KONT-WP-0007
type: workplan
title: "Governed Retrieval And Context Graph"
domain: markitect
repo: kontextual-engine
status: todo
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 7
created: "2026-05-05"
updated: "2026-05-05"
state_hub_workstream_id: "64352515-9677-46bb-909a-9e2db4915dc7"
---

# KONT-WP-0007: Governed Retrieval And Context Graph

## Purpose

Build retrieval as a governed operational capability: stable query contracts, text search, metadata and lifecycle filtering, contextual entities, relationship traversal, source-grounded snippets, permission checks, and quality feedback.

## Requirement Coverage

Primary: FR-040 to FR-050 and FR-060 to FR-071.

Supporting: FR-120 to FR-126, FR-143 to FR-146, FR-163, FR-200 to FR-204.

## Architecture Constraint

Implement retrieval through retrieval services, search ports, repository ports, and policy checks described in `docs/architecture-blueprint.md`. Search indexes and ranking backends are adapters; they must not define the stable query or result contracts.

## markitect-tool Boundary Remark

For Markdown-backed assets, retrieval adapters may use Markitect selectors, extraction helpers, local index concepts, and context-package source spans to produce grounded units and snippets. Engine retrieval contracts, result envelopes, policy filtering, pagination, feedback, and cross-format search remain engine-owned.

## R7.1 - Implement query contracts pagination sorting and result envelopes

```task
id: KONT-WP-0007-T001
status: todo
priority: high
state_hub_task_id: "5a1b0661-ce22-4ee6-a9e7-0aedce9d4356"
```

Define query requests, result envelopes, deterministic pagination, sorting, diagnostics, and correlation IDs.

Acceptance:

- Repeated equivalent queries return stable ordering within documented limits.
- Results include asset IDs, representation references, metadata, source references, and diagnostics.
- Invalid queries return structured validation errors.
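A minimal sketch of what the R7.1 contract could look like, assuming nothing about the eventual engine API. The names `QueryRequest`, `ResultItem`, `ResultEnvelope`, and `paginate`, the offset-based paging, and the 1 to 100 page-size limit are all hypothetical and chosen only for illustration. The point it shows is that a tie-breaker on asset ID makes ordering deterministic, and that invalid queries come back as structured errors rather than exceptions.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QueryRequest:
    """Hypothetical query contract; field names are illustrative only."""
    text: str
    page_size: int = 10
    offset: int = 0
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def validate(self) -> list[dict[str, str]]:
        # Return structured validation errors instead of raising.
        errors: list[dict[str, str]] = []
        if not self.text.strip():
            errors.append({"field": "text", "code": "empty_query"})
        if not 1 <= self.page_size <= 100:  # assumed limit
            errors.append({"field": "page_size", "code": "out_of_range"})
        if self.offset < 0:
            errors.append({"field": "offset", "code": "negative_offset"})
        return errors


@dataclass(frozen=True)
class ResultItem:
    asset_id: str
    score: float
    source_ref: str


@dataclass(frozen=True)
class ResultEnvelope:
    correlation_id: str
    items: tuple[ResultItem, ...]
    total: int
    errors: tuple[dict[str, str], ...] = ()


def paginate(request: QueryRequest, hits: list[ResultItem]) -> ResultEnvelope:
    errors = request.validate()
    if errors:
        return ResultEnvelope(request.correlation_id, (), 0, tuple(errors))
    # Sort by score descending, then asset_id ascending, so that equal
    # scores always come back in the same order.
    ordered = sorted(hits, key=lambda h: (-h.score, h.asset_id))
    page = ordered[request.offset:request.offset + request.page_size]
    return ResultEnvelope(request.correlation_id, tuple(page), len(ordered))
```

Because the envelope echoes the request's correlation ID, a caller can tie diagnostics and audit records back to the originating query regardless of which search adapter produced the hits.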
## R7.2 - Implement lexical search over normalized content

```task
id: KONT-WP-0007-T002
status: todo
priority: high
state_hub_task_id: "5ec90dcb-473c-4d01-85f2-8db18de0b7d1"
```

Implement MVP lexical search over normalized representations without making semantic/vector search a blocker.

Acceptance:

- Text search returns matching assets with relevance metadata.
- Search indexes can be refreshed after ingestion or update.
- p95 latency and zero-result rate can be measured in smoke tests.

## R7.3 - Implement metadata lifecycle and source-context filters

```task
id: KONT-WP-0007-T003
status: todo
priority: high
state_hub_task_id: "9e7d0a5c-71d4-44ca-9b71-70f2206e4a02"
```

Support filters by asset type, collection, source, owner, tags, classification, sensitivity, lifecycle state, timestamps, and custom metadata.

Acceptance:

- Text search and metadata filters can be combined.
- Lifecycle and sensitivity filters participate in permission checks.
- Filter behavior is covered across in-memory and durable backends where supported.

## R7.4 - Implement contextual entity model and relationship retrieval

```task
id: KONT-WP-0007-T004
status: todo
priority: high
state_hub_task_id: "b3358059-ac58-4e37-985c-6e8c1cc6df30"
```

Represent contextual entities such as people, teams, projects, cases, topics, source systems, processes, products, and generated artifacts.

Acceptance:

- Assets can be linked to contextual entities.
- Relationship direction, type, validity, confidence, actor, and provenance are represented where available.
- Callers can retrieve assets by project, case, topic, source, workflow run, or related asset.

## R7.5 - Enforce permission-aware retrieval and fail-closed semantics

```task
id: KONT-WP-0007-T005
status: todo
priority: high
state_hub_task_id: "c6c93713-3ab1-41fb-bf35-15dd860b66fa"
```

Apply authorization and policy checks before returning content, metadata, snippets, relationships, derived artifacts, or context packages.
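One possible reading of "check before returning" combined with fail-closed semantics is sketched below. `PermissionContext`, `filter_authorized`, and the 300-second staleness window are hypothetical; the real policy model is defined elsewhere and may differ. The sketch only shows the deny-by-default shape: a missing or stale context yields no results at all.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Assumed staleness window; the real value would come from policy.
MAX_CONTEXT_AGE_SECONDS = 300.0


@dataclass(frozen=True)
class PermissionContext:
    """Hypothetical permission snapshot for one actor."""
    actor: str
    allowed_asset_ids: frozenset[str]
    issued_at: float  # epoch seconds


def filter_authorized(
    asset_ids: list[str],
    context: Optional[PermissionContext],
    now: float,
) -> list[str]:
    # Fail closed: no context means nothing is returned.
    if context is None:
        return []
    # Fail closed: a stale context is treated the same as a missing one.
    if now - context.issued_at > MAX_CONTEXT_AGE_SECONDS:
        return []
    # Only explicitly allowed assets pass; input order is preserved.
    return [a for a in asset_ids if a in context.allowed_asset_ids]
```

The same gate would need to sit in front of snippets, relationship traversal, and derived packages, not only the top-level result list.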
Acceptance:

- Unauthorized assets do not leak through result lists, snippets, relationship traversal, or derived answer packages.
- Missing or stale permission context fails closed according to policy.
- Retrieval audit events capture actor, query scope, outcome, and policy context.

## R7.6 - Return source-grounded snippets citations and explanation data

```task
id: KONT-WP-0007-T006
status: todo
priority: medium
state_hub_task_id: "1a6d5a95-d87a-447a-a186-cb73162cd9a1"
```

Return matched regions, snippets, source references, representation IDs, relationship context, and citation-ready data for grounded AI workflows.

Acceptance:

- Results explain why they were returned and where they originated.
- Snippets are permission filtered.
- Retrieval packages are suitable for later grounded answer generation.
- Markdown snippets can reference Markitect selector matches or context-package spans as adapter provenance.

## R7.7 - Capture retrieval feedback and KPI measurement hooks

```task
id: KONT-WP-0007-T007
status: todo
priority: medium
state_hub_task_id: "e17e2839-400f-4348-98e3-f77acc0b2fde"
```

Capture relevance feedback and quality signals for retrieval improvement.

Acceptance:

- Feedback can mark results useful, irrelevant, missing, unsafe, or low confidence.
- Query context and result metadata are stored with feedback.
- Precision@k, zero-result rate, permission-filter latency, and citation precision have measurement hooks.

## Definition Of Done

- Retrieval tests cover text, metadata, lifecycle, relationship, contextual entity, pagination, permission, snippet, and feedback behavior.
- Retrieval does not bypass policy or source provenance.
- Search, relationship, and context retrieval contracts follow `docs/architecture-blueprint.md`.
- `python3 -m pytest` passes.
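As a reference point for the KPI hooks listed under R7.7, two of the named metrics can be computed as sketched here. The function names and input shapes are assumptions made for illustration; where the engine actually records these signals is not specified by this workplan. Precision@k uses the common definition that divides by `k` even when fewer than `k` results are returned.

```python
from __future__ import annotations


def precision_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the top-k slots filled by results marked relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    # Empty slots count as misses because the denominator stays at k.
    return sum(1 for asset_id in top if asset_id in relevant_ids) / k


def zero_result_rate(result_counts: list[int]) -> float:
    """Fraction of queries that returned no results at all."""
    if not result_counts:
        return 0.0
    return sum(1 for count in result_counts if count == 0) / len(result_counts)
```

A pytest smoke test could feed these functions a small fixed set of queries and feedback labels, which would keep the measurement hooks exercised without depending on a particular search backend.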