stable asset queries, lexical search, filters, contextual entity and relationship retrieval, permission-aware fail-closed behavior, source-grounded snippets, feedback capture, and KPI hooks

2026-05-06 16:27:03 +02:00
parent 80a3e59701
commit 1e3c6fe34a
13 changed files with 3173 additions and 9 deletions
--- a/workplans/KONT-WP-0007-governed-retrieval-context-graph.md
+++ b/workplans/KONT-WP-0007-governed-retrieval-context-graph.md
@@ -4,13 +4,13 @@ type: workplan
 title: "Governed Retrieval And Context Graph"
 domain: markitect
 repo: kontextual-engine
-status: todo
+status: done
 owner: codex
 topic_slug: markitect
 planning_priority: high
 planning_order: 7
 created: "2026-05-05"
-updated: "2026-05-05"
+updated: "2026-05-06"
 state_hub_workstream_id: "64352515-9677-46bb-909a-9e2db4915dc7"
 ---

@@ -44,11 +44,32 @@ produce grounded units and snippets. Engine retrieval contracts, result
 envelopes, policy filtering, pagination, feedback, and cross-format search
 remain engine-owned.

+## Implementation Status
+
+As of 2026-05-06, the first retrieval slice is recorded in
+`docs/retrieval-implementation.md`. It establishes asset query request/result
+contracts, stable sorting and pagination, result envelopes with source
+references, representations, metadata records, refreshable lexical search,
+relevance metadata, zero-result smoke metadata, and structured validation
+diagnostics. It also supports combined metadata, lifecycle, source-context,
+tag, collection, timestamp, and representation filters across in-memory and
+SQLite-backed repositories. The contextual graph slice adds direct contextual
+entity and relationship query envelopes plus asset filters by contextual
+entity, workflow run, related asset, and relationship predicate.
+Permission-aware retrieval uses the engine policy gateway for query-scope
+and per-resource checks, with fail-closed denied envelopes and retrieval
+audit events. Lexical queries can also return source-grounded snippet
+packets with representation/source references and adapter provenance.
+Feedback and KPI hooks persist retrieval feedback and derive zero-result,
+precision, citation precision, safety, confidence, and permission-filter
+timing signals. Remaining work is focused on multi-hop graph traversal and
+ranking.
+
 ## R7.1 - Implement query contracts pagination sorting and result envelopes

 ```task
 id: KONT-WP-0007-T001
-status: todo
+status: done
 priority: high
 state_hub_task_id: "5a1b0661-ce22-4ee6-a9e7-0aedce9d4356"
 ```
@@ -63,11 +84,22 @@ Acceptance:
  references, and diagnostics.
 - Invalid queries return structured validation errors.

+Implemented:
+
+- `AssetQueryRequest`, `AssetQueryItem`, `AssetQueryResult`, and
+  `AssetRetrievalService` provide the stable asset query contract.
+- Queries return deterministic ordering with pagination metadata and
+  correlation IDs.
+- Result entries expose asset identity, classification, source references,
+  representations, and metadata records.
+- Invalid lifecycle, representation kind, sort key, sort order, limit, and
+  offset return structured diagnostics without raising raw exceptions.
+
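
The contract behavior listed above can be sketched roughly as follows. This is an illustration only, not the engine's code: `QueryRequest`, `QueryResult`, `query_assets`, and every field name are hypothetical stand-ins, since the workplan does not show the real `AssetQueryRequest` fields. Only the behaviors (deterministic ordering, pagination metadata, diagnostics instead of raised exceptions) come from the list.

```python
from dataclasses import dataclass, field

# Hypothetical allow-lists; the real sort keys are not given in the workplan.
ALLOWED_SORT_KEYS = {"title", "updated"}
ALLOWED_SORT_ORDERS = {"asc", "desc"}


@dataclass
class QueryRequest:
    sort_key: str = "title"
    sort_order: str = "asc"
    limit: int = 10
    offset: int = 0


@dataclass
class QueryResult:
    items: list = field(default_factory=list)
    total: int = 0
    diagnostics: list = field(default_factory=list)


def validate(request):
    """Collect structured diagnostics instead of raising."""
    problems = []
    if request.sort_key not in ALLOWED_SORT_KEYS:
        problems.append({"field": "sort_key", "code": "invalid_value"})
    if request.sort_order not in ALLOWED_SORT_ORDERS:
        problems.append({"field": "sort_order", "code": "invalid_value"})
    if request.limit < 1:
        problems.append({"field": "limit", "code": "out_of_range"})
    if request.offset < 0:
        problems.append({"field": "offset", "code": "out_of_range"})
    return problems


def query_assets(assets, request):
    diagnostics = validate(request)
    if diagnostics:
        return QueryResult(diagnostics=diagnostics)
    # Tie-break on the asset id so equal sort values keep a deterministic order.
    ordered = sorted(
        assets,
        key=lambda a: (a[request.sort_key], a["id"]),
        reverse=request.sort_order == "desc",
    )
    page = ordered[request.offset:request.offset + request.limit]
    return QueryResult(items=page, total=len(ordered))
```

Including the id in the sort key is one simple way to keep pages stable when many assets share the same title or timestamp.
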
 ## R7.2 - Implement lexical search over normalized content

 ```task
 id: KONT-WP-0007-T002
-status: todo
+status: done
 priority: high
 state_hub_task_id: "5ec90dcb-473c-4d01-85f2-8db18de0b7d1"
 ```
@@ -81,11 +113,23 @@ Acceptance:
 - Search indexes can be refreshed after ingestion or update.
 - p95 latency and zero-result rate can be measured in smoke tests.

+Implemented:
+
+- Normalized ingestion now stores representation search text and length
+  metadata for retrieval indexing.
+- `AssetRetrievalService.refresh_index()` builds a refreshable lexical index
+  with indexed asset and representation counts.
+- Text queries perform lexical substring matching over normalized
+  representations and return relevance metadata including strategy, query,
+  match count, and matching representation IDs.
+- Query result metadata includes zero-result and lexical index statistics for
+  later smoke/performance measurement.
+
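
As a rough picture of what a refreshable substring index with relevance metadata might look like, here is a toy version. The `LexicalIndex` class, its method names, and the dict keys are assumptions made for the example; the workplan only names `AssetRetrievalService.refresh_index()` and does not show its signature or return shape.

```python
class LexicalIndex:
    """Toy refreshable substring index; all names here are hypothetical."""

    def __init__(self):
        # Each entry is (asset_id, representation_id, lowercased text).
        self._entries = []

    def refresh(self, representations):
        """Rebuild the index from scratch and report what was indexed."""
        self._entries = [
            (r["asset_id"], r["id"], r["text"].lower()) for r in representations
        ]
        return {
            "indexed_assets": len({entry[0] for entry in self._entries}),
            "indexed_representations": len(self._entries),
        }

    def search(self, query):
        """Substring match per representation, grouped by asset."""
        needle = query.lower()
        hits = {}
        for asset_id, rep_id, text in self._entries:
            count = text.count(needle) if needle else 0
            if count:
                hit = hits.setdefault(
                    asset_id,
                    {
                        "strategy": "lexical_substring",
                        "query": query,
                        "match_count": 0,
                        "representation_ids": [],
                    },
                )
                hit["match_count"] += count
                hit["representation_ids"].append(rep_id)
        # The zero-result flag is what a smoke test could aggregate later.
        return {"results": hits, "zero_result": not hits}
```

A full rebuild on refresh keeps the sketch short; an incremental refresh would be a separate design decision.
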
 ## R7.3 - Implement metadata lifecycle and source-context filters

 ```task
 id: KONT-WP-0007-T003
-status: todo
+status: done
 priority: high
 state_hub_task_id: "9e7d0a5c-71d4-44ca-9b71-70f2206e4a02"
 ```
@@ -100,11 +144,24 @@ Acceptance:
 - Filter behavior is covered across in-memory and durable backends where
  supported.

+Implemented:
+
+- Asset queries support filters for asset type, lifecycle, sensitivity, owner,
+  topic, review state, source system/path, representation kind, collection,
+  tags, created/updated timestamp bounds, and custom metadata records.
+- Text search can be combined with standard, source, tag, collection,
+  sensitivity, and metadata filters.
+- Combined filter behavior is covered over in-memory and SQLite-backed asset
+  repositories.
+- Permission enforcement is handled under R7.5, not here; the lifecycle and
+  sensitivity filters only establish the policy inputs and do not claim
+  authorization semantics.
+
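
One way to picture the backend parity check mentioned above is to run the same combined filter against a Python list and an in-memory SQLite table and compare the results. The table layout, column names, and helper functions below are hypothetical and far simpler than the real repositories.

```python
import sqlite3

# Hypothetical sample data and columns, for illustration only.
ASSETS = [
    {"id": "a1", "lifecycle": "active", "sensitivity": "internal", "tag": "retrieval"},
    {"id": "a2", "lifecycle": "archived", "sensitivity": "internal", "tag": "retrieval"},
    {"id": "a3", "lifecycle": "active", "sensitivity": "public", "tag": "retrieval"},
]
FILTERABLE = ("lifecycle", "sensitivity", "tag")


def filter_in_memory(assets, **filters):
    """Keep assets whose fields equal every requested filter value."""
    return sorted(
        a["id"] for a in assets if all(a[k] == v for k, v in filters.items())
    )


def make_db(assets):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assets (id TEXT PRIMARY KEY, lifecycle TEXT, "
        "sensitivity TEXT, tag TEXT)"
    )
    conn.executemany(
        "INSERT INTO assets VALUES (:id, :lifecycle, :sensitivity, :tag)", assets
    )
    return conn


def filter_sqlite(conn, **filters):
    """Same filter as SQL; column names are allow-listed, values are bound."""
    unknown = set(filters) - set(FILTERABLE)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    where = " AND ".join(f"{name} = ?" for name in filters) or "1 = 1"
    rows = conn.execute(
        f"SELECT id FROM assets WHERE {where} ORDER BY id", tuple(filters.values())
    )
    return [row[0] for row in rows]
```

Comparing the two result lists in a test is a cheap way to catch a filter that behaves differently on the durable backend.
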
 ## R7.4 - Implement contextual entity model and relationship retrieval

 ```task
 id: KONT-WP-0007-T004
-status: todo
+status: done
 priority: high
 state_hub_task_id: "b3358059-ac58-4e37-985c-6e8c1cc6df30"
 ```
@@ -120,11 +177,27 @@ Acceptance:
 - Callers can retrieve assets by project, case, topic, source, workflow run, or
  related asset.

+Implemented:
+
+- Existing `ContextEntity`/`CoreRelationship` primitives are reused as the
+  canonical model; entity types now include workflow runs and generated
+  artifacts for operational graph use cases.
+- `ContextEntityQueryRequest`/`ContextEntityQueryResult` provide stable
+  contextual entity lookup by type, name, external reference, and metadata.
+- `RelationshipQueryRequest`/`RelationshipQueryResult` provide stable
+  relationship retrieval by source, target, asset, contextual entity,
+  workflow run, predicate, target kind, and direction.
+- Asset queries can filter by contextual entity, workflow run, related asset,
+  and relationship predicate while returning relationship and contextual
+  entity context for matched assets.
+- Graph retrieval behavior is covered across in-memory and SQLite-backed
+  repositories.
+
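
The relationship retrieval described above (by node, predicate, and direction) might look like this in miniature. The `Relationship` dataclass, the `asset:`/`run:` id prefixes, and both functions are assumptions for the example and do not reflect the actual `CoreRelationship` or `RelationshipQueryRequest` shapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    # Hypothetical edge shape: ids are plain strings such as "asset:a1".
    source: str
    predicate: str
    target: str


def query_relationships(relationships, node, predicate=None, direction="both"):
    """Return edges touching `node`, optionally limited by predicate/direction."""
    if direction not in {"outgoing", "incoming", "both"}:
        raise ValueError(f"unknown direction: {direction}")
    matches = []
    for rel in relationships:
        outgoing = rel.source == node and direction in {"outgoing", "both"}
        incoming = rel.target == node and direction in {"incoming", "both"}
        if (outgoing or incoming) and predicate in (None, rel.predicate):
            matches.append(rel)
    return matches


def assets_for_entity(relationships, entity, predicate=None):
    """Asset ids that point at a contextual entity, e.g. a workflow run."""
    edges = query_relationships(relationships, entity, predicate, "incoming")
    return sorted({r.source for r in edges if r.source.startswith("asset:")})
```

Filtering assets by workflow run then reduces to an incoming-edge lookup on the run's node.
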
 ## R7.5 - Enforce permission-aware retrieval and fail-closed semantics

 ```task
 id: KONT-WP-0007-T005
-status: todo
+status: done
 priority: high
 state_hub_task_id: "c6c93713-3ab1-41fb-bf35-15dd860b66fa"
 ```
@@ -140,11 +213,25 @@ Acceptance:
 - Retrieval audit events capture actor, query scope, outcome, and policy
  context.

+Implemented:
+
+- Retrieval services accept the engine `PolicyGateway`, defaulting to the
+  allow-all local adapter used elsewhere in the system.
+- Asset, contextual entity, and relationship queries authorize the query scope
+  before loading result envelopes.
+- Assets, contextual entities, and relationships are policy-filtered before
+  they are returned; relationships additionally require source and target
+  resource visibility so traversal cannot reveal denied assets or entities.
+- Policy gateway failures produce empty denied envelopes with structured
+  diagnostics and fail-closed policy decisions.
+- Retrieval audit events capture actor, correlation ID, query scope, policy
+  decision, outcome, result counts, and internal permission-filter counts.
+
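
The fail-closed behavior can be sketched as below. The `is_allowed` callable stands in for the engine `PolicyGateway`, whose real interface is not shown in this workplan; the envelope keys, the diagnostic codes, and the `query:relationships` scope name are likewise assumptions.

```python
def denied(code, detail=None):
    """Empty envelope used for every denial, including gateway failures."""
    return {
        "outcome": "denied",
        "items": [],
        "filtered_count": 0,
        "diagnostics": [{"code": code, "detail": detail}],
    }


def authorized_relationships(relationships, actor, is_allowed):
    """Filter (source, predicate, target) triples through a policy check.

    `is_allowed(actor, resource)` is a hypothetical stand-in for the gateway.
    """
    try:
        # Check the query scope first, before touching any results.
        if not is_allowed(actor, "query:relationships"):
            return denied("query_scope_denied")
        # An edge is visible only if both endpoints are visible, so a
        # relationship cannot reveal a denied asset or entity.
        visible = [
            (s, p, t)
            for s, p, t in relationships
            if is_allowed(actor, s) and is_allowed(actor, t)
        ]
    except Exception as exc:
        # Deliberately broad: any gateway failure is treated as a denial.
        return denied("policy_gateway_error", detail=type(exc).__name__)
    return {
        "outcome": "allowed",
        "items": visible,
        "filtered_count": len(relationships) - len(visible),
        "diagnostics": [],
    }
```

The important property is that no code path returns items after the policy check has raised.
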
 ## R7.6 - Return source-grounded snippets citations and explanation data

 ```task
 id: KONT-WP-0007-T006
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "1a6d5a95-d87a-447a-a186-cb73162cd9a1"
 ```
@@ -160,11 +247,26 @@ Acceptance:
 - Markdown snippets can reference Markitect selector matches or context-package
  spans as adapter provenance.

+Implemented:
+
+- `RetrievalSnippet` packets expose asset, representation, source reference,
+  storage reference, media type, match offsets, match text, snippet text, and
+  adapter provenance.
+- Lexical asset queries can request snippets through `include_snippets`,
+  `max_snippets`, and `snippet_radius`.
+- Snippets are generated from normalized representation search text and are
+  attached only to policy-authorized asset results.
+- Markitect selectors, source spans, context spans, adapter provenance,
+  snapshots, and extractor identity are preserved when supplied as
+  representation metadata.
+- Snippet behavior is covered with permission filtering so denied matching
+  content does not leak through snippet packets.
+
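
To make the snippet parameters concrete, here is a minimal sketch of cutting snippets around lexical matches. The parameter names `max_snippets` and `snippet_radius` come from the list above; the function name and the returned keys are hypothetical, and the real `RetrievalSnippet` carries more fields (source references, provenance) than this shows.

```python
def make_snippets(text, query, max_snippets=3, snippet_radius=20):
    """Return up to `max_snippets` windows of text around each match."""
    snippets = []
    # Assumes lowercasing keeps string length unchanged (true for ASCII),
    # so offsets found in the lowered text are valid in the original text.
    haystack, needle = text.lower(), query.lower()
    if not needle:
        return snippets
    start = haystack.find(needle)
    while start != -1 and len(snippets) < max_snippets:
        end = start + len(needle)
        lo = max(0, start - snippet_radius)
        hi = min(len(text), end + snippet_radius)
        snippets.append(
            {
                "match_start": start,
                "match_end": end,
                "match_text": text[start:end],
                "snippet_text": text[lo:hi],
            }
        )
        # Continue after this match so occurrences do not overlap.
        start = haystack.find(needle, end)
    return snippets
```

In a permission-aware flow, a function like this would only be called for results that already passed the policy filter, so denied content never reaches a snippet.
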
 ## R7.7 - Capture retrieval feedback and KPI measurement hooks

 ```task
 id: KONT-WP-0007-T007
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "e17e2839-400f-4348-98e3-f77acc0b2fde"
 ```
@@ -179,6 +281,20 @@ Acceptance:
 - Precision@k, zero-result rate, permission-filter latency, and citation
  precision have measurement hooks.

+Implemented:
+
+- `RetrievalFeedbackRecord` persists feedback labels for useful, irrelevant,
+  missing, unsafe, and low-confidence outcomes with actor, correlation ID,
+  query context, result references, notes, and metadata.
+- Asset registry repository ports and memory/SQLite adapters persist and list
+  retrieval feedback.
+- `AssetRetrievalService.record_feedback()` records authorized feedback with
+  structured diagnostics for invalid labels or denied feedback operations.
+- `AssetRetrievalService.quality_metrics()` derives zero-result rate,
+  precision@k, citation precision, feedback totals, unsafe/low-confidence
+  counts, and permission-filter timing observations from query results,
+  feedback records, and retrieval audit events.
+
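
A simplified view of how such metrics could be derived from query results and feedback follows. The input record shapes, the label strings, and this standalone `quality_metrics` function are assumptions for illustration; the signature of `AssetRetrievalService.quality_metrics()` is not shown in the workplan. Precision here divides by the number of results actually returned in the top `k`, which is one of several common conventions.

```python
def quality_metrics(queries, feedback, k=5):
    """Derive simple retrieval KPIs.

    queries:  [{"correlation_id": str, "result_ids": [str, ...]}, ...]
    feedback: [{"correlation_id": str, "result_id": str, "label": str}, ...]
    Both shapes are hypothetical.
    """
    zero = sum(1 for q in queries if not q["result_ids"])
    useful = {
        (f["correlation_id"], f["result_id"])
        for f in feedback
        if f["label"] == "useful"
    }
    precisions = []
    for q in queries:
        top = q["result_ids"][:k]
        if top:
            hits = sum(1 for rid in top if (q["correlation_id"], rid) in useful)
            precisions.append(hits / len(top))
    return {
        "zero_result_rate": zero / len(queries) if queries else 0.0,
        # None signals "no data" rather than a misleading 0.0.
        "precision_at_k": sum(precisions) / len(precisions) if precisions else None,
        "unsafe_count": sum(1 for f in feedback if f["label"] == "unsafe"),
    }
```

Joining feedback to results on the correlation ID is what lets a metric be traced back to a specific query.
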
 ## Definition Of Done

 - Retrieval tests cover text, metadata, lifecycle, relationship, contextual