stable asset queries, lexical search, filters, contextual entity and relationship retrieval, permission-aware fail-closed behavior, source-grounded snippets, feedback capture, and KPI hooks
@@ -4,13 +4,13 @@ type: workplan
title: "Governed Retrieval And Context Graph"
domain: markitect
repo: kontextual-engine
-status: todo
+status: done
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 7
created: "2026-05-05"
-updated: "2026-05-05"
+updated: "2026-05-06"
state_hub_workstream_id: "64352515-9677-46bb-909a-9e2db4915dc7"
---

@@ -44,11 +44,32 @@ produce grounded units and snippets. Engine retrieval contracts, result
envelopes, policy filtering, pagination, feedback, and cross-format search
remain engine-owned.

## Implementation Status

As of 2026-05-06, the first retrieval slice is recorded in
`docs/retrieval-implementation.md`. It establishes asset query request/result
contracts; stable sorting and pagination; result envelopes carrying source
references, representations, and metadata records; refreshable lexical search;
relevance metadata; zero-result smoke metadata; and structured validation
diagnostics. It also supports combined metadata, lifecycle, source-context,
tag, collection, timestamp, and representation filters across in-memory and
SQLite-backed repositories. The contextual graph slice adds direct contextual
entity and relationship query envelopes plus asset filters by contextual
entity, workflow run, related asset, and relationship predicate.
Permission-aware retrieval now uses the engine policy gateway for query-scope
and per-resource checks, with fail-closed denied envelopes and retrieval audit
events. Lexical queries can also return source-grounded snippet packets with
representation/source references and adapter provenance. Feedback and KPI
hooks persist retrieval feedback and derive zero-result, precision, citation
precision, safety, confidence, and permission-filter timing signals. Remaining
work is focused on multi-hop graph traversal and ranking.

## R7.1 - Implement query contracts pagination sorting and result envelopes

```task
id: KONT-WP-0007-T001
-status: todo
+status: done
priority: high
state_hub_task_id: "5a1b0661-ce22-4ee6-a9e7-0aedce9d4356"
```
@@ -63,11 +84,22 @@ Acceptance:
  references, and diagnostics.
- Invalid queries return structured validation errors.

Implemented:

- `AssetQueryRequest`, `AssetQueryItem`, `AssetQueryResult`, and
  `AssetRetrievalService` provide the stable asset query contract.
- Queries return deterministic ordering with pagination metadata and
  correlation IDs.
- Result entries expose asset identity, classification, source references,
  representations, and metadata records.
- Invalid lifecycle, representation kind, sort key, sort order, limit, and
  offset return structured diagnostics without raising raw exceptions.

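The contract described in the list above can be illustrated with a minimal sketch. It is not the engine's code: `QueryRequest`, `QueryResult`, `run_query`, the allowed sort keys, and the limit ceiling are all hypothetical stand-ins, and assets are plain dictionaries. It shows validation returning diagnostics as data, and a stable two-pass sort that keeps page boundaries deterministic when sort values tie.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sort keys and limit ceiling, chosen only for this sketch.
ALLOWED_SORT_KEYS = {"created_at", "title"}
ALLOWED_SORT_ORDERS = {"asc", "desc"}
MAX_LIMIT = 100


@dataclass(frozen=True)
class QueryRequest:
    sort_key: str = "created_at"
    sort_order: str = "asc"
    limit: int = 20
    offset: int = 0


@dataclass
class QueryResult:
    items: list = field(default_factory=list)
    total: int = 0
    next_offset: Optional[int] = None
    diagnostics: list = field(default_factory=list)


def validate(req: QueryRequest) -> list:
    """Collect every problem as data instead of raising on the first one."""
    diagnostics = []
    if req.sort_key not in ALLOWED_SORT_KEYS:
        diagnostics.append({"code": "invalid_sort_key", "value": req.sort_key})
    if req.sort_order not in ALLOWED_SORT_ORDERS:
        diagnostics.append({"code": "invalid_sort_order", "value": req.sort_order})
    if not 1 <= req.limit <= MAX_LIMIT:
        diagnostics.append({"code": "invalid_limit", "value": req.limit})
    if req.offset < 0:
        diagnostics.append({"code": "invalid_offset", "value": req.offset})
    return diagnostics


def run_query(assets: list, req: QueryRequest) -> QueryResult:
    diagnostics = validate(req)
    if diagnostics:
        return QueryResult(diagnostics=diagnostics)
    # Sort by id first; the second sort is stable, so ties on the requested
    # key keep id order and pages do not shuffle between calls.
    ordered = sorted(assets, key=lambda a: a["id"])
    ordered.sort(key=lambda a: a[req.sort_key], reverse=req.sort_order == "desc")
    page = ordered[req.offset:req.offset + req.limit]
    end = req.offset + len(page)
    return QueryResult(
        items=page,
        total=len(ordered),
        next_offset=end if end < len(ordered) else None,
    )
```

Returning `next_offset` as `None` on the last page gives callers a simple stop condition; a cursor-based scheme would be an equally reasonable choice.
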
## R7.2 - Implement lexical search over normalized content

```task
id: KONT-WP-0007-T002
-status: todo
+status: done
priority: high
state_hub_task_id: "5ec90dcb-473c-4d01-85f2-8db18de0b7d1"
```
@@ -81,11 +113,23 @@ Acceptance:
- Search indexes can be refreshed after ingestion or update.
- p95 latency and zero-result rate can be measured in smoke tests.

Implemented:

- Normalized ingestion now stores representation search text and length
  metadata for retrieval indexing.
- `AssetRetrievalService.refresh_index()` builds a refreshable lexical index
  with indexed asset and representation counts.
- Text queries perform lexical substring matching over normalized
  representations and return relevance metadata including strategy, query,
  match count, and matching representation IDs.
- Query result metadata includes zero-result and lexical index statistics for
  later smoke/performance measurement.

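As a rough illustration of the refreshable substring index described above, the sketch below rebuilds an in-memory map of normalized text and reports per-asset relevance metadata. `LexicalIndex`, `normalize`, the metadata keys, and the ranking rule (match count, then asset ID) are assumptions made for the example, not the behavior of `AssetRetrievalService.refresh_index()`.

```python
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


@dataclass
class LexicalIndex:
    # Maps (asset_id, representation_id) to normalized search text.
    texts: dict = field(default_factory=dict)

    def refresh(self, representations) -> dict:
        """Rebuild from (asset_id, representation_id, raw_text) triples."""
        self.texts = {(a, r): normalize(t) for a, r, t in representations}
        return {
            "indexed_assets": len({a for a, _ in self.texts}),
            "indexed_representations": len(self.texts),
        }

    def search(self, query: str) -> dict:
        needle = normalize(query)
        hits = {}
        if needle:
            for (asset_id, rep_id), text in self.texts.items():
                count = text.count(needle)  # non-overlapping occurrences
                if count:
                    entry = hits.setdefault(asset_id, {
                        "strategy": "lexical_substring",
                        "query": needle,
                        "match_count": 0,
                        "representation_ids": [],
                    })
                    entry["match_count"] += count
                    entry["representation_ids"].append(rep_id)
        # Most matches first; asset id breaks ties so ordering is repeatable.
        ranked = sorted(hits.items(), key=lambda kv: (-kv[1]["match_count"], kv[0]))
        return {"results": ranked, "zero_result": not ranked}
```

Exposing `zero_result` on every response is what lets a smoke test compute a zero-result rate without re-deriving it from the result list.
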
## R7.3 - Implement metadata lifecycle and source-context filters

```task
id: KONT-WP-0007-T003
-status: todo
+status: done
priority: high
state_hub_task_id: "9e7d0a5c-71d4-44ca-9b71-70f2206e4a02"
```
@@ -100,11 +144,24 @@ Acceptance:
- Filter behavior is covered across in-memory and durable backends where
  supported.

Implemented:

- Asset queries support filters for asset type, lifecycle, sensitivity, owner,
  topic, review state, source system/path, representation kind, collection,
  tags, created/updated timestamp bounds, and custom metadata records.
- Text search can be combined with standard, source, tag, collection,
  sensitivity, and metadata filters.
- Combined filter behavior is covered over in-memory and SQLite-backed asset
  repositories.
- Permission enforcement is intentionally left to R7.5; the lifecycle and
  sensitivity filters here establish policy inputs and carry no authorization
  semantics of their own.

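One way to picture how the filters above combine is as a conjunction: an asset matches only if it passes every filter that is set. The sketch below assumes this AND semantics, inclusive timestamp bounds, and a dictionary-shaped asset; `AssetFilters`, `matches`, and the field names are hypothetical and cover only a subset of the filters listed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AssetFilters:
    lifecycle: Optional[str] = None
    tags: frozenset = frozenset()        # every listed tag must be present
    created_from: Optional[datetime] = None  # inclusive lower bound (assumed)
    created_to: Optional[datetime] = None    # inclusive upper bound (assumed)
    metadata: tuple = ()                 # (key, value) pairs, all must match
    text: Optional[str] = None           # case-insensitive substring


def matches(asset: dict, f: AssetFilters) -> bool:
    """Return True only if the asset passes every filter that is set."""
    if f.lifecycle is not None and asset["lifecycle"] != f.lifecycle:
        return False
    if not f.tags <= set(asset["tags"]):
        return False
    if f.created_from is not None and asset["created"] < f.created_from:
        return False
    if f.created_to is not None and asset["created"] > f.created_to:
        return False
    if any(asset["metadata"].get(key) != value for key, value in f.metadata):
        return False
    if f.text is not None and f.text.lower() not in asset["text"].lower():
        return False
    return True


def filter_assets(assets: list, f: AssetFilters) -> list:
    return [asset for asset in assets if matches(asset, f)]
```

Running the same predicate against both an in-memory list and rows loaded from SQLite is a simple way to check that the two backends agree; a durable backend would more likely translate the filters into a `WHERE` clause.
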
## R7.4 - Implement contextual entity model and relationship retrieval

```task
id: KONT-WP-0007-T004
-status: todo
+status: done
priority: high
state_hub_task_id: "b3358059-ac58-4e37-985c-6e8c1cc6df30"
```
@@ -120,11 +177,27 @@ Acceptance:
- Callers can retrieve assets by project, case, topic, source, workflow run, or
  related asset.

Implemented:

- Existing `ContextEntity`/`CoreRelationship` primitives are reused as the
  canonical model; entity types now include workflow runs and generated
  artifacts for operational graph use cases.
- `ContextEntityQueryRequest`/`ContextEntityQueryResult` provide stable
  contextual entity lookup by type, name, external reference, and metadata.
- `RelationshipQueryRequest`/`RelationshipQueryResult` provide stable
  relationship retrieval by source, target, asset, contextual entity,
  workflow run, predicate, target kind, and direction.
- Asset queries can filter by contextual entity, workflow run, related asset,
  and relationship predicate while returning relationship and contextual
  entity context for matched assets.
- Graph retrieval behavior is covered across in-memory and SQLite-backed
  repositories.

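The direct (one-hop) relationship lookups listed above might look roughly like the following. `Edge`, `query_relationships`, `related_assets`, the direction names, and the `asset:` ID prefix are illustrative assumptions; the real `RelationshipQueryRequest` fields are not shown in this workplan.

```python
from dataclasses import dataclass
from typing import Optional

DIRECTIONS = {"outgoing", "incoming", "both"}  # assumed direction vocabulary


@dataclass(frozen=True)
class Edge:
    source: str
    predicate: str
    target: str


def query_relationships(edges, node: str, direction: str = "both",
                        predicate: Optional[str] = None) -> list:
    """Return edges touching `node`, optionally narrowed by predicate."""
    if direction not in DIRECTIONS:
        # A real service would more likely return a structured diagnostic.
        raise ValueError(f"unknown direction: {direction}")
    matched = []
    for edge in edges:
        if predicate is not None and edge.predicate != predicate:
            continue
        outgoing = edge.source == node and direction in ("outgoing", "both")
        incoming = edge.target == node and direction in ("incoming", "both")
        if outgoing or incoming:
            matched.append(edge)
    # Sort for a stable, repeatable result order.
    return sorted(matched, key=lambda e: (e.source, e.predicate, e.target))


def related_assets(edges, node: str, predicate: Optional[str] = None) -> list:
    """One-hop neighbours of `node` whose ID carries the assumed 'asset:' prefix."""
    neighbours = set()
    for edge in query_relationships(edges, node, "both", predicate):
        other = edge.target if edge.source == node else edge.source
        if other.startswith("asset:"):
            neighbours.add(other)
    return sorted(neighbours)
```

For example, `related_assets(edges, "run:42")` would list every asset linked to a workflow run in either direction. Multi-hop traversal is deliberately left out of this sketch.
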
## R7.5 - Enforce permission-aware retrieval and fail-closed semantics

```task
id: KONT-WP-0007-T005
-status: todo
+status: done
priority: high
state_hub_task_id: "c6c93713-3ab1-41fb-bf35-15dd860b66fa"
```
@@ -140,11 +213,25 @@ Acceptance:
- Retrieval audit events capture actor, query scope, outcome, and policy
  context.

Implemented:

- Retrieval services accept the engine `PolicyGateway`, defaulting to the
  allow-all local adapter used elsewhere in the system.
- Asset, contextual entity, and relationship queries authorize the query scope
  before loading result envelopes.
- Assets, contextual entities, and relationships are policy-filtered before
  they are returned; relationships additionally require source and target
  resource visibility so traversal cannot reveal denied assets or entities.
- Policy gateway failures produce empty denied envelopes with structured
  diagnostics and fail-closed policy decisions.
- Retrieval audit events capture actor, correlation ID, query scope, policy
  decision, outcome, result counts, and internal permission-filter counts.

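The fail-closed behavior described above can be sketched in a few lines. Here `check(actor, action, resource)` is a hypothetical callable standing in for the policy gateway, and the envelope keys and diagnostic codes are invented for the example. The points it illustrates are the scope check before any per-resource work, a gateway error mapping to an empty denied envelope, and relationships being dropped unless both endpoints are visible.

```python
def denied(reason: str) -> dict:
    """Build an empty envelope so a denial never carries partial results."""
    return {
        "outcome": "denied",
        "assets": [],
        "relationships": [],
        "diagnostics": [{"code": reason}],
        "filtered_count": 0,
    }


def retrieve(check, actor: str, assets: list, relationships: list) -> dict:
    """`check(actor, action, resource) -> bool` is an assumed gateway shape."""
    try:
        # Authorize the query scope before touching any individual resource.
        if not check(actor, "query", "assets"):
            return denied("query_scope_denied")
        visible = [asset for asset in assets if check(actor, "read", asset)]
    except Exception:
        # Fail closed: a gateway error is treated as a denial, never as allow.
        return denied("policy_gateway_error")
    visible_set = set(visible)
    # Keep an edge only when both endpoints are visible, so a relationship
    # cannot reveal the existence of a denied resource.
    edges = [
        (source, predicate, target)
        for (source, predicate, target) in relationships
        if source in visible_set and target in visible_set
    ]
    return {
        "outcome": "allowed",
        "assets": visible,
        "relationships": edges,
        "diagnostics": [],
        "filtered_count": len(assets) - len(visible),
    }
```

Catching the broad `Exception` is deliberate in this sketch: any unexpected gateway failure should collapse to a denial. An audit event would typically be emitted alongside each envelope, which is omitted here.
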
## R7.6 - Return source-grounded snippets citations and explanation data

```task
id: KONT-WP-0007-T006
-status: todo
+status: done
priority: medium
state_hub_task_id: "1a6d5a95-d87a-447a-a186-cb73162cd9a1"
```
@@ -160,11 +247,26 @@ Acceptance:
- Markdown snippets can reference Markitect selector matches or context-package
  spans as adapter provenance.

Implemented:

- `RetrievalSnippet` packets expose asset, representation, source reference,
  storage reference, media type, match offsets, match text, snippet text, and
  adapter provenance.
- Lexical asset queries can request snippets through `include_snippets`,
  `max_snippets`, and `snippet_radius`.
- Snippets are generated from normalized representation search text and are
  attached only to policy-authorized asset results.
- Markitect selectors, source spans, context spans, adapter provenance,
  snapshots, and extractor identity are preserved when supplied as
  representation metadata.
- Snippet behavior is covered with permission filtering so denied matching
  content does not leak through snippet packets.

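To make the snippet options above concrete, here is a minimal sketch of radius-based snippet extraction. The parameter names `max_snippets` and `snippet_radius` come from the list above, but their exact semantics are assumed: the radius is taken to be a character count on each side of the match, and matching is a case-insensitive substring search. `build_snippets`, `attach_snippets`, and the packet keys are hypothetical.

```python
def build_snippets(text: str, query: str, max_snippets: int = 3,
                   snippet_radius: int = 40) -> list:
    """Return up to `max_snippets` packets with match offsets and context."""
    snippets = []
    needle = query.lower()
    if not needle or max_snippets < 1:
        return snippets
    # Assumes lower() preserves string length, which holds for ASCII text;
    # some Unicode characters change length and would shift the offsets.
    haystack = text.lower()
    start = haystack.find(needle)
    while start != -1 and len(snippets) < max_snippets:
        end = start + len(needle)
        lo = max(0, start - snippet_radius)
        hi = min(len(text), end + snippet_radius)
        snippets.append({
            "match_start": start,
            "match_end": end,
            "match_text": text[start:end],
            "snippet_text": text[lo:hi],
        })
        start = haystack.find(needle, end)
    return snippets


def attach_snippets(texts_by_asset: dict, authorized_ids: set, query: str,
                    **options) -> dict:
    """Generate snippets only for assets that already passed policy checks."""
    return {
        asset_id: build_snippets(text, query, **options)
        for asset_id, text in texts_by_asset.items()
        if asset_id in authorized_ids
    }
```

Filtering by `authorized_ids` before any text is sliced means denied content is never read into a packet, rather than being generated and stripped afterwards.
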
## R7.7 - Capture retrieval feedback and KPI measurement hooks

```task
id: KONT-WP-0007-T007
-status: todo
+status: done
priority: medium
state_hub_task_id: "e17e2839-400f-4348-98e3-f77acc0b2fde"
```
@@ -179,6 +281,20 @@ Acceptance:
- Precision@k, zero-result rate, permission-filter latency, and citation
  precision have measurement hooks.

Implemented:

- `RetrievalFeedbackRecord` persists feedback labels for useful, irrelevant,
  missing, unsafe, and low-confidence outcomes with actor, correlation ID,
  query context, result references, notes, and metadata.
- Asset registry repository ports and memory/SQLite adapters persist and list
  retrieval feedback.
- `AssetRetrievalService.record_feedback()` records authorized feedback with
  structured diagnostics for invalid labels or denied feedback operations.
- `AssetRetrievalService.quality_metrics()` derives zero-result rate,
  precision@k, citation precision, feedback totals, unsafe/low-confidence
  counts, and permission-filter timing observations from query results,
  feedback records, and retrieval audit events.

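The KPI derivations listed above can be sketched under a few stated assumptions. Precision@k is computed here as the number of results in the top `k` labeled `useful`, divided by `k`; queries are joined to feedback by correlation ID; and the label strings (`useful`, `unsafe`, `low_confidence`) and record shapes are hypothetical. This is not the implementation of `AssetRetrievalService.quality_metrics()`, and citation precision and timing signals are omitted.

```python
from collections import Counter


def precision_at_k(result_ids: list, useful_ids: set, k: int) -> float:
    """Fraction of the top-k results labeled useful (denominator is k)."""
    if k < 1:
        raise ValueError("k must be >= 1")
    return sum(1 for rid in result_ids[:k] if rid in useful_ids) / k


def quality_metrics(queries: list, feedback: list, k: int = 5) -> dict:
    """Derive simple KPIs from query logs and feedback records.

    queries:  [{"correlation_id": str, "result_ids": [str, ...]}, ...]
    feedback: [{"correlation_id": str, "result_id": str, "label": str}, ...]
    """
    labels = Counter(record["label"] for record in feedback)
    useful = {}
    judged = set()
    for record in feedback:
        judged.add(record["correlation_id"])
        if record["label"] == "useful":
            useful.setdefault(record["correlation_id"], set()).add(record["result_id"])
    # Only queries that received some feedback contribute to precision.
    scores = [
        precision_at_k(q["result_ids"], useful.get(q["correlation_id"], set()), k)
        for q in queries
        if q["correlation_id"] in judged
    ]
    zero = sum(1 for q in queries if not q["result_ids"])
    return {
        "zero_result_rate": zero / len(queries) if queries else None,
        "precision_at_k": sum(scores) / len(scores) if scores else None,
        "feedback_totals": dict(labels),
        "unsafe_count": labels["unsafe"],
        "low_confidence_count": labels["low_confidence"],
    }
```

Returning `None` rather than `0.0` when there are no queries or no judged queries keeps "no data" distinguishable from "measured zero" in dashboards.
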
## Definition Of Done

- Retrieval tests cover text, metadata, lifecycle, relationship, contextual