---
id: KONT-WP-0010
type: workplan
title: "Observability Export And Enterprise Readiness"
domain: markitect
repo: kontextual-engine
status: completed
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 10
created: "2026-05-05"
updated: "2026-05-06"
state_hub_workstream_id: "09d769a5-a3cf-4cdf-ae5e-b4ecf767f109"
---

# KONT-WP-0010: Observability Export And Enterprise Readiness

## Purpose

Add the operational surfaces that make the engine inspectable, recoverable,
portable, measurable, and ready for enterprise-oriented expansion: metrics,
events, job inspection, recovery actions, governed export packages, governance
inspection, extension hooks, backend abstraction readiness, quality signals,
cost signals, and MVP compliance reporting.

## Requirement Coverage

Primary: FR-200 to FR-207 and FR-220 to FR-225.

Supporting: FR-183 to FR-188, FR-127 to FR-132, FR-070, FR-166 to FR-168,
FR-240 to FR-245.

## Architecture Constraint

Implement observability, export, events, webhooks, and recovery through the
ports, services, audit model, and export package model described in
`docs/architecture-blueprint.md`. Export and observability must preserve policy
checks and must not require direct storage access.

## markitect-tool Boundary Remark

Observability and export should surface Markitect adapter provenance, snapshot
identity, selector references, context-package manifests, and operation
provenance where markdown-backed assets depend on them. Export formats remain
engine-owned and should include Markitect payloads as documented adapter
sections, not as the whole portability model.

## Implementation Status

Implemented as an operator/readiness layer on top of the existing runtime and
repository contracts. The MVP surfaces include operational metrics, job
inspection, event views, recovery actions, governed export packages, export
validation, governance reports, extension/event catalogs, quality/cost signal
recording, performance smoke summaries, and an MVP compliance report.

## E10.1 - Expose operational metrics events and job inspection

```task
id: KONT-WP-0010-T001
status: done
priority: high
state_hub_task_id: "ce6cfbc4-b171-4f03-a27b-c46abbde85a0"
```

Expose operational telemetry for ingestion, retrieval, indexing,
transformations, workflow jobs, permissions, audit, exports, and service
health.

Acceptance:

- Operators can inspect current and historical job state.
- Metrics include ingestion throughput, query latency, API latency, workflow
  completion, failure rate, queue age, and storage/index health.
- Events use correlation IDs that line up with audit records.

Implemented:

- `ServiceRuntime.operational_metrics()` summarizes asset, ingestion,
  retrieval, transformation, workflow, permission, queue, and readiness state.
- `inspect_jobs()` exposes ingestion, transformation, and workflow jobs/runs by
  kind, status, and correlation ID.
- `operational_events()` exposes audit-backed operational events with
  correlation IDs.
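
To illustrate the kind of filtering that job inspection by kind, status, and correlation ID implies, here is a minimal sketch. It is not the engine's code: `JobRecord`, `filter_jobs`, and `failure_rate` are hypothetical names, and the status strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical job/run record; not the engine's actual model."""
    job_id: str
    kind: str            # assumed values: "ingestion", "transformation", "workflow"
    status: str          # assumed values: "queued", "running", "failed", "done"
    correlation_id: str  # shared with the audit records for the same operation


def filter_jobs(
    jobs: Iterable[JobRecord],
    kind: Optional[str] = None,
    status: Optional[str] = None,
    correlation_id: Optional[str] = None,
) -> List[JobRecord]:
    """Return the jobs that match every filter that was supplied."""
    selected = []
    for job in jobs:
        if kind is not None and job.kind != kind:
            continue
        if status is not None and job.status != status:
            continue
        if correlation_id is not None and job.correlation_id != correlation_id:
            continue
        selected.append(job)
    return selected


def failure_rate(jobs: Iterable[JobRecord]) -> float:
    """Fraction of jobs whose status is "failed"; 0.0 for an empty input."""
    job_list = list(jobs)
    if not job_list:
        return 0.0
    failed = sum(1 for job in job_list if job.status == "failed")
    return failed / len(job_list)
```

Filtering on a single correlation ID returns every job or run that belongs to one logical operation, which is what lets an operator move from an audit record to the jobs behind it.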

## E10.2 - Implement administrative recovery actions

```task
id: KONT-WP-0010-T002
status: done
priority: high
state_hub_task_id: "8f0ead65-79be-42e3-8ec8-43d146bb3934"
```

Provide authorized recovery actions for retry, re-run, re-index, cancel,
quarantine, repair, and failure inspection.

Acceptance:

- Recovery actions enforce permissions and emit audit events.
- Common ingestion, indexing, workflow, and transformation failures are
  recoverable without direct database edits.
- Partial failure reports remain available after recovery.

Implemented:

- Recovery action catalog plus execution for ingestion retry, transformation
  retry/cancel, workflow retry/cancel, retrieval re-index, and failure
  inspection.
- Recovery actions authorize through `PolicyGateway` and emit audit events.
- Partial ingestion failure envelopes remain inspectable and tested.
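
The authorize-then-audit pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: `AllowListPolicy` is a stand-in for the real `PolicyGateway` (whose interface is not shown in this workplan), and `RecoveryService`, `PermissionDenied`, and the action names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


class PermissionDenied(Exception):
    """Raised when the actor is not allowed to run a recovery action."""


@dataclass
class AllowListPolicy:
    """Stand-in policy check: a set of (actor, action) grants (assumption)."""
    grants: Set[Tuple[str, str]]

    def is_allowed(self, actor: str, action: str) -> bool:
        return (actor, action) in self.grants


@dataclass
class RecoveryService:
    """Hypothetical catalog of recovery handlers guarded by a policy check."""
    policy: AllowListPolicy
    handlers: Dict[str, Callable[[str], str]]
    audit_log: List[dict] = field(default_factory=list)

    def execute(self, actor: str, action: str, target_id: str,
                correlation_id: str) -> str:
        if action not in self.handlers:
            raise ValueError(f"unknown recovery action: {action}")
        event = {
            "actor": actor,
            "action": action,
            "target_id": target_id,
            "correlation_id": correlation_id,
        }
        # Denied attempts are audited too, so refusals stay visible.
        if not self.policy.is_allowed(actor, action):
            self.audit_log.append({**event, "outcome": "denied"})
            raise PermissionDenied(f"{actor} may not run {action}")
        outcome = self.handlers[action](target_id)
        self.audit_log.append({**event, "outcome": outcome})
        return outcome
```

Recording the denied attempt before raising is a design choice of this sketch: it keeps the audit trail complete even when no recovery work is performed.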

## E10.3 - Implement export packages manifests and integrity validation

```task
id: KONT-WP-0010-T003
status: done
priority: high
state_hub_task_id: "54ed199f-636e-4cfd-898f-fd6ad0057b61"
```

Implement governed export packages for assets, normalized representations,
metadata, relationships, provenance, versions, audit references, and derived
artifacts.

Acceptance:

- Exports can be scoped by asset ID, collection, query, workflow run, source
  system, lifecycle state, date range, or governance policy.
- Export manifests include schema version, counts, hashes, actor, time, and
  policy context.
- Export validation can detect missing records or integrity mismatches.

Implemented:

- Governed export packages scoped by explicit asset IDs, filters, or retrieval
  query.
- Export records include assets, metadata, representations, relationships,
  versions, lineage, audit references, policy context, and Markitect adapter
  sections.
- Export validation recomputes counts and content hash to detect tampering.
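
Recomputing counts and a content hash, as described above, can be sketched like this. The manifest fields, the `"1"` schema version, the use of SHA-256 over canonical JSON, and the function names are all assumptions made for illustration; the engine's actual manifest format is not shown in this workplan.

```python
import hashlib
import json
from typing import Dict, List


def content_hash(records: Dict[str, List[dict]]) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra spaces)."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(records: Dict[str, List[dict]], actor: str,
                   created_at: str) -> dict:
    """Build a hypothetical manifest with per-section counts and one hash."""
    return {
        "schema_version": "1",  # assumed value
        "actor": actor,
        "created_at": created_at,
        "counts": {section: len(items) for section, items in records.items()},
        "content_hash": content_hash(records),
    }


def validate_package(manifest: dict, records: Dict[str, List[dict]]) -> List[str]:
    """Return a list of problems; an empty list means the package checks out."""
    problems = []
    actual_counts = {section: len(items) for section, items in records.items()}
    for section, expected in manifest["counts"].items():
        actual = actual_counts.get(section, 0)
        if actual != expected:
            problems.append(
                f"count mismatch in {section}: expected {expected}, found {actual}"
            )
    for section in actual_counts:
        if section not in manifest["counts"]:
            problems.append(f"unexpected section: {section}")
    if content_hash(records) != manifest["content_hash"]:
        problems.append("content hash mismatch")
    return problems
```

Counts catch missing or extra records with a readable message, while the hash catches in-place edits that leave the counts unchanged.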

## E10.4 - Implement governance inspection and reporting hooks

```task
id: KONT-WP-0010-T004
status: done
priority: medium
state_hub_task_id: "c62c5f36-30d9-4469-90cf-5dc3d37588ba"
```

Expose governance inspection for permission coverage, policy gaps, stale
permissions, missing metadata, lifecycle exceptions, access anomalies, retention
coverage, legal holds, and audit completeness.

Acceptance:

- Governance reports can be generated for selected scopes.
- Reports identify under-classified, overexposed, stale, held, or
  policy-conflicted assets.
- Reporting respects authorization and redaction policy.

Implemented:

- `governance_report()` generates scoped reports over selected assets.
- Findings cover missing owner, metadata, source refs, audit gaps, and
  sensitive assets without review/retention metadata.
- Reports include redaction metadata and avoid embedding source content.
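
A minimal sketch of how findings of this kind might be derived from asset records is shown below. The asset field names (`owner`, `source_refs`, `sensitive`, `retention`), the finding codes, and the `governance_findings` function are hypothetical; the sketch only illustrates reporting by asset ID and finding code without copying source content.

```python
from typing import Dict, Iterable, List


def governance_findings(assets: Iterable[dict]) -> List[Dict[str, str]]:
    """Return one entry per problem, holding only an asset ID and a finding code."""
    findings = []
    for asset in assets:
        asset_id = asset["id"]
        if not asset.get("owner"):
            findings.append({"asset_id": asset_id, "finding": "missing_owner"})
        if not asset.get("source_refs"):
            findings.append({"asset_id": asset_id, "finding": "missing_source_refs"})
        # Sensitive assets are expected to carry retention metadata (assumption).
        if asset.get("sensitive") and not asset.get("retention"):
            findings.append(
                {"asset_id": asset_id, "finding": "sensitive_without_retention"}
            )
    return findings
```

Because each finding carries only an identifier and a code, the asset body never enters the report, which is one simple way to keep a report free of source content.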

## E10.5 - Implement extension events webhooks and backend abstraction readiness

```task
id: KONT-WP-0010-T005
status: done
priority: medium
state_hub_task_id: "f1713b41-0535-47fc-ba7e-054aea93f8cf"
```

Prepare the extension surface for source adapters, extractors,
transformations, validators, policy modules, webhooks, events, and backend
swapping.

Acceptance:

- Extension points are documented and covered by contract tests.
- Events can be emitted for asset changes, ingestion completion, workflow
  status, policy exceptions, derived artifact creation, and review decisions.
- Storage, index, queue, workflow, AI, and model backend abstractions remain
  externally semantic-preserving.
- Markitect adapter contract tests are part of the extension compatibility
  posture for markdown-related engine capabilities.

Implemented:

- Extension catalog exposes connector, extractor, transformation, event, and
  backend abstraction readiness.
- Extension events can be emitted as audited semantic events.
- Markitect adapter provenance and boundary are explicit in export and
  extension surfaces.

## E10.6 - Capture retrieval AI cost and quality signals

```task
id: KONT-WP-0010-T006
status: done
priority: medium
state_hub_task_id: "1d36035a-b211-49e9-935c-382d52aa3639"
```

Capture retrieval quality, AI operation, and cost signals where available.

Acceptance:

- Retrieval metrics include precision hooks, zero-result rate, low-confidence
  result rate, and feedback counts.
- AI usage can record model calls, token or compute usage, provider errors, and
  estimated operation cost where adapters provide them.
- Signals can be attributed to assets, workflows, agents, applications, and
  actors.

Implemented:

- Retrieval quality metrics are exposed in operator metrics and
  quality/cost reports.
- `record_quality_signal()` captures AI usage, cost, metrics, and attribution
  dimensions as audit-backed signal events.
- `quality_cost_signals()` aggregates retrieval quality, AI usage, provider
  error count, and estimated cost.
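
The aggregation described above can be illustrated with a small sketch. The signal fields (`tokens`, `error`, `estimated_cost`), the `"unattributed"` fallback, and both function names are assumptions for the example and do not reflect the actual signatures of `record_quality_signal()` or `quality_cost_signals()`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def zero_result_rate(result_counts: List[int]) -> float:
    """Fraction of queries that returned no results; 0.0 for an empty input."""
    if not result_counts:
        return 0.0
    return sum(1 for count in result_counts if count == 0) / len(result_counts)


def aggregate_ai_usage(signals: Iterable[dict],
                       dimension: str = "actor") -> Dict[str, dict]:
    """Sum calls, tokens, provider errors, and cost per attribution value."""
    totals: Dict[str, dict] = defaultdict(
        lambda: {"calls": 0, "tokens": 0, "provider_errors": 0, "estimated_cost": 0.0}
    )
    for signal in signals:
        # Signals lacking the chosen dimension are grouped together.
        key = signal.get(dimension, "unattributed")
        bucket = totals[key]
        bucket["calls"] += 1
        bucket["tokens"] += signal.get("tokens", 0)
        bucket["provider_errors"] += 1 if signal.get("error") else 0
        bucket["estimated_cost"] += signal.get("estimated_cost", 0.0)
    return dict(totals)
```

Passing a different `dimension` (for example `"workflow"` or `"agent"`) regroups the same signals, which matches the requirement that signals be attributable along several axes.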

## E10.7 - Add performance smoke tests and MVP compliance report

```task
id: KONT-WP-0010-T007
status: done
priority: medium
state_hub_task_id: "057c7bcf-f224-4d9f-9161-6bfff4948e95"
```

Create smoke tests and a compliance report against the V0.2 MVP acceptance
perspective.

Acceptance:

- Smoke tests measure representative ingestion, query, workflow, and export
  behavior.
- MVP compliance report maps implemented behavior to FRS P0 requirements.
- Remaining P1/P2 gaps are explicit and prioritized.

Implemented:

- `performance_smoke_report()` summarizes representative ingestion, retrieval,
  workflow, and export observations.
- `mvp_compliance_report()` maps MVP behavior to observability/recovery,
  export, governance/audit, and agent-safe operation requirements.
- Remaining enterprise-adapter gaps are explicit in the compliance report.

## Definition Of Done

- Operators can inspect, diagnose, recover, export, and evaluate MVP engine
  behavior through supported surfaces.
- Export packages preserve enough context for inspection and migration.
- Observability, events, recovery, and export follow
  `docs/architecture-blueprint.md`.
- `python3 -m pytest` passes.