---
id: KONT-WP-0010
type: workplan
title: "Observability Export And Enterprise Readiness"
domain: markitect
repo: kontextual-engine
status: completed
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 10
created: "2026-05-05"
updated: "2026-05-06"
state_hub_workstream_id: "09d769a5-a3cf-4cdf-ae5e-b4ecf767f109"
---
# KONT-WP-0010: Observability Export And Enterprise Readiness
## Purpose
Add the operational surfaces that make the engine inspectable, recoverable,
portable, measurable, and ready for enterprise-oriented expansion: metrics,
events, job inspection, recovery actions, governed export packages, governance
inspection, extension hooks, backend abstraction readiness, quality signals,
cost signals, and MVP compliance reporting.
## Requirement Coverage
Primary: FR-200 to FR-207 and FR-220 to FR-225.
Supporting: FR-183 to FR-188, FR-127 to FR-132, FR-070, FR-166 to FR-168,
FR-240 to FR-245.
## Architecture Constraint
Implement observability, export, events, webhooks, and recovery through the
ports, services, audit model, and export package model described in
`docs/architecture-blueprint.md`. Export and observability must preserve policy
checks and must not require direct storage access.
## markitect-tool Boundary Remark
Observability and export should surface Markitect adapter provenance, snapshot
identity, selector references, context-package manifests, and operation
provenance where markdown-backed assets depend on them. Export formats remain
engine-owned and should include Markitect payloads as documented adapter
sections, not as the whole portability model.
## Implementation Status
Implemented as an operator/readiness layer on top of the existing runtime and
repository contracts. The MVP surfaces include operational metrics, job
inspection, event views, recovery actions, governed export packages, export
validation, governance reports, extension/event catalogs, quality/cost signal
recording, performance smoke summaries, and an MVP compliance report.
## E10.1 - Expose operational metrics, events, and job inspection
```task
id: KONT-WP-0010-T001
status: done
priority: high
state_hub_task_id: "ce6cfbc4-b171-4f03-a27b-c46abbde85a0"
```
Expose operational telemetry for ingestion, retrieval, indexing,
transformations, workflow jobs, permissions, audit, exports, and service
health.
Acceptance:
- Operators can inspect current and historical job state.
- Metrics include ingestion throughput, query latency, API latency, workflow
completion, failure rate, queue age, and storage/index health.
- Events use correlation IDs that line up with audit records.
Implemented:
- `ServiceRuntime.operational_metrics()` summarizes asset, ingestion,
retrieval, transformation, workflow, permission, queue, and readiness state.
- `inspect_jobs()` exposes ingestion, transformation, and workflow jobs/runs by
kind, status, and correlation ID.
- `operational_events()` exposes audit-backed operational events with
correlation IDs.
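
The job-inspection filter described above can be sketched as follows. This is a minimal illustration only: `JobRecord` and `filter_jobs` are hypothetical names, and the engine's actual `inspect_jobs()` signature and job model may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class JobRecord:
    # Hypothetical record shape; not the engine's real job model.
    job_id: str
    kind: str            # e.g. "ingestion", "transformation", "workflow"
    status: str          # e.g. "queued", "running", "failed", "done"
    correlation_id: str  # assumed to match the ID on related audit records


def filter_jobs(
    jobs: Iterable[JobRecord],
    kind: Optional[str] = None,
    status: Optional[str] = None,
    correlation_id: Optional[str] = None,
) -> list[JobRecord]:
    """Return the jobs that match every filter that was supplied."""
    def matches(job: JobRecord) -> bool:
        return (
            (kind is None or job.kind == kind)
            and (status is None or job.status == status)
            and (correlation_id is None or job.correlation_id == correlation_id)
        )

    return [job for job in jobs if matches(job)]
```

Treating an omitted filter as "match anything" lets one call serve both a broad "all failed jobs" view and a narrow "everything for this correlation ID" view.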
## E10.2 - Implement administrative recovery actions
```task
id: KONT-WP-0010-T002
status: done
priority: high
state_hub_task_id: "8f0ead65-79be-42e3-8ec8-43d146bb3934"
```
Provide authorized recovery actions for retry, re-run, re-index, cancel,
quarantine, repair, and failure inspection.
Acceptance:
- Recovery actions enforce permissions and emit audit events.
- Common ingestion, indexing, workflow, and transformation failures are
recoverable without direct database edits.
- Partial failure reports remain available after recovery.
Implemented:
- Recovery action catalog plus execution for ingestion retry, transformation
retry/cancel, workflow retry/cancel, retrieval re-index, and failure
inspection.
- Recovery actions authorize through `PolicyGateway` and emit audit events.
- Partial ingestion failure envelopes remain inspectable and tested.
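
The authorize-then-audit pattern for recovery actions might look like the sketch below. `AllowListPolicy`, `AuditLog`, and `run_recovery_action` are hypothetical stand-ins written for illustration; they are not the engine's `PolicyGateway` or audit model, whose interfaces are not shown in this workplan.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class AuditLog:
    # Hypothetical in-memory audit sink used only for this sketch.
    events: list[dict] = field(default_factory=list)

    def emit(self, **event: Any) -> None:
        self.events.append(event)


class AllowListPolicy:
    """Stand-in for a policy gateway: a fixed set of (actor, action) grants."""

    def __init__(self, grants: set[tuple[str, str]]) -> None:
        self._grants = grants

    def is_allowed(self, actor: str, action: str) -> bool:
        return (actor, action) in self._grants


def run_recovery_action(
    policy: AllowListPolicy,
    audit: AuditLog,
    actor: str,
    action: str,
    target_id: str,
    handler: Callable[[str], Any],
) -> Any:
    """Authorize first, then run the handler; audit both outcomes."""
    if not policy.is_allowed(actor, action):
        audit.emit(actor=actor, action=action, target=target_id, outcome="denied")
        raise PermissionError(f"{actor} may not perform {action}")
    result = handler(target_id)
    audit.emit(actor=actor, action=action, target=target_id, outcome="executed")
    return result
```

Auditing the denied path as well as the executed path keeps refused recovery attempts visible to operators.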
## E10.3 - Implement export packages, manifests, and integrity validation
```task
id: KONT-WP-0010-T003
status: done
priority: high
state_hub_task_id: "54ed199f-636e-4cfd-898f-fd6ad0057b61"
```
Implement governed export packages for assets, normalized representations,
metadata, relationships, provenance, versions, audit references, and derived
artifacts.
Acceptance:
- Exports can be scoped by asset ID, collection, query, workflow run, source
system, lifecycle state, date range, or governance policy.
- Export manifests include schema version, counts, hashes, actor, time, and
policy context.
- Export validation can detect missing records or integrity mismatches.
Implemented:
- Governed export packages scoped by explicit asset IDs, filters, or retrieval
query.
- Export records include assets, metadata, representations, relationships,
versions, lineage, audit references, policy context, and Markitect adapter
sections.
- Export validation recomputes counts and content hash to detect tampering.
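
Recomputing counts and a content hash to detect tampering can be illustrated as below. This is a sketch under stated assumptions: the manifest fields, the canonical-JSON hashing scheme, and the function names are hypothetical, and the timestamp and policy context that a real manifest carries are omitted to keep the example deterministic.

```python
import hashlib
import json


def content_hash(records: dict[str, list[dict]]) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra spaces)."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(records: dict[str, list[dict]], actor: str,
                   schema_version: str = "1") -> dict:
    # Hypothetical manifest shape: one count per record section plus one hash.
    return {
        "schema_version": schema_version,
        "actor": actor,
        "counts": {section: len(items) for section, items in records.items()},
        "content_hash": content_hash(records),
    }


def validate_export(manifest: dict, records: dict[str, list[dict]]) -> list[str]:
    """Return a list of integrity problems; an empty list means the export is intact."""
    problems = []
    actual_counts = {section: len(items) for section, items in records.items()}
    if actual_counts != manifest["counts"]:
        problems.append("count mismatch")
    if content_hash(records) != manifest["content_hash"]:
        problems.append("content hash mismatch")
    return problems
```

Counts catch dropped or added records cheaply, while the hash also catches in-place edits that leave the counts unchanged.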
## E10.4 - Implement governance inspection and reporting hooks
```task
id: KONT-WP-0010-T004
status: done
priority: medium
state_hub_task_id: "c62c5f36-30d9-4469-90cf-5dc3d37588ba"
```
Expose governance inspection for permission coverage, policy gaps, stale
permissions, missing metadata, lifecycle exceptions, access anomalies, retention
coverage, legal holds, and audit completeness.
Acceptance:
- Governance reports can be generated for selected scopes.
- Reports identify under-classified, overexposed, stale, held, or
policy-conflicted assets.
- Reporting respects authorization and redaction policy.
Implemented:
- `governance_report()` generates scoped reports over selected assets.
- Findings cover missing owner, metadata, source refs, audit gaps, and
sensitive assets without review/retention metadata.
- Reports include redaction metadata and avoid embedding source content.
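
The kinds of findings listed above could be computed along these lines. The asset dictionary shape, the finding codes, and both function names are assumptions made for this sketch; the actual `governance_report()` output format is not specified in this workplan.

```python
def governance_findings(asset: dict) -> list[str]:
    """Return finding codes for one asset (hypothetical asset shape)."""
    metadata = asset.get("metadata") or {}
    findings = []
    if not asset.get("owner"):
        findings.append("missing_owner")
    if not metadata:
        findings.append("missing_metadata")
    if not asset.get("source_refs"):
        findings.append("missing_source_refs")
    if asset.get("sensitive") and not metadata.get("retention"):
        findings.append("sensitive_without_retention")
    return findings


def scoped_report(assets: list[dict]) -> dict[str, list[str]]:
    """Map asset ID to findings; only IDs and codes, never source content."""
    report = {}
    for asset in assets:
        findings = governance_findings(asset)
        if findings:
            report[asset["id"]] = findings
    return report
```

Returning only asset IDs and finding codes mirrors the stated goal that reports avoid embedding source content.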
## E10.5 - Implement extension events, webhooks, and backend abstraction readiness
```task
id: KONT-WP-0010-T005
status: done
priority: medium
state_hub_task_id: "f1713b41-0535-47fc-ba7e-054aea93f8cf"
```
Prepare the extension surface for source adapters, extractors,
transformations, validators, policy modules, webhooks, events, and backend
swapping.
Acceptance:
- Extension points are documented and covered by contract tests.
- Events can be emitted for asset changes, ingestion completion, workflow
status, policy exceptions, derived artifact creation, and review decisions.
- Storage, index, queue, workflow, AI, and model backend abstractions preserve
externally observable semantics when a backend is swapped.
- Markitect adapter contract tests are part of the extension compatibility
posture for markdown-related engine capabilities.
Implemented:
- Extension catalog exposes connector, extractor, transformation, event, and
backend abstraction readiness.
- Extension events can be emitted as audited semantic events.
- Markitect adapter provenance and boundary are explicit in export and
extension surfaces.
## E10.6 - Capture retrieval quality, AI usage, and cost signals
```task
id: KONT-WP-0010-T006
status: done
priority: medium
state_hub_task_id: "1d36035a-b211-49e9-935c-382d52aa3639"
```
Capture retrieval quality, AI operation, and cost signals where available.
Acceptance:
- Retrieval metrics include precision hooks, zero-result rate, low-confidence
result rate, and feedback counts.
- AI usage can record model calls, token or compute usage, provider errors, and
estimated operation cost where adapters provide them.
- Signals can be attributed to assets, workflows, agents, applications, and
actors.
Implemented:
- Retrieval quality metrics are exposed in operator metrics and
quality/cost reports.
- `record_quality_signal()` captures AI usage, cost, metrics, and attribution
dimensions as audit-backed signal events.
- `quality_cost_signals()` aggregates retrieval quality, AI usage, provider
error count, and estimated cost.
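
The aggregation described above can be sketched as follows. The signal dictionaries, their field names, and `aggregate_signals` are hypothetical; the sketch only shows how zero-result rate, provider error count, and estimated cost might be derived from recorded signals, not how `quality_cost_signals()` is actually implemented.

```python
def aggregate_signals(signals: list[dict]) -> dict:
    """Summarize retrieval and AI-call signals (hypothetical signal shape)."""
    queries = [s for s in signals if s["kind"] == "retrieval"]
    ai_calls = [s for s in signals if s["kind"] == "ai_call"]
    zero_results = sum(1 for q in queries if q["result_count"] == 0)
    return {
        "query_count": len(queries),
        # Guard against division by zero when no queries were recorded.
        "zero_result_rate": zero_results / len(queries) if queries else 0.0,
        "ai_call_count": len(ai_calls),
        "provider_error_count": sum(1 for c in ai_calls if c.get("error")),
        # Cost is optional because not every adapter reports it.
        "estimated_cost": sum(c.get("estimated_cost", 0.0) for c in ai_calls),
    }
```

Treating cost as optional per signal matches the acceptance wording that usage and cost are recorded "where adapters provide them".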
## E10.7 - Add performance smoke tests and MVP compliance report
```task
id: KONT-WP-0010-T007
status: done
priority: medium
state_hub_task_id: "057c7bcf-f224-4d9f-9161-6bfff4948e95"
```
Create smoke tests and a compliance report against the V0.2 MVP acceptance
perspective.
Acceptance:
- Smoke tests measure representative ingestion, query, workflow, and export
behavior.
- MVP compliance report maps implemented behavior to FRS P0 requirements.
- Remaining P1/P2 gaps are explicit and prioritized.
Implemented:
- `performance_smoke_report()` summarizes representative ingestion, retrieval,
workflow, and export observations.
- `mvp_compliance_report()` maps MVP behavior to observability/recovery,
export, governance/audit, and agent-safe operation requirements.
- Remaining enterprise-adapter gaps are explicit in the compliance report.
## Definition Of Done
- Operators can inspect, diagnose, recover, export, and evaluate MVP engine
behavior through supported surfaces.
- Export packages preserve enough context for inspection and migration.
- Observability, events, recovery, and export follow
`docs/architecture-blueprint.md`.
- `python3 -m pytest` passes.