tegwick a44b439cc7 Operator metrics, job inspection, and event views, Recovery, Governance reports, Extension catalog and semantic extension events

2026-05-06 21:48:40 +02:00

id: KONT-WP-0010
type: workplan
title: Observability Export And Enterprise Readiness
domain: markitect
repo: kontextual-engine
status: completed
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 10
created: 2026-05-05
updated: 2026-05-06
state_hub_workstream_id: 09d769a5-a3cf-4cdf-ae5e-b4ecf767f109

KONT-WP-0010: Observability Export And Enterprise Readiness

Purpose

Add the operational surfaces that make the engine inspectable, recoverable, portable, measurable, and ready for enterprise-oriented expansion: metrics, events, job inspection, recovery actions, governed export packages, governance inspection, extension hooks, backend abstraction readiness, quality signals, cost signals, and MVP compliance reporting.

Requirement Coverage

Primary: FR-200 to FR-207 and FR-220 to FR-225.

Supporting: FR-183 to FR-188, FR-127 to FR-132, FR-070, FR-166 to FR-168, FR-240 to FR-245.

Architecture Constraint

Implement observability, export, events, webhooks, and recovery through the ports, services, audit model, and export package model described in docs/architecture-blueprint.md. Export and observability must preserve policy checks and must not require direct storage access.

markitect-tool Boundary Remark

Observability and export should surface Markitect adapter provenance, snapshot identity, selector references, context-package manifests, and operation provenance where markdown-backed assets depend on them. Export formats remain engine-owned and should include Markitect payloads as documented adapter sections, not as the whole portability model.

Implementation Status

Implemented as an operator/readiness layer on top of the existing runtime and repository contracts. The MVP surfaces include operational metrics, job inspection, event views, recovery actions, governed export packages, export validation, governance reports, extension/event catalogs, quality/cost signal recording, performance smoke summaries, and an MVP compliance report.

E10.1 - Expose operational metrics, events, and job inspection

id: KONT-WP-0010-T001
status: done
priority: high
state_hub_task_id: "ce6cfbc4-b171-4f03-a27b-c46abbde85a0"

Expose operational telemetry for ingestion, retrieval, indexing, transformations, workflow jobs, permissions, audit, exports, and service health.

Acceptance:

Operators can inspect current and historical job state.
Metrics include ingestion throughput, query latency, API latency, workflow completion, failure rate, queue age, and storage/index health.
Events use correlation IDs that line up with audit records.

Implemented:

ServiceRuntime.operational_metrics() summarizes asset, ingestion, retrieval, transformation, workflow, permission, queue, and readiness state.
inspect_jobs() exposes ingestion, transformation, and workflow jobs/runs by kind, status, and correlation ID.
operational_events() exposes audit-backed operational events with correlation IDs.
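
The filtering behavior described for job inspection can be illustrated with a minimal sketch. It does not reproduce the engine's inspect_jobs() signature, which this workplan does not show; JobRecord, filter_jobs, and every field name below are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical job/run record; field names are assumptions."""
    job_id: str
    kind: str            # e.g. "ingestion", "transformation", "workflow"
    status: str          # e.g. "queued", "running", "failed", "completed"
    correlation_id: str  # assumed to match the audit records of the same operation


def filter_jobs(
    jobs: Iterable[JobRecord],
    kind: Optional[str] = None,
    status: Optional[str] = None,
    correlation_id: Optional[str] = None,
) -> list[JobRecord]:
    """Return the jobs that match every filter that was supplied."""
    return [
        job
        for job in jobs
        if (kind is None or job.kind == kind)
        and (status is None or job.status == status)
        and (correlation_id is None or job.correlation_id == correlation_id)
    ]


jobs = [
    JobRecord("j1", "ingestion", "failed", "corr-a"),
    JobRecord("j2", "workflow", "completed", "corr-a"),
    JobRecord("j3", "ingestion", "completed", "corr-b"),
]
failed_ingestion = filter_jobs(jobs, kind="ingestion", status="failed")
same_operation = filter_jobs(jobs, correlation_id="corr-a")
```

Filtering on the correlation ID alone returns every job that belongs to one operation, which is what lets an operator move between job state and the matching audit records.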

E10.2 - Implement administrative recovery actions

id: KONT-WP-0010-T002
status: done
priority: high
state_hub_task_id: "8f0ead65-79be-42e3-8ec8-43d146bb3934"

Provide authorized recovery actions for retry, re-run, re-index, cancel, quarantine, repair, and failure inspection.

Acceptance:

Recovery actions enforce permissions and emit audit events.
Common ingestion, indexing, workflow, and transformation failures are recoverable without direct database edits.
Partial failure reports remain available after recovery.

Implemented:

Recovery action catalog plus execution for ingestion retry, transformation retry/cancel, workflow retry/cancel, retrieval re-index, and failure inspection.
Recovery actions authorize through PolicyGateway and emit audit events.
Partial ingestion failure envelopes remain inspectable and tested.
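
The authorize-then-audit pattern for recovery actions might look roughly like the sketch below. PolicyGateway is represented only by a plain callable, and AuditLog, run_recovery_action, and the action name "ingestion.retry" are hypothetical names chosen for illustration, not the engine's API.

```python
from dataclasses import dataclass, field
from typing import Callable


class PermissionDenied(Exception):
    """Raised when the actor may not run the requested recovery action."""


@dataclass
class AuditLog:
    """In-memory stand-in for the audit model (hypothetical)."""
    events: list[dict] = field(default_factory=list)

    def emit(self, **event: str) -> None:
        self.events.append(event)


def run_recovery_action(
    actor: str,
    action: str,
    target_id: str,
    is_allowed: Callable[[str, str], bool],
    handlers: dict[str, Callable[[str], str]],
    audit: AuditLog,
) -> str:
    """Authorize, execute, and audit one recovery action."""
    if not is_allowed(actor, action):
        # Denials are audited too, so refused attempts stay visible.
        audit.emit(actor=actor, action=action, target=target_id, outcome="denied")
        raise PermissionDenied(f"{actor} may not run {action}")
    result = handlers[action](target_id)
    audit.emit(actor=actor, action=action, target=target_id, outcome=result)
    return result


def allow_admins(actor: str, action: str) -> bool:
    """Toy policy: only the 'admin' actor may recover anything."""
    return actor == "admin"


def retry_ingestion(job_id: str) -> str:
    """Toy handler standing in for a real ingestion retry."""
    return "requeued"


audit = AuditLog()
handlers = {"ingestion.retry": retry_ingestion}
outcome = run_recovery_action(
    "admin", "ingestion.retry", "job-1", allow_admins, handlers, audit
)
```

Routing every action through one function keeps the permission check and the audit event from being skipped by an individual handler.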

E10.3 - Implement export packages, manifests, and integrity validation

id: KONT-WP-0010-T003
status: done
priority: high
state_hub_task_id: "54ed199f-636e-4cfd-898f-fd6ad0057b61"

Implement governed export packages for assets, normalized representations, metadata, relationships, provenance, versions, audit references, and derived artifacts.

Acceptance:

Exports can be scoped by asset ID, collection, query, workflow run, source system, lifecycle state, date range, or governance policy.
Export manifests include schema version, counts, hashes, actor, time, and policy context.
Export validation can detect missing records or integrity mismatches.

Implemented:

Governed export packages scoped by explicit asset IDs, filters, or retrieval query.
Export records include assets, metadata, representations, relationships, versions, lineage, audit references, policy context, and Markitect adapter sections.
Export validation recomputes counts and content hash to detect tampering.
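
Recomputing counts and a content hash can be sketched as follows. The manifest keys, the choice of SHA-256, and the canonical JSON encoding are assumptions made for this example; the workplan does not specify the engine's actual manifest schema or hash algorithm.

```python
import hashlib
import json


def content_hash(records: list[dict]) -> str:
    """Hash records in canonical JSON so dictionary key order does not matter."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(records: list[dict], actor: str, schema_version: str = "1") -> dict:
    """Build a minimal manifest; real manifests would also carry time and policy context."""
    return {
        "schema_version": schema_version,
        "actor": actor,
        "record_count": len(records),
        "content_hash": content_hash(records),
    }


def validate_export(manifest: dict, records: list[dict]) -> list[str]:
    """Recompute count and hash; return a list of detected problems."""
    problems = []
    if manifest["record_count"] != len(records):
        problems.append("record count mismatch")
    if manifest["content_hash"] != content_hash(records):
        problems.append("content hash mismatch")
    return problems


records = [
    {"asset_id": "a1", "title": "Spec"},
    {"asset_id": "a2", "title": "Notes"},
]
manifest = build_manifest(records, actor="operator")

clean = validate_export(manifest, records)
missing_record = validate_export(manifest, records[:1])
edited_record = validate_export(
    manifest, [records[0], {"asset_id": "a2", "title": "Changed"}]
)
```

In this sketch a dropped record trips both checks, while an in-place edit leaves the count intact and is caught only by the hash.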

E10.4 - Implement governance inspection and reporting hooks

id: KONT-WP-0010-T004
status: done
priority: medium
state_hub_task_id: "c62c5f36-30d9-4469-90cf-5dc3d37588ba"

Expose governance inspection for permission coverage, policy gaps, stale permissions, missing metadata, lifecycle exceptions, access anomalies, retention coverage, legal holds, and audit completeness.

Acceptance:

Governance reports can be generated for selected scopes.
Reports identify under-classified, overexposed, stale, held, or policy-conflicted assets.
Reporting respects authorization and redaction policy.

Implemented:

governance_report() generates scoped reports over selected assets.
Findings cover missing owners, missing metadata, missing source refs, audit gaps, and sensitive assets that lack review or retention metadata.
Reports include redaction metadata and avoid embedding source content.
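
A scoped findings pass of this kind could be sketched as below. AssetSummary, governance_findings, and the finding codes are hypothetical; the real governance_report() may use different inputs and categories.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AssetSummary:
    """Hypothetical governance view of an asset; carries no source content."""
    asset_id: str
    owner: Optional[str] = None
    sensitive: bool = False
    metadata: dict[str, str] = field(default_factory=dict)


def governance_findings(assets: list[AssetSummary]) -> list[dict]:
    """Report gaps by asset ID only, so findings never embed asset content."""
    findings = []
    for asset in assets:
        if not asset.owner:
            findings.append({"asset_id": asset.asset_id, "finding": "missing_owner"})
        if asset.sensitive and "retention" not in asset.metadata:
            findings.append(
                {"asset_id": asset.asset_id, "finding": "sensitive_without_retention"}
            )
    return findings


scope = [
    AssetSummary("a1", owner="team-docs"),
    AssetSummary("a2", sensitive=True),
    AssetSummary("a3", owner="legal", sensitive=True, metadata={"retention": "7y"}),
]
findings = governance_findings(scope)
```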

E10.5 - Implement extension events, webhooks, and backend abstraction readiness

id: KONT-WP-0010-T005
status: done
priority: medium
state_hub_task_id: "f1713b41-0535-47fc-ba7e-054aea93f8cf"

Prepare the extension surface for source adapters, extractors, transformations, validators, policy modules, webhooks, events, and backend swapping.

Acceptance:

Extension points are documented and covered by contract tests.
Events can be emitted for asset changes, ingestion completion, workflow status, policy exceptions, derived artifact creation, and review decisions.
Storage, index, queue, workflow, AI, and model backends can be swapped without changing externally visible semantics.
Markitect adapter contract tests are part of the extension compatibility posture for markdown-related engine capabilities.

Implemented:

Extension catalog exposes connector, extractor, transformation, event, and backend abstraction readiness.
Extension events can be emitted as audited semantic events.
Markitect adapter provenance and boundary are explicit in export and extension surfaces.
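
One way to read the contract-test requirement is that a single shared check runs against every backend implementation. The sketch below assumes a deliberately tiny key-value port; KeyValueBackend, DictBackend, AppendLogBackend, and check_backend_contract are all hypothetical and far simpler than the engine's real ports.

```python
from typing import Optional, Protocol


class KeyValueBackend(Protocol):
    """Hypothetical port; the engine's real ports may look different."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class DictBackend:
    """Keeps only the latest value per key."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._items[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)


class AppendLogBackend:
    """Stores every write; the latest write for a key wins on read."""

    def __init__(self) -> None:
        self._log: list[tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))

    def get(self, key: str) -> Optional[str]:
        for logged_key, value in reversed(self._log):
            if logged_key == key:
                return value
        return None


def check_backend_contract(backend: KeyValueBackend) -> None:
    """Shared contract: both backends must show identical external behavior."""
    assert backend.get("missing") is None
    backend.put("a", "1")
    assert backend.get("a") == "1"
    backend.put("a", "2")  # overwrite must be visible on the next read
    assert backend.get("a") == "2"


for backend in (DictBackend(), AppendLogBackend()):
    check_backend_contract(backend)
```

The two implementations store data differently, yet the contract only observes put and get, which is the sense in which a backend swap preserves semantics.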

E10.6 - Capture retrieval AI cost and quality signals

id: KONT-WP-0010-T006
status: done
priority: medium
state_hub_task_id: "1d36035a-b211-49e9-935c-382d52aa3639"

Capture retrieval quality, AI operation, and cost signals where available.

Acceptance:

Retrieval metrics include precision hooks, zero-result rate, low-confidence result rate, and feedback counts.
AI usage can record model calls, token or compute usage, provider errors, and estimated operation cost where adapters provide them.
Signals can be attributed to assets, workflows, agents, applications, and actors.

Implemented:

Retrieval quality metrics are exposed in operator metrics and quality/cost reports.
record_quality_signal() captures AI usage, cost, metrics, and attribution dimensions as audit-backed signal events.
quality_cost_signals() aggregates retrieval quality, AI usage, provider error count, and estimated cost.
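
The zero-result and low-confidence rates could be computed along these lines. RetrievalObservation, retrieval_quality, and the 0.5 confidence threshold are assumptions for illustration; the workplan does not state how quality_cost_signals() defines low confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalObservation:
    """Hypothetical per-query observation."""
    result_count: int
    top_score: float  # score of the best result; ignored when there are no results


def retrieval_quality(
    observations: list[RetrievalObservation],
    low_confidence_threshold: float = 0.5,  # assumed cutoff, not from the workplan
) -> dict:
    """Aggregate zero-result and low-confidence rates over a set of queries."""
    total = len(observations)
    if total == 0:
        return {"queries": 0, "zero_result_rate": 0.0, "low_confidence_rate": 0.0}
    zero = sum(1 for obs in observations if obs.result_count == 0)
    low = sum(
        1
        for obs in observations
        if obs.result_count > 0 and obs.top_score < low_confidence_threshold
    )
    return {
        "queries": total,
        "zero_result_rate": zero / total,
        "low_confidence_rate": low / total,
    }


summary = retrieval_quality(
    [
        RetrievalObservation(0, 0.0),
        RetrievalObservation(3, 0.9),
        RetrievalObservation(2, 0.3),
        RetrievalObservation(5, 0.7),
    ]
)
```

Queries with no results are counted only in the zero-result rate, so the two rates do not overlap.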

E10.7 - Add performance smoke tests and MVP compliance report

id: KONT-WP-0010-T007
status: done
priority: medium
state_hub_task_id: "057c7bcf-f224-4d9f-9161-6bfff4948e95"

Create smoke tests and a compliance report against the V0.2 MVP acceptance criteria.

Acceptance:

Smoke tests measure representative ingestion, query, workflow, and export behavior.
MVP compliance report maps implemented behavior to FRS P0 requirements.
Remaining P1/P2 gaps are explicit and prioritized.

Implemented:

performance_smoke_report() summarizes representative ingestion, retrieval, workflow, and export observations.
mvp_compliance_report() maps MVP behavior to observability/recovery, export, governance/audit, and agent-safe operation requirements.
Remaining enterprise-adapter gaps are explicit in the compliance report.
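
A smoke observation of this kind can be as simple as timing a representative operation a few times. The helper below is a sketch, not the engine's performance_smoke_report(); smoke_observation, its result keys, and the placeholder operation are hypothetical.

```python
import time
from typing import Callable


def smoke_observation(name: str, operation: Callable[[], object], repeats: int = 3) -> dict:
    """Run an operation a few times and record the fastest and slowest run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return {
        "name": name,
        "runs": repeats,
        "min_seconds": min(durations),
        "max_seconds": max(durations),
    }


def fake_query() -> int:
    """Placeholder for a representative retrieval call."""
    return sum(range(1000))


observation = smoke_observation("retrieval.query", fake_query)
```

A smoke run like this only shows that the path executes and gives a rough duration; it is not a benchmark.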

Definition Of Done

Operators can inspect, diagnose, recover, export, and evaluate MVP engine behavior through supported surfaces.
Export packages preserve enough context for inspection and migration.
Observability, events, recovery, and export follow docs/architecture-blueprint.md.
python3 -m pytest passes.
