kontextual-engine/workplans/KONT-WP-0005-asset-registry-governance-state.md

id: KONT-WP-0005
type: workplan
title: Asset Registry Governance And Durable State
domain: markitect
repo: kontextual-engine
status: done
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 5
created: 2026-05-05
updated: 2026-05-06
state_hub_workstream_id: 231a7794-aa3b-4763-a556-80b4cea731c8

KONT-WP-0005: Asset Registry Governance And Durable State

Purpose

Implement the governed knowledge asset registry that underpins the V0.2 product vision: stable asset identity, source references, source/normalized/derived representations, metadata, classification, lifecycle state, actors, authorization checks, audit events, versioning, and durable local-first state.

Requirement Coverage

Primary: FR-001 to FR-010, FR-040 to FR-049, FR-120 to FR-126, FR-140 to FR-145, FR-240 to FR-245.

Supporting: FR-180 to FR-182, FR-200 to FR-201.

Architecture Constraint

Implement this slice through the domain core, application services, repository ports, policy port, audit port, and SQLite/in-memory adapters described in docs/architecture-blueprint.md. The asset registry must not depend on HTTP, source connectors, document extractors, search backends, or AI providers.
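As an illustration of the constraint above, here is a minimal sketch of a repository port with an in-memory adapter. The names `Asset`, `AssetRepository`, `InMemoryAssetRepository`, and `AssetRegistryService` are hypothetical and are not taken from docs/architecture-blueprint.md; the point is only that the service depends on a port, never on a concrete backend.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Asset:
    """Hypothetical minimal domain entity."""
    asset_id: str
    title: str


class AssetRepository(Protocol):
    """Hypothetical repository port: the only persistence surface the service sees."""

    def save(self, asset: Asset) -> None: ...

    def get(self, asset_id: str) -> Optional[Asset]: ...


class InMemoryAssetRepository:
    """Adapter for deterministic unit tests. A SQLite adapter would satisfy the same port."""

    def __init__(self) -> None:
        self._assets: dict[str, Asset] = {}

    def save(self, asset: Asset) -> None:
        self._assets[asset.asset_id] = asset

    def get(self, asset_id: str) -> Optional[Asset]:
        return self._assets.get(asset_id)


class AssetRegistryService:
    """Application service: knows the port, not HTTP, connectors, or search backends."""

    def __init__(self, assets: AssetRepository) -> None:
        self._assets = assets

    def create(self, asset_id: str, title: str) -> Asset:
        asset = Asset(asset_id, title)
        self._assets.save(asset)
        return asset
```

Because the service only receives the port in its constructor, swapping the in-memory adapter for a durable one should not require changes to the service itself.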

markitect-tool Boundary Remark

The asset registry may persist Markitect snapshot IDs, parser metadata, frontmatter-derived metadata, selector references, and operation provenance as adapter metadata on representations or versions. It must not make Markitect document classes canonical engine entities, and asset identity must remain independent of Markitect snapshot identity.

Markdown proxy documents are valid source, normalized, or derived representations for assets when Markitect selectors, contracts, document schemas, or workflows are useful. They remain adapter representations under engine governance; the registry still owns identity, metadata, lifecycle, policy, lineage, and audit.

Implementation Note

The first registry slice is recorded in docs/asset-registry-implementation.md. It establishes repository ports, memory and SQLite adapters, and the asset registry service for create, metadata, representation, lifecycle, relationship, idempotency, policy, audit, versions, and durable reload behavior.

Implementation Status

As of 2026-05-06, the registry core has a working asset service, in-memory and SQLite repositories, policy gateway boundary, audit events, versions, representations, metadata records, context entities, asset/context relationships, idempotent asset creation, and custom metadata schema validation before registry writes. It now also includes a durable metadata schema registry and assignment rules for policy-selected validation, structured operation failures, metadata batch partial-failure envelopes, and durable SQLite reference checks for versions, audit actors, ingestion job actors, metadata schema assignments, and relationship targets. This foundation workplan is complete; enterprise policy adapters, richer policy-assignment language, and production concurrency controls are intentionally left to adjacent workplans.

G5.1 - Implement stable asset identity and source references

id: KONT-WP-0005-T001
status: done
priority: high
state_hub_task_id: "7d61a11c-ca14-4075-ab0b-897bdfe57cb1"

Replace artifact-centric naming with knowledge asset identity that survives rename, move, re-ingestion, representation changes, and transformation.

Acceptance:

  • Assets have stable IDs, source references, source aliases, and content digests.
  • Source system, source path/URL/external ID, checksum, ingestion actor, and ingestion time can be represented.
  • Existing artifact tests are migrated or wrapped without losing deterministic digest behavior.
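The acceptance criteria above can be sketched roughly as follows. This is an assumption-laden illustration, not the registry's actual model: `KnowledgeAsset`, `SourceReference`, `content_digest`, the `sha256:` prefix, and the UUID-based ID are all hypothetical choices made for the example.

```python
from __future__ import annotations

import hashlib
import uuid
from dataclasses import dataclass, field


def content_digest(content: bytes) -> str:
    """Deterministic digest: the same bytes always yield the same string."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


@dataclass
class SourceReference:
    source_system: str
    locator: str          # path, URL, or external ID
    checksum: str
    ingestion_actor: str
    ingested_at: str      # ISO 8601 timestamp


@dataclass
class KnowledgeAsset:
    source: SourceReference
    # The ID is minted once and never derived from the locator or the content.
    asset_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    source_aliases: list[str] = field(default_factory=list)

    def move(self, new_locator: str) -> None:
        """Record a rename or move: keep the ID, retain the old locator as an alias."""
        self.source_aliases.append(self.source.locator)
        self.source.locator = new_locator
```

Keeping the ID independent of path and digest is what lets identity survive rename, move, and re-ingestion in this sketch, while the digest stays a pure function of content.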

G5.2 - Represent source, normalized, and derived asset forms

id: KONT-WP-0005-T002
status: done
priority: high
state_hub_task_id: "cd0a2b0a-a2a0-426e-8b8c-6013cd6b9303"

Introduce explicit representation records for original/source-near content, normalized engine content, and derived artifacts.

Acceptance:

  • Retrieval can distinguish source content from normalized content.
  • Derived artifacts are stored as asset-linked records, not detached strings.
  • Representation metadata includes media type, digest, size, extractor or producer, and provenance.
  • Markdown representation metadata can include serialized Markitect snapshot identity without coupling engine identity to it.
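One way to picture the representation records described above is the sketch below. `RepresentationKind`, `Representation`, `select`, and the `markitect_snapshot_id` key are hypothetical names chosen for illustration; the actual record shape may differ.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from enum import Enum


class RepresentationKind(Enum):
    SOURCE = "source"
    NORMALIZED = "normalized"
    DERIVED = "derived"


@dataclass(frozen=True)
class Representation:
    asset_id: str                 # link back to the governed asset
    kind: RepresentationKind
    media_type: str
    content: bytes
    producer: str                 # extractor or producer that created this form
    provenance: str
    # Adapter-specific details (for example a Markitect snapshot ID) live here,
    # so engine identity never depends on them.
    adapter_metadata: dict = field(default_factory=dict)

    @property
    def digest(self) -> str:
        return "sha256:" + hashlib.sha256(self.content).hexdigest()

    @property
    def size(self) -> int:
        return len(self.content)


def select(reps: list[Representation], asset_id: str,
           kind: RepresentationKind) -> list[Representation]:
    """Return the representations of one kind for one asset."""
    return [r for r in reps if r.asset_id == asset_id and r.kind is kind]
```

In this sketch a derived artifact is always a `Representation` carrying an `asset_id`, which is one simple way to avoid detached strings.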

G5.3 - Implement metadata, classification, lifecycle, and schema validation

id: KONT-WP-0005-T003
status: done
priority: high
state_hub_task_id: "b06c5124-ce54-4241-b712-2fbab856877b"

Implement standard metadata, custom metadata schemas, classification, sensitivity, lifecycle state, tags, ownership, and validation behavior.

Acceptance:

  • Assets can be filtered by standard metadata and lifecycle state.
  • Custom schema validation produces structured validation errors.
  • Inferred and confirmed metadata can be distinguished for later review flows.
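A minimal sketch of custom schema validation that returns structured errors instead of raising on the first problem. The schema format (`FieldRule`), the `ValidationIssue` shape, and the error codes are assumptions made for this example and do not describe the registry's real schema language.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical schema rule: expected Python type and whether the field is required."""
    type_: type
    required: bool = False


@dataclass(frozen=True)
class ValidationIssue:
    field_name: str
    code: str
    message: str


def validate_metadata(schema: dict[str, FieldRule],
                      metadata: dict[str, object]) -> list[ValidationIssue]:
    """Collect every issue so callers can report them all at once."""
    issues: list[ValidationIssue] = []
    for name, rule in schema.items():
        if name not in metadata:
            if rule.required:
                issues.append(ValidationIssue(name, "missing_required",
                                              f"'{name}' is required"))
            continue
        if not isinstance(metadata[name], rule.type_):
            issues.append(ValidationIssue(
                name, "wrong_type",
                f"'{name}' must be of type {rule.type_.__name__}"))
    for name in metadata:
        if name not in schema:
            issues.append(ValidationIssue(name, "unknown_field",
                                          f"'{name}' is not defined by the schema"))
    return issues
```

An empty list means the metadata passed; a service following this pattern would run the check before any registry write and surface the issues to the caller.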

G5.4 - Implement actor authorization and policy baseline

id: KONT-WP-0005-T004
status: done
priority: high
state_hub_task_id: "c86e24ee-7e3f-488d-a649-d17a8689f0af"

Add actor and authorization context models for humans, applications, automation, service accounts, and AI agents.

Acceptance:

  • Operations accept explicit actor context.
  • Role, group, sensitivity, lifecycle, source-policy, and operation type can participate in policy checks.
  • Ambiguous permission state fails closed by contract.
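The fail-closed contract can be illustrated with the sketch below. `ActorContext`, `Decision`, the `RULES` table, and the role names are hypothetical; only operation type, sensitivity, and roles participate here, whereas the criteria above also list group, lifecycle, and source-policy.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


@dataclass(frozen=True)
class ActorContext:
    actor_id: str
    actor_type: str  # e.g. "human", "application", "automation", "service_account", "ai_agent"
    roles: frozenset[str] = frozenset()


# Hypothetical rule table: (operation, sensitivity) -> roles that may perform it.
RULES: dict[tuple[str, str], frozenset[str]] = {
    ("read", "internal"): frozenset({"reader", "editor"}),
    ("update", "internal"): frozenset({"editor"}),
    ("read", "restricted"): frozenset({"steward"}),
}


def authorize(actor: Optional[ActorContext], operation: str,
              sensitivity: Optional[str]) -> Decision:
    """Deny unless an explicit rule grants access; missing inputs also deny."""
    if actor is None or sensitivity is None:
        return Decision.DENY          # ambiguous state fails closed
    allowed_roles = RULES.get((operation, sensitivity))
    if allowed_roles is None:
        return Decision.DENY          # no rule for this combination
    return Decision.ALLOW if actor.roles & allowed_roles else Decision.DENY
```

The design choice worth noting is that every path that does not find an explicit grant returns `DENY`, so adding a new sensitivity level or operation cannot accidentally open access.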

G5.5 - Implement audit events, correlation IDs, and structured errors

id: KONT-WP-0005-T005
status: done
priority: high
state_hub_task_id: "3d2e98a1-3312-452a-a5f1-f7a73234b45b"

Create audit and correctness primitives for material operations.

Acceptance:

  • Asset create, ingest, update, delete/retire, metadata, relationship, permission, query, transformation, workflow, export, and agent operations can emit audit events through the shared audit primitives as those operation services land.
  • Structured errors include code, message, correlation ID, operation, and remediation hint where practical.
  • Partial failures are represented for batch operations.

Implemented registry baseline:

  • Registry mutations emit correlated audit events with success, denied, and partial outcomes where applicable.
  • OperationFailure, BatchItemResult, and BatchOperationResult provide the reusable structured error and batch envelope primitives.
  • Metadata batch updates return per-item diagnostics, preserve successful writes, skip failed writes, and emit a final batch audit event with counts and failed item IDs.
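The batch behavior described above might look roughly like this. The class names `OperationFailure`, `BatchItemResult`, and `BatchOperationResult` come from this workplan, but their fields, the outcome strings, and the `update_metadata_batch` helper are assumptions made for the sketch, not the implemented signatures.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperationFailure:
    code: str
    message: str
    correlation_id: str
    operation: str
    remediation: Optional[str] = None


@dataclass(frozen=True)
class BatchItemResult:
    item_id: str
    failure: Optional[OperationFailure] = None

    @property
    def ok(self) -> bool:
        return self.failure is None


@dataclass(frozen=True)
class BatchOperationResult:
    correlation_id: str
    items: tuple[BatchItemResult, ...]

    @property
    def failed_item_ids(self) -> tuple[str, ...]:
        return tuple(i.item_id for i in self.items if not i.ok)

    @property
    def outcome(self) -> str:
        failed = len(self.failed_item_ids)
        if failed == 0:
            return "success"
        return "failed" if failed == len(self.items) else "partial"


def update_metadata_batch(store: dict[str, dict], updates: dict[str, dict],
                          correlation_id: str) -> BatchOperationResult:
    """Apply each patch independently: keep successes, skip and report failures."""
    results = []
    for asset_id, patch in updates.items():
        if asset_id not in store:
            results.append(BatchItemResult(asset_id, OperationFailure(
                "asset_not_found", f"no asset with id {asset_id}",
                correlation_id, "metadata.update",
                "check the asset ID or create the asset first")))
            continue
        store[asset_id].update(patch)
        results.append(BatchItemResult(asset_id))
    return BatchOperationResult(correlation_id, tuple(results))
```

A caller could then emit one final audit event from `outcome`, the item counts, and `failed_item_ids`, all sharing the batch's correlation ID.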

G5.6 - Implement durable SQLite repository for registry state

id: KONT-WP-0005-T006
status: done
priority: high
state_hub_task_id: "de155d02-3123-42da-8ede-f111bec62747"

Implement a local-first durable backend for assets, representations, metadata, classifications, relationships, actors, policies, audit events, and versions.

Acceptance:

  • State survives repository re-instantiation.
  • Referential integrity is enforced for assets, relationships, representations, versions, and audit references.
  • The in-memory backend remains useful for deterministic unit tests.

Implemented registry baseline:

  • SQLite persists assets, representations, metadata records, metadata schemas, schema assignments, context entities, relationships, versions, audit events, idempotency records, and ingestion jobs.
  • SQLite reload tests cover asset state, relationships, context entities, idempotency, schema assignments, metadata filters, ingestion jobs, and batch partial audit state.
  • Direct durable reference failures for versions, audit actors, and ingestion job actors raise structured ValidationError diagnostics instead of leaking raw SQLite integrity errors.
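The following sketch shows one way to get durable reload, enforced references, and structured diagnostics with the standard `sqlite3` module. The two-table schema, the `open_registry` and `add_version` helpers, and this `ValidationError` class are hypothetical simplifications, not the registry's actual schema or exception type.

```python
import sqlite3


class ValidationError(Exception):
    """Hypothetical structured diagnostic raised instead of a raw SQLite error."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


SCHEMA = """
CREATE TABLE IF NOT EXISTS assets (
    asset_id TEXT PRIMARY KEY,
    title    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS versions (
    version_id TEXT PRIMARY KEY,
    asset_id   TEXT NOT NULL REFERENCES assets(asset_id),
    digest     TEXT NOT NULL
);
"""


def open_registry(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is set on the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_version(conn: sqlite3.Connection, version_id: str,
                asset_id: str, digest: str) -> None:
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO versions VALUES (?, ?, ?)",
                         (version_id, asset_id, digest))
    except sqlite3.IntegrityError as exc:
        # Translate the backend error into a structured diagnostic.
        raise ValidationError(
            "integrity_violation",
            f"version {version_id} could not be stored for asset {asset_id}: {exc}",
        ) from exc
```

Opening the same file path a second time with `open_registry` returns the previously committed rows, which is the re-instantiation behavior the acceptance criteria ask for.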

G5.7 - Implement versioning, change history, conflict, and idempotency semantics

id: KONT-WP-0005-T007
status: done
priority: medium
state_hub_task_id: "5288b136-05c1-449c-9215-f8b34db8b274"

Add version and change history semantics for asset content, metadata, relationships, policy-relevant lifecycle state, and repeated requests.

Acceptance:

  • Updates create traceable change records.
  • Restore creates a new auditable change rather than erasing history.
  • Idempotency keys prevent unintended duplicate writes, and conflict detection prevents stale writes, wherever either would be harmful.
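These semantics can be sketched with an append-only history, an expected-version check, and an idempotency table. `VersionedStore`, `ChangeRecord`, and `ConflictError` are hypothetical names, and the sketch deliberately omits detecting a reused key with a different payload.

```python
from __future__ import annotations

from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when a write is based on a stale version."""


@dataclass(frozen=True)
class ChangeRecord:
    asset_id: str
    version: int
    content: str
    kind: str  # "update" or "restore"


class VersionedStore:
    def __init__(self) -> None:
        self.history: list[ChangeRecord] = []            # append-only change log
        self._current: dict[str, ChangeRecord] = {}      # latest record per asset
        self._idempotency: dict[str, ChangeRecord] = {}  # key -> recorded result

    def write(self, asset_id: str, content: str, expected_version: int,
              idempotency_key: str, kind: str = "update") -> ChangeRecord:
        # A repeated request returns the original result without writing again.
        if idempotency_key in self._idempotency:
            return self._idempotency[idempotency_key]
        current = self._current.get(asset_id)
        current_version = current.version if current else 0
        if expected_version != current_version:
            raise ConflictError(
                f"expected version {expected_version}, found {current_version}")
        record = ChangeRecord(asset_id, current_version + 1, content, kind)
        self.history.append(record)
        self._current[asset_id] = record
        self._idempotency[idempotency_key] = record
        return record

    def restore(self, asset_id: str, version: int,
                idempotency_key: str) -> ChangeRecord:
        """Restore by writing a new version that copies old content; history is kept."""
        for old in self.history:
            if old.asset_id == asset_id and old.version == version:
                latest = self._current[asset_id].version
                return self.write(asset_id, old.content, latest,
                                  idempotency_key, kind="restore")
        raise LookupError(f"{asset_id} has no version {version}")
```

In this sketch a restore is just another write, so it gets its own version number and appears in the history like any other change.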

Definition Of Done

  • Asset lifecycle tests cover create, retrieve, update, retire, delete request, metadata changes, permission checks, audit events, and durable reload.
  • New models map to the V0.2 FRS vocabulary.
  • The implemented package shape follows docs/architecture-blueprint.md or documents any deliberate deviation.
  • python3 -m pytest passes.