---
id: KONT-WP-0005
type: workplan
title: "Asset Registry Governance And Durable State"
domain: markitect
repo: kontextual-engine
status: done
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 5
created: "2026-05-05"
updated: "2026-05-06"
state_hub_workstream_id: "231a7794-aa3b-4763-a556-80b4cea731c8"
---
# KONT-WP-0005: Asset Registry Governance And Durable State
## Purpose
Implement the governed knowledge asset registry that underpins the V0.2 product
vision: stable asset identity, source references, source/normalized/derived
representations, metadata, classification, lifecycle state, actors,
authorization checks, audit events, versioning, and durable local-first state.
## Requirement Coverage
Primary: FR-001 to FR-010, FR-040 to FR-049, FR-120 to FR-126,
FR-140 to FR-145, FR-240 to FR-245.
Supporting: FR-180 to FR-182, FR-200 to FR-201.
## Architecture Constraint
Implement this slice through the domain core, application services, repository
ports, policy port, audit port, and SQLite/in-memory adapters described in
`docs/architecture-blueprint.md`. The asset registry must not depend on HTTP,
source connectors, document extractors, search backends, or AI providers.
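
A minimal sketch of the port/adapter split described above, assuming hypothetical names (`AssetRecord`, `AssetRepository`, `PolicyPort`, `AuditPort`, `AssetService`) that are not taken from `docs/architecture-blueprint.md`. It only illustrates how the service can depend on ports rather than on HTTP, connectors, or storage details:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class AssetRecord:
    asset_id: str
    title: str


# Ports: the application service only sees these interfaces (hypothetical names).
class AssetRepository(Protocol):
    def save(self, asset: AssetRecord) -> None: ...
    def get(self, asset_id: str) -> AssetRecord | None: ...


class PolicyPort(Protocol):
    def is_allowed(self, actor: str, operation: str) -> bool: ...


class AuditPort(Protocol):
    def record(self, actor: str, operation: str, outcome: str) -> None: ...


# Adapters: simple in-memory implementations suitable for unit tests.
class InMemoryAssetRepository:
    def __init__(self) -> None:
        self._rows: dict[str, AssetRecord] = {}

    def save(self, asset: AssetRecord) -> None:
        self._rows[asset.asset_id] = asset

    def get(self, asset_id: str) -> AssetRecord | None:
        return self._rows.get(asset_id)


class AllowListPolicy:
    def __init__(self, allowed: set[tuple[str, str]]) -> None:
        self._allowed = allowed

    def is_allowed(self, actor: str, operation: str) -> bool:
        return (actor, operation) in self._allowed


class ListAudit:
    def __init__(self) -> None:
        self.events: list[tuple[str, str, str]] = []

    def record(self, actor: str, operation: str, outcome: str) -> None:
        self.events.append((actor, operation, outcome))


class AssetService:
    """Application service wired only to ports, never to concrete adapters."""

    def __init__(self, repo: AssetRepository, policy: PolicyPort, audit: AuditPort) -> None:
        self._repo, self._policy, self._audit = repo, policy, audit

    def create(self, actor: str, asset: AssetRecord) -> bool:
        if not self._policy.is_allowed(actor, "asset.create"):
            self._audit.record(actor, "asset.create", "denied")
            return False
        self._repo.save(asset)
        self._audit.record(actor, "asset.create", "success")
        return True
```

Swapping `InMemoryAssetRepository` for a SQLite-backed class with the same two methods would leave `AssetService` unchanged, which is the point of the constraint.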
## markitect-tool Boundary Remark
The asset registry may persist Markitect snapshot IDs, parser metadata,
frontmatter-derived metadata, selector references, and operation provenance as
adapter metadata on representations or versions. It must not make Markitect
document classes canonical engine entities, and asset identity must remain
independent of Markitect snapshot identity.
Markdown proxy documents are valid source, normalized, or derived
representations for assets when Markitect selectors, contracts, document
schemas, or workflows are useful. They remain adapter representations under
engine governance; the registry still owns identity, metadata, lifecycle,
policy, lineage, and audit.
## Implementation Note
The first registry slice is recorded in
`docs/asset-registry-implementation.md`. It establishes repository ports,
in-memory and SQLite adapters, and an asset registry service that covers asset
creation, metadata, representations, lifecycle, relationships, idempotency,
policy, audit, versions, and durable reload behavior.
## Implementation Status
As of 2026-05-06, the registry core has a working asset service, in-memory and
SQLite repositories, policy gateway boundary, audit events, versions,
representations, metadata records, context entities, asset/context
relationships, idempotent asset creation, and custom metadata schema
validation before registry writes. It now also includes a durable metadata
schema registry and assignment rules for policy-selected validation, structured
operation failures, metadata batch partial-failure envelopes, and durable
SQLite reference checks for versions, audit actors, ingestion job actors,
metadata schema assignments, and relationship targets. This foundation
workplan is complete; enterprise policy adapters, richer policy-assignment
language, and production concurrency controls are intentionally left to
adjacent workplans.
## G5.1 - Implement stable asset identity and source references
```task
id: KONT-WP-0005-T001
status: done
priority: high
state_hub_task_id: "7d61a11c-ca14-4075-ab0b-897bdfe57cb1"
```
Replace artifact-centric naming with knowledge asset identity that survives
rename, move, re-ingestion, representation changes, and transformation.

Acceptance:
- Assets have stable IDs, source references, source aliases, and content
digests.
- Source system, source path/URL/external ID, checksum, ingestion actor, and
ingestion time can be represented.
- Existing artifact tests are migrated or wrapped without losing deterministic
digest behavior.
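
The identity rules above can be sketched as follows. This is an illustration under stated assumptions, not the registry's actual model: `Asset`, `SourceReference`, `relocate`, `reingest`, and the `sha256:` digest prefix are hypothetical choices made for the example.

```python
from __future__ import annotations

import hashlib
import uuid
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone


def content_digest(data: bytes) -> str:
    """Deterministic digest: identical bytes always yield the same value."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class SourceReference:
    source_system: str
    locator: str          # path, URL, or external ID
    checksum: str
    ingested_by: str
    ingested_at: datetime


@dataclass
class Asset:
    asset_id: str         # assigned once, never derived from path or content
    source: SourceReference
    digest: str
    aliases: list[str] = field(default_factory=list)

    def relocate(self, new_locator: str) -> None:
        # A rename or move keeps the ID and remembers the old locator as an alias.
        self.aliases.append(self.source.locator)
        self.source = replace(self.source, locator=new_locator)

    def reingest(self, data: bytes, actor: str) -> None:
        # Re-ingestion refreshes digest and ingestion facts, not identity.
        self.digest = content_digest(data)
        self.source = replace(
            self.source,
            checksum=self.digest,
            ingested_by=actor,
            ingested_at=datetime.now(timezone.utc),
        )


def create_asset(source_system: str, locator: str, data: bytes, actor: str) -> Asset:
    digest = content_digest(data)
    source = SourceReference(
        source_system, locator, digest, actor, datetime.now(timezone.utc)
    )
    return Asset(asset_id=str(uuid.uuid4()), source=source, digest=digest)
```

Because `asset_id` is a random UUID rather than a hash of the path or bytes, neither `relocate` nor `reingest` can change it.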
## G5.2 - Represent source, normalized, and derived asset forms
```task
id: KONT-WP-0005-T002
status: done
priority: high
state_hub_task_id: "cd0a2b0a-a2a0-426e-8b8c-6013cd6b9303"
```
Introduce explicit representation records for original/source-near content,
normalized engine content, and derived artifacts.

Acceptance:
- Retrieval can distinguish source content from normalized content.
- Derived artifacts are stored as asset-linked records, not detached strings.
- Representation metadata includes media type, digest, size, extractor or
producer, and provenance.
- Markdown representation metadata can include serialized Markitect snapshot
identity without coupling engine identity to it.
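
One way to picture these representation records is sketched below. The class and field names (`Representation`, `RepresentationKind`, `adapter_metadata`, `representations_of`) and the `markitect_snapshot_id` key are assumptions for illustration, not the registry's actual schema.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from enum import Enum


class RepresentationKind(Enum):
    SOURCE = "source"
    NORMALIZED = "normalized"
    DERIVED = "derived"


@dataclass(frozen=True)
class Representation:
    asset_id: str                 # link back to the owning asset
    kind: RepresentationKind
    media_type: str
    content: bytes
    producer: str                 # extractor or producer that created it
    provenance: dict[str, str] = field(default_factory=dict)
    # Adapter-specific details live here, so engine identity stays independent.
    adapter_metadata: dict[str, str] = field(default_factory=dict)

    @property
    def digest(self) -> str:
        return "sha256:" + hashlib.sha256(self.content).hexdigest()

    @property
    def size(self) -> int:
        return len(self.content)


def representations_of(
    records: list[Representation], asset_id: str, kind: RepresentationKind
) -> list[Representation]:
    """Return only the representations of one kind for one asset."""
    return [r for r in records if r.asset_id == asset_id and r.kind is kind]
```

Keeping the snapshot identity inside `adapter_metadata` means a new Markitect snapshot changes a representation record, not the asset ID.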
## G5.3 - Implement metadata, classification, lifecycle, and schema validation
```task
id: KONT-WP-0005-T003
status: done
priority: high
state_hub_task_id: "b06c5124-ce54-4241-b712-2fbab856877b"
```
Implement standard metadata, custom metadata schemas, classification,
sensitivity, lifecycle state, tags, ownership, and validation behavior.

Acceptance:
- Assets can be filtered by standard metadata and lifecycle state.
- Custom schema validation produces structured validation errors.
- Inferred and confirmed metadata can be distinguished for later review flows.
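
As a rough illustration of structured validation errors, the sketch below checks metadata against a tiny hand-rolled schema. `FieldRule`, `ValidationIssue`, `validate_metadata`, and the error codes are hypothetical; the real schema language may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldRule:
    type_: type
    required: bool = False


@dataclass(frozen=True)
class ValidationIssue:
    field: str
    code: str
    message: str


def validate_metadata(
    schema: dict[str, FieldRule], metadata: dict[str, Any]
) -> list[ValidationIssue]:
    """Collect every problem instead of raising on the first one."""
    issues: list[ValidationIssue] = []
    for name, rule in schema.items():
        if name not in metadata:
            if rule.required:
                issues.append(ValidationIssue(name, "required", f"{name} is required"))
            continue
        if not isinstance(metadata[name], rule.type_):
            expected = rule.type_.__name__
            issues.append(
                ValidationIssue(name, "type_mismatch", f"{name} must be {expected}")
            )
    # Fields the schema does not declare are reported, in a stable order.
    for name in sorted(metadata.keys() - schema.keys()):
        issues.append(ValidationIssue(name, "unknown_field", f"{name} is not in the schema"))
    return issues
```

Returning a list of issues, rather than raising, lets a caller reject the write before it reaches the registry and still report every failing field at once.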
## G5.4 - Implement actor authorization and policy baseline
```task
id: KONT-WP-0005-T004
status: done
priority: high
state_hub_task_id: "c86e24ee-7e3f-488d-a649-d17a8689f0af"
```
Add actor and authorization context models for humans, applications,
automation, service accounts, and AI agents.

Acceptance:
- Operations accept explicit actor context.
- Role, group, sensitivity, lifecycle, source-policy, and operation type can
participate in policy checks.
- Ambiguous permission state fails closed by contract.
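
The fail-closed contract can be sketched as a rule evaluator in which "no rule applies" is treated as a denial. Everything here (`ActorContext`, `PolicyRequest`, `evaluate`, and the two sample rules) is a hypothetical illustration, not the registry's policy gateway.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ActorKind(Enum):
    HUMAN = "human"
    APPLICATION = "application"
    AUTOMATION = "automation"
    SERVICE_ACCOUNT = "service_account"
    AI_AGENT = "ai_agent"


@dataclass(frozen=True)
class ActorContext:
    actor_id: str
    kind: ActorKind
    roles: frozenset[str] = frozenset()


@dataclass(frozen=True)
class PolicyRequest:
    actor: ActorContext
    operation: str
    sensitivity: str
    lifecycle_state: str


# A rule returns True (allow), False (deny), or None (no opinion).
Rule = Callable[[PolicyRequest], Optional[bool]]


def editors_may_update(req: PolicyRequest) -> Optional[bool]:
    if req.operation == "asset.update" and "editor" in req.actor.roles:
        return True
    return None


def agents_blocked_from_restricted(req: PolicyRequest) -> Optional[bool]:
    if req.actor.kind is ActorKind.AI_AGENT and req.sensitivity == "restricted":
        return False
    return None


def evaluate(rules: list[Rule], req: PolicyRequest) -> bool:
    verdicts = [v for v in (rule(req) for rule in rules) if v is not None]
    if not verdicts:
        return False          # ambiguous: nothing applies, so fail closed
    return all(verdicts)      # any explicit deny overrides an allow
```

The key design choice is that permission must be granted explicitly: an actor that no rule recognizes is denied rather than allowed by default.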
## G5.5 - Implement audit events, correlation IDs, and structured errors
```task
id: KONT-WP-0005-T005
status: done
priority: high
state_hub_task_id: "3d2e98a1-3312-452a-a5f1-f7a73234b45b"
```
Create audit and correctness primitives for material operations.

Acceptance:
- Asset create, ingest, update, delete/retire, metadata, relationship,
permission, query, transformation, workflow, export, and agent operations can
emit audit events through the shared audit primitives as those operation
services land.
- Structured errors include code, message, correlation ID, operation, and
remediation hint where practical.
- Partial failures are represented for batch operations.

Implemented registry baseline:
- Registry mutations emit correlated audit events with `success`, `denied`, and
`partial` outcomes where applicable.
- `OperationFailure`, `BatchItemResult`, and `BatchOperationResult` provide the
reusable structured error and batch envelope primitives.
- Metadata batch updates return per-item diagnostics, preserve successful
writes, skip failed writes, and emit a final batch audit event with counts and
failed item IDs.
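
The batch behavior described above might look roughly like this. The class names `OperationFailure`, `BatchItemResult`, and `BatchOperationResult` come from this workplan, but their fields, the `failed` outcome value, and the `update_metadata_batch` helper are assumptions made for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperationFailure:
    code: str
    message: str
    correlation_id: str
    operation: str
    remediation: Optional[str] = None


@dataclass(frozen=True)
class BatchItemResult:
    item_id: str
    failure: Optional[OperationFailure] = None

    @property
    def ok(self) -> bool:
        return self.failure is None


@dataclass(frozen=True)
class BatchOperationResult:
    correlation_id: str
    items: tuple[BatchItemResult, ...]

    @property
    def failed_item_ids(self) -> list[str]:
        return [item.item_id for item in self.items if not item.ok]

    @property
    def outcome(self) -> str:
        failed = len(self.failed_item_ids)
        if failed == 0:
            return "success"
        # "failed" for an all-failed batch is an assumption of this sketch.
        return "partial" if failed < len(self.items) else "failed"


def update_metadata_batch(
    store: dict[str, dict], updates: dict[str, dict],
    correlation_id: str, audit_log: list[dict],
) -> BatchOperationResult:
    """Apply each update independently; one bad item does not undo the rest."""
    items = []
    for asset_id, patch in updates.items():
        if asset_id not in store:
            failure = OperationFailure(
                "asset.not_found", f"unknown asset {asset_id}", correlation_id,
                "metadata.update", remediation="check the asset ID",
            )
            items.append(BatchItemResult(asset_id, failure))   # write skipped
            continue
        store[asset_id].update(patch)                          # write preserved
        items.append(BatchItemResult(asset_id))
    result = BatchOperationResult(correlation_id, tuple(items))
    # One final audit event summarizes the whole batch.
    audit_log.append({
        "correlation_id": correlation_id, "outcome": result.outcome,
        "succeeded": len(items) - len(result.failed_item_ids),
        "failed": len(result.failed_item_ids),
        "failed_item_ids": result.failed_item_ids,
    })
    return result
```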
## G5.6 - Implement durable SQLite repository for registry state
```task
id: KONT-WP-0005-T006
status: done
priority: high
state_hub_task_id: "de155d02-3123-42da-8ede-f111bec62747"
```
Implement a local-first durable backend for assets, representations, metadata,
classifications, relationships, actors, policies, audit events, and versions.

Acceptance:
- State survives repository re-instantiation.
- Referential integrity is enforced for assets, relationships, representations,
versions, and audit references.
- The in-memory backend remains useful for deterministic unit tests.

Implemented registry baseline:
- SQLite persists assets, representations, metadata records, metadata schemas,
schema assignments, context entities, relationships, versions, audit events,
idempotency records, and ingestion jobs.
- SQLite reload tests cover asset state, relationships, context entities,
idempotency, schema assignments, metadata filters, ingestion jobs, and batch
partial audit state.
- Direct durable reference failures for versions, audit actors, and ingestion
job actors raise structured `ValidationError` diagnostics instead of leaking
raw SQLite integrity errors.
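
A small sketch of the two durable-backend properties above, using only the standard `sqlite3` module. The two-table schema, the `SqliteRegistry` class, and the error codes are hypothetical; `ValidationError` here is a local stand-in for the registry's own exception of that name.

```python
import sqlite3


class ValidationError(Exception):
    """Stand-in for the registry's structured diagnostic (assumed shape)."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


SCHEMA = """
CREATE TABLE IF NOT EXISTS assets (
    asset_id TEXT PRIMARY KEY,
    title    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS versions (
    version_id TEXT PRIMARY KEY,
    asset_id   TEXT NOT NULL REFERENCES assets(asset_id),
    note       TEXT NOT NULL
);
"""


class SqliteRegistry:
    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        # SQLite only enforces foreign keys when this is enabled per connection.
        self._conn.execute("PRAGMA foreign_keys = ON")
        self._conn.executescript(SCHEMA)

    def add_asset(self, asset_id: str, title: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute("INSERT INTO assets VALUES (?, ?)", (asset_id, title))

    def add_version(self, version_id: str, asset_id: str, note: str) -> None:
        try:
            with self._conn:
                self._conn.execute(
                    "INSERT INTO versions VALUES (?, ?, ?)", (version_id, asset_id, note)
                )
        except sqlite3.IntegrityError as exc:
            # Translate the raw driver error into a structured diagnostic.
            if "FOREIGN KEY" in str(exc):
                raise ValidationError(
                    "version.unknown_asset", f"asset {asset_id} does not exist"
                ) from exc
            raise ValidationError(
                "version.duplicate", f"version {version_id} already exists"
            ) from exc

    def titles(self) -> dict[str, str]:
        rows = self._conn.execute("SELECT asset_id, title FROM assets")
        return {asset_id: title for asset_id, title in rows}

    def close(self) -> None:
        self._conn.close()
```

Opening a second `SqliteRegistry` on the same file path after closing the first should return the previously stored rows, which is the reload behavior the acceptance criteria ask for.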
## G5.7 - Implement versioning, change history, conflict, and idempotency semantics
```task
id: KONT-WP-0005-T007
status: done
priority: medium
state_hub_task_id: "5288b136-05c1-449c-9215-f8b34db8b274"
```
Add version and change history semantics for asset content, metadata,
relationships, policy-relevant lifecycle state, and repeated requests.

Acceptance:
- Updates create traceable change records.
- Restore creates a new auditable change rather than erasing history.
- Idempotency keys and conflict detection prevent unintended duplicate or stale
writes where harmful.
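
These three rules can be illustrated together in one small in-memory sketch. `VersionedAsset`, `Change`, `ConflictError`, and the `expected_version` / `idempotency_key` parameters are hypothetical names chosen for the example, not the registry's API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when a write is based on a stale version."""


@dataclass(frozen=True)
class Change:
    version: int
    content: str
    actor: str
    reason: str


@dataclass
class VersionedAsset:
    history: list[Change] = field(default_factory=list)
    _seen_keys: dict[str, Change] = field(default_factory=dict)

    @property
    def current_version(self) -> int:
        return self.history[-1].version if self.history else 0

    def write(
        self, content: str, actor: str, *,
        expected_version: int, idempotency_key: str, reason: str = "update",
    ) -> Change:
        # A repeated request with a known key replays the original result.
        if idempotency_key in self._seen_keys:
            return self._seen_keys[idempotency_key]
        # Optimistic concurrency: reject writes based on an outdated version.
        if expected_version != self.current_version:
            raise ConflictError(
                f"expected v{expected_version}, current is v{self.current_version}"
            )
        change = Change(self.current_version + 1, content, actor, reason)
        self.history.append(change)
        self._seen_keys[idempotency_key] = change
        return change

    def restore(self, version: int, actor: str, *, idempotency_key: str) -> Change:
        target = next((c for c in self.history if c.version == version), None)
        if target is None:
            raise LookupError(f"no version {version}")
        # Restoring appends a new change; earlier history is never rewritten.
        return self.write(
            target.content, actor,
            expected_version=self.current_version,
            idempotency_key=idempotency_key,
            reason=f"restore of v{version}",
        )
```

Checking the idempotency key before the version check is deliberate in this sketch: a retried request should get its original answer back even though the asset has since moved on.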
## Definition Of Done
- Asset lifecycle tests cover create, retrieve, update, retire, delete request,
metadata changes, permission checks, audit events, and durable reload.
- New models map to the V0.2 FRS vocabulary.
- The implemented package shape follows `docs/architecture-blueprint.md` or
documents any deliberate deviation.
- `python3 -m pytest` passes.