# Asset Registry Implementation Note
Date: 2026-05-06
Status: active implementation note for `KONT-WP-0005`.
## Purpose
This note records the first governed asset registry implementation built on the
architecture core. It establishes the service/repository boundary needed before
durable ingestion, retrieval, transformation, and agent operations depend on
asset state.
## Implemented Package Shape
```text
src/kontextual_engine/
  ports/
    policy.py
    repositories.py
  services/
    asset_service.py
  adapters/
    memory/asset_registry.py
    sqlite/asset_registry.py
```
The service depends on engine-owned ports and domain core contracts. The memory
and SQLite repositories are adapters behind those ports.
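
The boundary described above can be illustrated with a minimal sketch. The `AssetRepository` protocol, its `save`/`get` methods, and the two-field `KnowledgeAsset` are hypothetical simplifications for illustration, not the actual contracts in `ports/repositories.py`.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class KnowledgeAsset:
    # Simplified stand-in for the domain type; the real model has more fields.
    asset_id: str
    title: str


class AssetRepository(Protocol):
    """Hypothetical port: the only storage surface the service sees."""

    def save(self, asset: KnowledgeAsset) -> None: ...

    def get(self, asset_id: str) -> Optional[KnowledgeAsset]: ...


class InMemoryAssetRepository:
    """Adapter that satisfies the port structurally, without inheriting from it."""

    def __init__(self) -> None:
        self._assets: Dict[str, KnowledgeAsset] = {}

    def save(self, asset: KnowledgeAsset) -> None:
        self._assets[asset.asset_id] = asset

    def get(self, asset_id: str) -> Optional[KnowledgeAsset]:
        return self._assets.get(asset_id)


class AssetService:
    """Depends on the port type only, so adapters can be swapped freely."""

    def __init__(self, repository: AssetRepository) -> None:
        self._repository = repository

    def create_asset(self, asset_id: str, title: str) -> KnowledgeAsset:
        asset = KnowledgeAsset(asset_id=asset_id, title=title)
        self._repository.save(asset)
        return asset


repository = InMemoryAssetRepository()
service = AssetService(repository)
created = service.create_asset("asset-1", "Design note")
```

Under this shape, a SQLite adapter would only need to expose the same two methods to be passed to `AssetService` unchanged.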
## Implemented Capabilities
- Stable `KnowledgeAsset` creation with explicit source references.
- Separate source, normalized, and derived `AssetRepresentation` records.
- `MetadataRecord` persistence with inferred/confirmed semantics preserved.
- Custom metadata schema primitives with structured validation issues.
- Metadata schema validation before asset create and metadata update writes.
- Durable metadata schema registry and assignment rules for policy-selected
validation.
- Asset listing filters for lifecycle, asset type, sensitivity, owner, topic,
review state, metadata record values, and confirmed-only metadata.
- Actor and `OperationContext` required for material mutations.
- Policy gateway authorization before asset mutations.
- Fail-closed policy denial through `AuthorizationError`.
- Audit events for create, metadata update, representation update, lifecycle
transition, and denied mutations.
- SQLite enforcement of actor references on audit events and ingestion jobs,
with violations reported as structured validation errors.
- Structured operation failures with code, message, operation, correlation ID,
details, and remediation hints where practical.
- Metadata batch updates with compact per-item success/failure envelopes and a
final `success`, `failed`, or `partial` batch audit event.
- Asset version records for create, content/representation changes, metadata
changes, and lifecycle changes.
- Optimistic `expected_current_version_id` conflict checks on stale-sensitive
asset mutations.
- Append-only asset restore operations that create new auditable versions.
- Asset supersession operations that create `superseded_by` relationships,
retire the source asset by default, and record a supersession version.
- Context entity persistence.
- Relationship persistence for asset-to-asset and asset-to-context-entity
links.
- Relationship changes create source-asset version records and audit events.
- Idempotency records for safe asset creation retries.
- Idempotency-key reuse with a different payload raises a validation error.
- Transformation run and derived lineage persistence for traceable derived
artifact creation.
- In-memory repository for deterministic tests.
- SQLite repository for local-first durable asset registry state.
- SQLite foreign-key enforcement for representation and metadata asset
references.
- SQLite durable reference checks for asset versions, audit actors, ingestion
job actors, metadata schema assignments, and relationship targets.
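
As an illustration of the optimistic `expected_current_version_id` check and append-only versioning listed above, the following is a minimal sketch. `AssetVersionLog`, `append_version`, and `VersionConflictError` are hypothetical names chosen for the example; the service's real types and error class may differ.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


class VersionConflictError(Exception):
    """Hypothetical error raised when the caller's view of the asset is stale."""


@dataclass
class AssetVersionLog:
    # Append-only list of version IDs; the last entry is the current version.
    version_ids: List[str] = field(default_factory=list)

    @property
    def current_version_id(self) -> Optional[str]:
        return self.version_ids[-1] if self.version_ids else None

    def append_version(self, expected_current_version_id: Optional[str]) -> str:
        # Reject the write if another writer advanced the asset first.
        if expected_current_version_id != self.current_version_id:
            raise VersionConflictError(
                f"expected {expected_current_version_id!r}, "
                f"found {self.current_version_id!r}"
            )
        new_version_id = uuid.uuid4().hex
        self.version_ids.append(new_version_id)
        return new_version_id


log = AssetVersionLog()
v1 = log.append_version(expected_current_version_id=None)
v2 = log.append_version(expected_current_version_id=v1)

try:
    # Stale write: the caller still believes v1 is current, but v2 exists.
    log.append_version(expected_current_version_id=v1)
    conflict_detected = False
except VersionConflictError:
    conflict_detected = True
```

The compare-then-append step here is only safe within a single process. It is not atomic across concurrent writers.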
## Current SQLite Tables
- `actors`
- `assets`
- `representations`
- `metadata_records`
- `metadata_schemas`
- `metadata_schema_assignments`
- `context_entities`
- `core_relationships`
- `asset_versions`
- `audit_events`
- `retrieval_feedback`
- `idempotency_records`
- `ingestion_jobs`
- `transformation_runs`
- `derived_lineage`
Payloads are stored as compact JSON envelopes while indexed columns carry
stable lookup fields such as asset ID, lifecycle, representation kind, digest,
sequence, relationship source/target, actor ID, target, correlation ID,
idempotency key, transformation status, operation ID, and derived output asset
ID.
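
The storage pattern above (a compact JSON payload beside indexed lookup columns, with foreign keys enforced) might look roughly like the sketch below. The column names, the `"active"` lifecycle value, and the digest placeholder are assumptions for illustration and are not taken from the real schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE assets (
        asset_id  TEXT PRIMARY KEY,
        lifecycle TEXT NOT NULL,
        payload   TEXT NOT NULL
    );
    CREATE INDEX idx_assets_lifecycle ON assets (lifecycle);

    CREATE TABLE representations (
        representation_id TEXT PRIMARY KEY,
        asset_id TEXT NOT NULL REFERENCES assets (asset_id),
        kind     TEXT NOT NULL,
        digest   TEXT NOT NULL,
        payload  TEXT NOT NULL
    );
    """
)


def compact(payload: dict) -> str:
    # Compact, key-sorted JSON keeps stored envelopes small and stable.
    return json.dumps(payload, separators=(",", ":"), sort_keys=True)


asset = {"asset_id": "asset-1", "lifecycle": "active", "title": "Design note"}
conn.execute(
    "INSERT INTO assets (asset_id, lifecycle, payload) VALUES (?, ?, ?)",
    (asset["asset_id"], asset["lifecycle"], compact(asset)),
)

# Filter on the indexed column, then decode the full record from the payload.
row = conn.execute(
    "SELECT payload FROM assets WHERE lifecycle = ?", ("active",)
).fetchone()
loaded = json.loads(row[0])

try:
    # A representation that points at an unknown asset should be rejected.
    conn.execute(
        "INSERT INTO representations VALUES (?, ?, ?, ?, ?)",
        ("rep-1", "missing-asset", "source", "sha256:placeholder", "{}"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```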
## Not Yet Implemented
Enterprise policy adapters and richer policy-assignment language remain
adjacent enterprise-readiness work. The registry persists policy decisions in
audit payloads and policy references in metadata schema assignments, but policy
evaluation itself remains behind the `PolicyGateway` port.

Conflict detection is implemented through service-level optimistic version
guards. Broader multi-writer locking or transaction isolation semantics remain
backend-specific future work if concurrent production writers require them.

These items are intentionally left to adjacent enterprise, concurrency, or
production-backend workplans rather than this registry foundation slice.
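
To show how policy evaluation can stay behind a port while the registry records the decision, here is a hedged sketch. The `authorize` signature, `PolicyDecision`, `DenyRetireGateway`, and `guarded_mutation` are hypothetical. Treating a gateway exception as a denial is one reading of "fail-closed" and is an assumption here, not a confirmed behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


class AuthorizationError(Exception):
    """Named in this note; this minimal definition is a stand-in."""


@dataclass(frozen=True)
class PolicyDecision:
    # Hypothetical decision shape; the real gateway contract may differ.
    allowed: bool
    reason: str


class PolicyGateway(Protocol):
    def authorize(self, actor_id: str, action: str, asset_id: str) -> PolicyDecision: ...


class DenyRetireGateway:
    """Toy policy: everything is allowed except retiring an asset."""

    def authorize(self, actor_id: str, action: str, asset_id: str) -> PolicyDecision:
        if action == "retire":
            return PolicyDecision(False, "retire requires an owner role")
        return PolicyDecision(True, "default allow")


def guarded_mutation(
    gateway: PolicyGateway,
    audit_log: List[Dict[str, str]],
    actor_id: str,
    action: str,
    asset_id: str,
    mutate: Callable[[], None],
) -> None:
    try:
        decision = gateway.authorize(actor_id, action, asset_id)
    except Exception as exc:
        # Assumed fail-closed behavior: an evaluation failure counts as a denial.
        decision = PolicyDecision(False, f"policy evaluation failed: {exc}")

    event = {
        "actor_id": actor_id,
        "action": action,
        "asset_id": asset_id,
        "decision": decision.reason,
    }
    if not decision.allowed:
        # The denial is audited before the error propagates.
        audit_log.append({**event, "outcome": "denied"})
        raise AuthorizationError(decision.reason)

    mutate()
    audit_log.append({**event, "outcome": "success"})


lifecycle = {"asset-1": "active"}
audit_log: List[Dict[str, str]] = []


def retire() -> None:
    lifecycle["asset-1"] = "retired"


try:
    guarded_mutation(DenyRetireGateway(), audit_log, "actor-7", "retire", "asset-1", retire)
    denied = False
except AuthorizationError:
    denied = True
```

In this sketch the mutation callback runs only after an explicit allow, so a denied request leaves asset state untouched while still producing an audit record.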
## Test Coverage
`tests/test_asset_registry.py` covers:
- asset creation with source reference, representation, metadata, version, and
audit output,
- lifecycle denial with fail-closed policy and denied audit event,
- SQLite reload preserving asset lifecycle, representation, metadata, versions,
and audit history,
- SQLite referential integrity for representation asset references,
- SQLite durable reference integrity for versions, audit actors, and ingestion
job actors,
- idempotent asset creation and conflicting idempotency-key reuse,
- relationship creation with source-asset versioning and audit,
- SQLite reload preserving context entities, relationships, and idempotency
records,
- custom metadata schema validation before registry writes,
- persistent metadata schema registry and assignment reload behavior,
- classification and metadata-record asset filtering across memory and SQLite
repositories,
- optimistic version conflict checks on asset mutations,
- restore and supersession as append-only versioned operations,
- metadata batch partial-failure envelopes with structured item diagnostics and
partial audit events,
- SQLite reload of metadata batch partial audit state.
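
The idempotency case in the list above could be exercised along the lines of the sketch below. `IdempotentCreator`, `payload_digest`, and this `ValidationError` class are hypothetical illustrations. The actual service API and the contents of `tests/test_asset_registry.py` are not reproduced here.

```python
import hashlib
import json
from typing import Dict, Tuple


class ValidationError(Exception):
    """Stand-in for the validation error raised on conflicting key reuse."""


def payload_digest(payload: dict) -> str:
    # Canonical JSON so that key order does not change the digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotentCreator:
    def __init__(self) -> None:
        # idempotency key -> (payload digest, asset ID created for it)
        self._records: Dict[str, Tuple[str, str]] = {}
        self._next_id = 1

    def create_asset(self, idempotency_key: str, payload: dict) -> str:
        digest = payload_digest(payload)
        record = self._records.get(idempotency_key)
        if record is not None:
            stored_digest, asset_id = record
            if stored_digest != digest:
                raise ValidationError(
                    "idempotency key reused with a different payload"
                )
            # Safe retry: return the original result without creating anything.
            return asset_id

        asset_id = f"asset-{self._next_id}"
        self._next_id += 1
        self._records[idempotency_key] = (digest, asset_id)
        return asset_id


creator = IdempotentCreator()
first = creator.create_asset("key-1", {"title": "Design note"})
retry = creator.create_asset("key-1", {"title": "Design note"})

try:
    creator.create_asset("key-1", {"title": "Different note"})
    conflict_raised = False
except ValidationError:
    conflict_raised = True
```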