content-addressed blob storage: blob_storage.py, memory, local, and S3 adapters

2026-05-07 03:51:25 +02:00
parent c2bc7071d7
commit ebace73761
22 changed files with 1489 additions and 47 deletions

@@ -190,7 +190,7 @@ Required MVP ports:
- Repository port for assets, representations, metadata, relationships,
versions, runs, audit events, and exports.
-- Object/content store port for source, normalized, and derived content payloads.
+- Blob/content store port for source, normalized, and derived content payloads.
- Search index port for lexical search and later semantic/hybrid retrieval.
- Extractor port for format-specific normalization.
- Connector port for source systems.
@@ -211,6 +211,10 @@ Adapter rules:
Markitect where useful, but they are not the canonical engine identity or
storage model. The canonical layer remains asset, representation, metadata,
lifecycle, policy, lineage, and audit state.
+- Blob storage is infrastructure behind `AssetRepresentation.storage_ref`.
+Whole-object content addressing, digest verification, and chunked byte
+streaming belong behind the blob port. Local filesystem and S3 are adapters,
+not different domain models.
- `llm-connect` or equivalent is an adapter for LLM providers.
- `phase-memory` is an adjacent memory runtime; this engine may exchange opaque
memory references or context packages but should not implement memory phases.
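
The blob rule in the hunk above (whole-object content addressing, digest verification, and chunked streaming behind one port) could be sketched roughly as follows. This is an illustration only, not the contents of `blob_storage.py`: the `MemoryBlobStore` class, its `put` and `open` methods, the `sha256:` reference format, the choice of SHA-256, and the chunk size are all assumptions.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Optional

CHUNK_SIZE = 64 * 1024  # assumed streaming chunk size, not taken from the commit


class DigestMismatch(ValueError):
    """Raised when the streamed bytes do not hash to the expected digest."""


class MemoryBlobStore:
    """Hypothetical in-memory adapter keyed by whole-object SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, chunks: Iterable[bytes], expected_digest: Optional[str] = None) -> str:
        """Consume byte chunks, verify the digest if given, return a storage ref."""
        hasher = hashlib.sha256()
        parts = []
        for chunk in chunks:
            hasher.update(chunk)  # hash incrementally, chunk by chunk
            parts.append(chunk)
        digest = hasher.hexdigest()
        if expected_digest is not None and digest != expected_digest:
            raise DigestMismatch(f"expected {expected_digest}, got {digest}")
        # Identical content maps to the same key, so a repeat write is a no-op.
        self._blobs.setdefault(digest, b"".join(parts))
        return f"sha256:{digest}"

    def open(self, storage_ref: str) -> Iterator[bytes]:
        """Yield the stored bytes back in fixed-size chunks."""
        algorithm, _, digest = storage_ref.partition(":")
        if algorithm != "sha256":
            raise ValueError(f"unsupported storage ref: {storage_ref}")
        data = self._blobs[digest]
        for start in range(0, len(data), CHUNK_SIZE):
            yield data[start:start + CHUNK_SIZE]
```

Under these assumptions, a local filesystem or S3 adapter would expose the same two methods and differ only in where the bytes land, which is the point of keeping them as adapters rather than separate domain models.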
@@ -251,6 +255,9 @@ Recommended storage style:
adapter-specific payloads.
- Separate content/object references for large source, normalized, or derived
payloads.
+- Store blob bytes outside repository rows when content is non-trivial. Keep
+representation digest, size, media type, kind, producer, and storage ref in
+the repository, and let blob adapters handle byte persistence and dedupe.
- Append-only audit events and change records.
- Deterministic ordering fields for pagination and tests.
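
The split described in this hunk (metadata in the repository row, bytes in the blob adapter) might look something like the sketch below. The `RepresentationRow` dataclass, the `persist_representation` helper, the plain dict standing in for a blob adapter, and the producer names are hypothetical; only the field list comes from the text.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RepresentationRow:
    """Hypothetical repository row: metadata only, no payload bytes."""
    digest: str
    size: int
    media_type: str
    kind: str          # e.g. "source", "normalized", or "derived"
    producer: str
    storage_ref: str


def persist_representation(
    blobs: Dict[str, bytes],
    payload: bytes,
    *,
    media_type: str,
    kind: str,
    producer: str,
) -> RepresentationRow:
    """Write bytes to the blob side and return the row for the repository side."""
    digest = hashlib.sha256(payload).hexdigest()  # SHA-256 is an assumption
    storage_ref = f"sha256:{digest}"
    # Dedupe: identical payloads share one stored blob.
    blobs.setdefault(storage_ref, payload)
    return RepresentationRow(
        digest=digest,
        size=len(payload),
        media_type=media_type,
        kind=kind,
        producer=producer,
        storage_ref=storage_ref,
    )


# Two producers emit the same bytes: two rows, one stored blob.
blobs: Dict[str, bytes] = {}
first = persist_representation(
    blobs, b"# Title\n", media_type="text/markdown",
    kind="normalized", producer="extractor-a",
)
second = persist_representation(
    blobs, b"# Title\n", media_type="text/markdown",
    kind="normalized", producer="extractor-b",
)
```

Here the rows stay small and queryable, while the dedupe decision lives entirely on the blob side and never touches the repository schema.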