docs+plans: reconcile blueprint with ambition, add ADRs, sequence workplans
Aligns the v1 architecture with the longer-horizon platform thesis so we can start implementation without the schema-level inconsistencies the prior review surfaced.

ADRs (docs/adr/0001..0006): content-addressed dual-digest storage, append-only event log as source of truth, canonical CBOR manifests, control/data-plane contract, v1 tech stack (Python 3.12 / uv / FastAPI / SQLAlchemy Core + asyncpg / Alembic / cbor2 / blake3 / ruff / mypy / pytest / typer), OCI compatibility kept reachable.

Architecture blueprint rewritten to v2: library-first (ffmpeg-shaped) module layout, materialised-view data model over the event log, upload-session and event-stream endpoints pinned, retrieval tiering promoted into the schema.

Roadmap added (docs/ROADMAP.md) with three phases.

WP-0001 rewritten as the Foundation plan (scaffold + kernels + local FS + minimal app). WP-0002..0005 created carrying the existing state_hub_task_ids forward semantically: ingestion API (T004), retention lifecycle (T005), S3-compatible backend (T006), guide-board pilot (T007). T001/T002/T003/T008 remain in WP-0001 with refined acceptance.

README and AGENTS.md refreshed to reflect the new repo shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
79
docs/adr/0004-control-plane-data-plane-contract.md
Normal file
@@ -0,0 +1,79 @@
# ADR-0004: Control Plane / Data Plane Contract

Status: accepted
Date: 2026-05-15
Related: ADR-0005, `docs/PLATFORM-AMBITION.md` commitment A5,
`docs/ASSEMBLY-EXPERIMENT.md`

## Context

The platform ambition expects a Rust (eventually asm-tuned) data plane
to handle hot ingest paths: hashing, chunking, optional compression and
encryption, storage backend I/O. The v1 service is written entirely in
Python (ADR-0005). The cost of conflating control and data planes at the
code level is that extracting the data plane later requires API churn,
test rework, and producer migrations.

The cost of separating them now is one named module boundary and one
in-process protocol shape. That cost is essentially free if taken
before any consumer exists.

## Decision

1. The Python package is organised so that *every byte-handling
   operation* lives behind a named contract:
   - `artifactstore.dataplane.spi`: the abstract surface (typed
     dataclasses, async iterator protocols).
   - `artifactstore.dataplane.inproc`: the v1 implementation, running
     in the same process as the control plane.
2. The control plane (`artifactstore.registry`, `artifactstore.api.http`,
   `artifactstore.retention`, `artifactstore.audit`) interacts with
   bytes *only* through the SPI. No HTTP handler, no DB writer, no
   retention rule ever reads or writes file bytes directly.
3. The SPI exposes exactly these operations:
   - `ingest_stream(stream, hints) -> IngestResult`: consumes an
     upload, returns content addresses, sizes, and storage receipts.
   - `serve_object(content_address, range?) -> AsyncIterator[bytes]`:
     produces bytes for a download.
   - `verify_object(content_address) -> VerifyResult`: re-reads bytes,
     re-digests, returns mismatches.
   - `delete_object(content_address) -> DeletionResult`: best-effort,
     idempotent.
   - `backend_health() -> BackendStatus`: readiness, latency, free
     capacity.
4. The SPI surface is the contract a future Rust daemon must satisfy.
   When that daemon ships, `artifactstore.dataplane.inproc` is replaced
   by `artifactstore.dataplane.remote` (a thin gRPC or
   framed-bincode-over-Unix-socket client). The control plane sees no
   change.
5. SPI parameter and return types are CBOR-serialisable today, even when
   nothing serialises them. This lets us toggle to RPC without rewriting
   types.

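The five operations above could be sketched as a `typing.Protocol` plus plain dataclasses. This is an illustrative shape only, not the contents of `dataplane/spi.py`: the ADR fixes the operation names and return type names, while every field name here (`content_addresses`, `storage_receipts`, `mismatches`, `latency_ms`, `free_bytes`), the `ByteRange` type, the `DataPlane` class name, and the `byte_range` parameter (standing in for `range?`, to avoid shadowing the builtin) are assumptions. Fields are limited to ints, floats, bools, strings, bytes, lists, and dicts, which map directly onto CBOR data items.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import AsyncIterator, Optional, Protocol, runtime_checkable


# All field names below are hypothetical; the ADR names only the operations
# and their result types.
@dataclass(frozen=True)
class ByteRange:
    start: int  # inclusive byte offset
    end: int    # exclusive byte offset


@dataclass(frozen=True)
class IngestResult:
    content_addresses: dict[str, bytes]  # digest name -> raw digest bytes
    size: int
    storage_receipts: list[str]


@dataclass(frozen=True)
class VerifyResult:
    ok: bool
    mismatches: list[str]


@dataclass(frozen=True)
class DeletionResult:
    deleted: bool


@dataclass(frozen=True)
class BackendStatus:
    ready: bool
    latency_ms: float
    free_bytes: int


@runtime_checkable
class DataPlane(Protocol):
    """Structural contract: any object with these five methods qualifies."""

    async def ingest_stream(
        self, stream: AsyncIterator[bytes], hints: dict[str, str]
    ) -> IngestResult: ...

    # Not declared async: an async generator is called without await.
    def serve_object(
        self, content_address: str, byte_range: Optional[ByteRange] = None
    ) -> AsyncIterator[bytes]: ...

    async def verify_object(self, content_address: str) -> VerifyResult: ...

    async def delete_object(self, content_address: str) -> DeletionResult: ...

    async def backend_health(self) -> BackendStatus: ...
```

Under these assumptions, `asdict(result)` yields a plain dict that a CBOR encoder could take as-is, and an in-process or remote implementation satisfies `DataPlane` simply by providing the five methods, with no shared base class.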
## Consequences

Positive:

- The data plane can be rewritten in Rust later with zero API churn.
- Tests can fake the SPI cheaply; integration tests pin the contract.
- The CLI in `artifactstore.cli` is a second consumer of the SPI on
  equal footing with the HTTP server.
- Operators with strong embedding requirements can use the in-process
  data plane forever; nothing forces the RPC hop.

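To make "tests can fake the SPI cheaply" concrete, here is a minimal in-memory fake, offered as a sketch rather than the project's test fixture. `FakeDataPlane` and the dict-shaped results are hypothetical, and SHA-256 from the standard library stands in for the real digest scheme so the example has no third-party dependencies. `backend_health` is omitted for brevity.

```python
import asyncio
import hashlib


class FakeDataPlane:
    """Hypothetical in-memory stand-in for the data plane, for unit tests."""

    def __init__(self):
        self._objects = {}  # content address -> stored bytes

    async def ingest_stream(self, stream, hints=None):
        hasher = hashlib.sha256()
        chunks = []
        async for chunk in stream:
            hasher.update(chunk)
            chunks.append(chunk)
        data = b"".join(chunks)
        address = "sha256:" + hasher.hexdigest()
        self._objects[address] = data
        return {"content_address": address, "size": len(data)}

    async def serve_object(self, content_address, byte_range=None):
        data = self._objects[content_address]
        start, end = byte_range if byte_range else (0, len(data))
        yield data[start:end]

    async def verify_object(self, content_address):
        # Re-digest the stored bytes and compare with the address.
        actual = "sha256:" + hashlib.sha256(self._objects[content_address]).hexdigest()
        return {"ok": actual == content_address, "actual": actual}

    async def delete_object(self, content_address):
        # Idempotent: deleting a missing object is not an error.
        existed = self._objects.pop(content_address, None) is not None
        return {"deleted": existed}


async def _upload():
    yield b"hello "
    yield b"world"


async def demo():
    plane = FakeDataPlane()
    result = await plane.ingest_stream(_upload())
    address = result["content_address"]
    body = b"".join([c async for c in plane.serve_object(address)])
    check = await plane.verify_object(address)
    first = await plane.delete_object(address)
    second = await plane.delete_object(address)
    return result, body, check, first, second
```

A control-plane test would receive such a fake wherever it expects the data plane, so no filesystem or storage backend is touched.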
Negative:

- One extra abstraction layer in v1. Mitigated by the contract being
  narrow (five operations).
- Discipline required: PRs that bypass the SPI are rejected. A linter
  rule (forbidden import: `artifactstore.api.* -> filesystem`) makes
  this mechanical.

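One way such a forbidden-import rule could be checked mechanically is a small AST scan. This is a sketch, not the linter the project uses: the ADR does not name a tool, and the set of modules treated as "filesystem" here (`os`, `pathlib`, `shutil`) and the function name `forbidden_imports` are assumptions.

```python
from __future__ import annotations

import ast

# Assumption: these stdlib modules stand in for "filesystem" in the rule.
FORBIDDEN_MODULES = {"os", "pathlib", "shutil"}
GUARDED_PREFIX = "artifactstore.api"


def forbidden_imports(module_name: str, source: str) -> list[str]:
    """Return one message per forbidden absolute import in a guarded module."""
    guarded = module_name == GUARDED_PREFIX or module_name.startswith(
        GUARDED_PREFIX + "."
    )
    if not guarded:
        return []
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            # Compare the top-level package, so "os.path" matches "os".
            if name.split(".")[0] in FORBIDDEN_MODULES:
                found.append(f"{module_name}:{node.lineno}: forbidden import {name}")
    return found
```

A scan like this only sees import statements; it would not catch a bare call to the builtin `open()`, so review is still needed for that case.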
## Implementation notes

- The SPI is a `Protocol` (`typing.Protocol`) in `dataplane/spi.py` so the
  in-process and future remote impls don't share an inheritance tree.
- Streaming returns `AsyncIterator[bytes]` so neither full-file buffering
  nor `sendfile()` zero-copy is foreclosed.
- The `IngestResult` payload is the canonical CBOR-able value used in
  events (ADR-0002). The same byte sequence flows API → SPI → event.
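
The streaming note can be illustrated with an async generator that yields a byte range in fixed-size chunks. It is a sketch under stated assumptions: the `source` argument, the `(start, end)` tuple with an exclusive end, and the tiny `CHUNK_SIZE` are all hypothetical, and the blocking `read()` would be replaced by real asynchronous I/O in a production data plane.

```python
import asyncio
import io
from typing import AsyncIterator, BinaryIO, Optional, Tuple

CHUNK_SIZE = 4  # deliberately tiny so the chunking is visible


async def serve_object(
    source: BinaryIO,
    byte_range: Optional[Tuple[int, int]] = None,
    chunk_size: int = CHUNK_SIZE,
) -> AsyncIterator[bytes]:
    """Yield the requested range of `source` chunk by chunk."""
    start, end = byte_range if byte_range is not None else (0, None)
    source.seek(start)
    remaining = None if end is None else end - start
    while remaining is None or remaining > 0:
        want = chunk_size if remaining is None else min(chunk_size, remaining)
        chunk = source.read(want)
        if not chunk:
            break  # end of object
        if remaining is not None:
            remaining -= len(chunk)
        yield chunk
        await asyncio.sleep(0)  # hand control back to the event loop


async def collect(data: bytes, byte_range=None) -> list:
    # A consumer may buffer everything (as here) or forward chunks as they arrive.
    return [chunk async for chunk in serve_object(io.BytesIO(data), byte_range)]
```

Because the caller only sees an async iterator, the producer remains free to buffer whole objects, stream chunks, or hand off to a zero-copy path.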