artifact-store/docs/adr/0002-event-log-source-of-truth.md
tegwick 747afc27a6 docs+plans: reconcile blueprint with ambition, add ADRs, sequence workplans
Aligns the v1 architecture with the longer-horizon platform thesis so we can
start implementation without the schema-level inconsistencies the prior
review surfaced.

ADRs (docs/adr/0001..0006): content-addressed dual-digest storage, append-only
event log as source of truth, canonical CBOR manifests, control/data-plane
contract, v1 tech stack (Python 3.12 / uv / FastAPI / SQLAlchemy Core +
asyncpg / Alembic / cbor2 / blake3 / ruff / mypy / pytest / typer), OCI
compatibility kept reachable.

Architecture blueprint rewritten to v2: library-first (ffmpeg-shaped) module
layout, materialised-view data model over the event log, upload-session and
event-stream endpoints pinned, retrieval tiering promoted into the schema.

Roadmap added (docs/ROADMAP.md) with three phases. WP-0001 rewritten as the
Foundation plan (scaffold + kernels + local FS + minimal app). WP-0002..0005
created carrying the existing state_hub_task_ids forward semantically:
ingestion API (T004), retention lifecycle (T005), S3-compatible backend
(T006), guide-board pilot (T007). T001/T002/T003/T008 remain in WP-0001
with refined acceptance.

README and AGENTS.md refreshed to reflect the new repo shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 21:16:17 +02:00


# ADR-0002 — Append-Only Event Log as Source of Truth
Status: accepted
Date: 2026-05-15
Related: `docs/PLATFORM-AMBITION.md` commitment A3
## Context
The original blueprint defines `audit_events` and `retention_events` as
separate tables. Both are useful, but neither is a complete authoritative
record of how registry state was produced. Several downstream needs share
one underlying primitive:
- audit (who did what when, with what result),
- change-data-capture feed for downstream consumers (Statehub, search),
- replication and federation between instances,
- point-in-time replay and disaster recovery,
- materialised view rebuilds when schemas evolve.
Each can be served by an append-only log of registry events with a
monotonic sequence number. Two separate tables cannot, because neither
provides a single total order over all state changes.
## Decision
1. The registry persists an append-only `events` table. Every
state-changing operation writes one row in the same database transaction
as the operation. Once written, rows are immutable.
2. Each row has a strictly monotonic, gapless sequence number scoped to
the registry instance, and a UTC ingest timestamp.
3. The current `artifact_packages`, `artifact_files`, `storage_locations`,
and `retention_state` tables are materialised views over `events`.
They are rebuildable by replay.
4. Event payloads are stored as canonical CBOR (ADR-0003), keyed by
`event_type` (string slug). The `event_type` namespace is versioned
(`v1.package.created`, `v1.file.ingested`, `v1.retention.extended`,
etc.).
5. `audit_events` and `retention_events` cease to exist as standalone
tables; their semantics are subsets of `events` filtered by
`event_type`.
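Decisions 1 and 2 can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL, raw UTF-8 bytes as a stand-in for canonical CBOR, and hypothetical names (`open_registry`, `create_package`, the trimmed-down table columns). One point worth noting: a plain PostgreSQL `BIGSERIAL` can leave gaps when a transaction rolls back, so this sketch allocates the sequence number inside the write transaction instead. That is one possible approach under a single-writer assumption, not necessarily what the registry does.

```python
import sqlite3


def open_registry() -> sqlite3.Connection:
    # isolation_level=None: autocommit mode, so BEGIN/COMMIT below are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(
        """
        CREATE TABLE events (
            sequence   INTEGER PRIMARY KEY,
            event_type TEXT NOT NULL,
            subject_id TEXT,
            payload    BLOB NOT NULL
        );
        -- Stand-in for one materialised view table.
        CREATE TABLE artifact_packages (
            package_id TEXT PRIMARY KEY,
            name       TEXT NOT NULL
        );
        -- Immutability: reject any UPDATE or DELETE on events.
        CREATE TRIGGER events_no_update BEFORE UPDATE ON events
        BEGIN SELECT RAISE(ABORT, 'events rows are immutable'); END;
        CREATE TRIGGER events_no_delete BEFORE DELETE ON events
        BEGIN SELECT RAISE(ABORT, 'events rows are immutable'); END;
        """
    )
    return conn


def create_package(conn: sqlite3.Connection, package_id: str, name: str,
                   fail: bool = False) -> int:
    """Write the event and the view row in one transaction; return the sequence."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock: one writer at a time
    try:
        # Allocate the next number inside the transaction so a rollback
        # releases it again and the committed sequence stays gapless.
        (next_seq,) = conn.execute(
            "SELECT COALESCE(MAX(sequence), 0) + 1 FROM events"
        ).fetchone()
        conn.execute(
            "INSERT INTO events (sequence, event_type, subject_id, payload) "
            "VALUES (?, ?, ?, ?)",
            (next_seq, "v1.package.created", package_id, name.encode("utf-8")),
        )
        if fail:  # hypothetical switch to simulate a mid-operation failure
            raise RuntimeError("simulated failure")
        conn.execute(
            "INSERT INTO artifact_packages (package_id, name) VALUES (?, ?)",
            (package_id, name),
        )
        conn.execute("COMMIT")
        return next_seq
    except Exception:
        conn.execute("ROLLBACK")  # neither the event nor the view row survives
        raise
```

A failed operation leaves no trace in either table, and the next successful write receives the next consecutive number. With concurrent writers on PostgreSQL, an equivalent scheme would need explicit serialisation of the allocation step.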
## Consequences
Positive:
- One primitive serves audit, CDC, replication, replay, and rebuild.
- A consumer can tail by `sequence > N` and never miss an event.
- Forward-compatibility: new view columns can be derived from existing
events by adding a replay path; the event log itself needs no migration.
- Signed event chains are reachable later by adding a signature column.
Negative:
- Replays cost wall-clock time on large datasets. Snapshots of
materialised views (with the highest applied sequence stamped on them)
are used to bound replay cost.
- Schema migrations on materialised views still happen; they just no
longer touch the source of truth.
- Discipline required: any write that bypasses the event log is a bug.
Enforced by code review and a runtime invariant check on the
materialised tables.
## Implementation notes
- `events` schema (v1):
  - `sequence BIGSERIAL PRIMARY KEY`
  - `created_at TIMESTAMPTZ NOT NULL DEFAULT now()`
  - `event_type TEXT NOT NULL`
  - `subject_kind TEXT NOT NULL`: `package` | `file` | `retention` | `storage` | `system`
  - `subject_id UUID`: nullable for system-level events
  - `actor TEXT NOT NULL`: producer or operator identity
  - `payload BYTEA NOT NULL`: canonical CBOR
  - `payload_digest BYTEA NOT NULL`: BLAKE3 of `payload`
- Indexes: `(subject_kind, subject_id)`, `(event_type, sequence)`.
- Replay tool ships in v1 as a CLI subcommand (`artifactstore replay`).
- Outbound CDC stream (NATS / Kafka) is its own workplan; v1 only exposes
long-poll over `GET /events?since=<sequence>`.
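A consumer of the long-poll endpoint could keep its position as sketched below. The `fetch` callable stands in for `GET /events?since=<sequence>` and is hypothetical, as are `tail`, `make_fake_server`, and the row shape. The sketch also assumes `since` is exclusive (the server returns rows with `sequence > since`), which matches the `sequence > N` tailing described under Consequences but is not stated for the endpoint itself.

```python
from typing import Callable, Iterator


def tail(fetch: Callable[[int], list], since: int, max_polls: int) -> Iterator[dict]:
    """Yield events in order, advancing a cursor after each one."""
    cursor = since
    for _ in range(max_polls):
        batch = fetch(cursor)  # stand-in for GET /events?since=<cursor>
        for row in batch:
            if row["sequence"] <= cursor:
                continue  # tolerate duplicate delivery
            if row["sequence"] != cursor + 1:
                # The log is specified as gapless, so a jump means a missed event.
                raise RuntimeError(
                    f"gap: expected {cursor + 1}, got {row['sequence']}"
                )
            cursor = row["sequence"]
            yield row


def make_fake_server(log: list, page_size: int = 2) -> Callable[[int], list]:
    """In-memory substitute for the endpoint, returning at most page_size rows."""
    def fetch(since: int) -> list:
        return [row for row in log if row["sequence"] > since][:page_size]
    return fetch
```

Usage, under the same assumptions:

```python
log = [{"sequence": n, "event_type": "v1.file.ingested"} for n in range(1, 6)]
for row in tail(make_fake_server(log), since=2, max_polls=5):
    print(row["sequence"])
```

A real consumer would persist the cursor durably after processing each event so that a restart resumes from the last applied sequence.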