# ADR-0002: Append-Only Event Log as Source of Truth

- Status: accepted
- Date: 2026-05-15
- Related: `docs/PLATFORM-AMBITION.md` commitment A3

## Context

The original blueprint defines `audit_events` and `retention_events` as separate tables. Both are useful, but neither is a complete authoritative record of how registry state was produced.

Several downstream needs share one underlying primitive:

- audit (who did what when, with what result),
- change-data-capture feed for downstream consumers (Statehub, search),
- replication and federation between instances,
- point-in-time replay and disaster recovery,
- materialised view rebuilds when schemas evolve.

Each can be served by an append-only log of registry events with a monotonic sequence number. Two separate tables cannot.

## Decision

1. The registry persists an append-only `events` table. Every state-changing operation writes one row in the same database transaction as the operation. Once written, rows are immutable.
2. Each row has a strictly monotonic, gapless sequence number scoped to the registry instance, and a UTC ingest timestamp.
3. The current `artifact_packages`, `artifact_files`, `storage_locations`, and `retention_state` tables are materialised views over `events`. They are rebuildable by replay.
4. Event payloads are stored as canonical CBOR (ADR-0003), keyed by `event_type` (string slug). The `event_type` namespace is versioned (`v1.package.created`, `v1.file.ingested`, `v1.retention.extended`, etc.).
5. `audit_events` and `retention_events` cease to exist as standalone tables; their semantics are subsets of `events` filtered by `event_type`.

## Consequences

Positive:

- One primitive serves audit, CDC, replication, replay, and rebuild.
- A consumer can tail by `sequence > N` and never miss an event.
- Forward-compatibility: new view columns can be derived from existing events by adding a replay path; no migration required.
- Signed event chains are reachable later by adding a signature column.
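The replay and tailing behaviour described in the Decision and in the positive consequences can be sketched in a few lines. This is an illustrative in-memory model only, not the registry's implementation: `Event`, `PackageView`, `tail`, and `rebuild` are hypothetical names, the view (package id to file count) is invented for the example, and the `package_id` payload field on `v1.file.ingested` is an assumption about the payload shape.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    sequence: int      # strictly increasing and gapless per instance
    event_type: str    # versioned slug, e.g. "v1.package.created"
    subject_id: str
    payload: dict      # stand-in for the decoded CBOR payload


@dataclass
class PackageView:
    """Hypothetical materialised view: package id -> number of ingested files."""
    applied_sequence: int = 0
    packages: dict = field(default_factory=dict)

    def apply(self, event: Event) -> None:
        # Refuse out-of-order input so a gap is never silently skipped.
        expected = self.applied_sequence + 1
        if event.sequence != expected:
            raise ValueError(f"expected sequence {expected}, got {event.sequence}")
        if event.event_type == "v1.package.created":
            self.packages[event.subject_id] = 0
        elif event.event_type == "v1.file.ingested":
            # Assumed payload field; the real payload schema is not shown here.
            self.packages[event.payload["package_id"]] += 1
        # Event types this view does not care about still advance the cursor.
        self.applied_sequence = event.sequence


def tail(log, since):
    """Return events with sequence > since, in log order."""
    return [e for e in log if e.sequence > since]


def rebuild(log):
    """Rebuild the view from scratch by replaying the whole log."""
    view = PackageView()
    for event in tail(log, since=0):
        view.apply(event)
    return view
```

A view that stopped at sequence `N` catches up by applying `tail(log, N)`, and under these assumptions ends in the same state as a full `rebuild(log)`.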
Negative:

- Replays cost wall-clock time on large datasets. Snapshots of materialised views (with the highest applied sequence stamped on them) are used to bound replay cost.
- Schema migrations on materialised views still happen; they just no longer touch the source of truth.
- Discipline required: any write that bypasses the event log is a bug. Enforced by code review and a runtime invariant check on the materialised tables.

## Implementation notes

- `events` schema (v1):
  - `sequence BIGSERIAL PRIMARY KEY`. Assuming PostgreSQL, a sequence on its own is not gapless (a rolled-back insert still consumes a value), so the gapless guarantee in Decision 2 needs enforcement beyond the column type.
  - `created_at TIMESTAMPTZ NOT NULL DEFAULT now()`
  - `event_type TEXT NOT NULL`
  - `subject_kind TEXT NOT NULL`: one of `package` | `file` | `retention` | `storage` | `system`
  - `subject_id UUID`: nullable for system-level events
  - `actor TEXT NOT NULL`: producer or operator identity
  - `payload BYTEA NOT NULL`: canonical CBOR
  - `payload_digest BYTEA NOT NULL`: BLAKE3 of `payload`
- Indexes: `(subject_kind, subject_id)`, `(event_type, sequence)`.
- Replay tool ships in v1 as a CLI subcommand (`artifactstore replay`).
- Outbound CDC stream (NATS / Kafka) is its own workplan; v1 only exposes long-poll over `GET /events?since=`.
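To make the same-transaction rule and the `since` cursor concrete, here is a minimal sketch using Python's built-in `sqlite3` rather than the production database. Several things are stand-ins and should be read as assumptions: SQLite column types replace the PostgreSQL ones, sorted-key JSON replaces canonical CBOR, `hashlib.blake2b` replaces BLAKE3 (which is not in the Python standard library), and the `artifact_packages` columns and all function names are hypothetical.

```python
import hashlib
import json
import sqlite3

SCHEMA = """
CREATE TABLE events (
    sequence       INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    event_type     TEXT NOT NULL,
    subject_kind   TEXT NOT NULL,
    subject_id     TEXT,
    actor          TEXT NOT NULL,
    payload        BLOB NOT NULL,
    payload_digest BLOB NOT NULL
);
CREATE TABLE artifact_packages (  -- hypothetical view columns
    id   TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
"""


def digest(payload: bytes) -> bytes:
    # Stand-in for BLAKE3: BLAKE2b with a 32-byte digest.
    return hashlib.blake2b(payload, digest_size=32).digest()


def open_registry() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def create_package(conn, package_id, name, actor):
    # Stand-in for canonical CBOR: JSON with sorted keys.
    payload = json.dumps({"name": name}, sort_keys=True).encode()
    # One transaction: commits both rows, or rolls both back on any error.
    with conn:
        conn.execute(
            "INSERT INTO events (event_type, subject_kind, subject_id, actor,"
            " payload, payload_digest) VALUES (?, ?, ?, ?, ?, ?)",
            ("v1.package.created", "package", package_id, actor,
             payload, digest(payload)),
        )
        conn.execute(
            "INSERT INTO artifact_packages (id, name) VALUES (?, ?)",
            (package_id, name),
        )


def events_since(conn, since, limit=100):
    """Rows a `GET /events?since=` handler might return, oldest first."""
    return conn.execute(
        "SELECT sequence, event_type, subject_id FROM events"
        " WHERE sequence > ? ORDER BY sequence LIMIT ?",
        (since, limit),
    ).fetchall()


def verify_digests(conn) -> bool:
    """Check that every stored payload still matches its digest."""
    rows = conn.execute("SELECT payload, payload_digest FROM events")
    return all(digest(p) == d for p, d in rows)
```

If the view insert fails (for example on a duplicate package id), the event row is rolled back with it, so the log never records an operation that did not happen.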