Aligns the v1 architecture with the longer-horizon platform thesis so we can start implementation without the schema-level inconsistencies the prior review surfaced.

ADRs (docs/adr/0001..0006): content-addressed dual-digest storage, append-only event log as source of truth, canonical CBOR manifests, control/data-plane contract, v1 tech stack (Python 3.12 / uv / FastAPI / SQLAlchemy Core + asyncpg / Alembic / cbor2 / blake3 / ruff / mypy / pytest / typer), OCI compatibility kept reachable.

Architecture blueprint rewritten to v2: library-first (ffmpeg-shaped) module layout, materialised-view data model over the event log, upload-session and event-stream endpoints pinned, retrieval tiering promoted into the schema.

Roadmap added (docs/ROADMAP.md) with three phases.

WP-0001 rewritten as the Foundation plan (scaffold + kernels + local FS + minimal app). WP-0002..0005 created carrying the existing state_hub_task_ids forward semantically: ingestion API (T004), retention lifecycle (T005), S3-compatible backend (T006), guide-board pilot (T007). T001/T002/T003/T008 remain in WP-0001 with refined acceptance.

README and AGENTS.md refreshed to reflect the new repo shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ADR-0005 — V1 Technology Stack
Status: accepted
Date: 2026-05-15
Related: ADR-0001, ADR-0002, ADR-0003, ADR-0004
Context
WP-0001 ("Foundation") cannot start without a pinned stack. The decision needs to balance:
- ffmpeg / VLC philosophy: minimal dependency budget, sharp boundaries, native code at the hot edges, plain tools.
- Python is already implied by `.gitignore` and ecosystem fit (StateHub, guide-board, open-cmis-tck are all Python-leaning).
- The data plane will eventually be Rust (ADR-0004); the control plane stays in Python and must stay approachable.
Decision
| Concern | Choice | Rationale |
|---|---|---|
| Language (control plane) | Python 3.12+ | Async ecosystem, type hints, matches sibling repos. 3.12 specifically: PEP 695 generics, faster CPython, sys.monitoring. |
| Package / project manager | uv | Single static binary, fast resolver, lockfile-first, replaces pip + pip-tools + venv + pipx in one tool. |
| Build backend | hatchling (via pyproject.toml) | Standards-track PEP 517 backend. No magic. |
| HTTP framework | FastAPI (Starlette + Pydantic v2) | OpenAPI generation, async-native, broad community. |
| ASGI server | uvicorn (dev), gunicorn + uvicorn workers (prod) | Plain, well-understood. |
| Database (prod) | PostgreSQL 16+ | Source-of-truth event log (ADR-0002) wants BIGSERIAL, BYTEA, advisory locks, logical replication. |
| Database (dev/embedded) | SQLite (WAL mode) | Zero-dependency local. Schema is portable when we use SQLAlchemy Core. |
| DB access | SQLAlchemy 2.0 Core + asyncpg (prod) / aiosqlite (dev) | Core, not ORM — explicit SQL, async drivers. Migrations live below the API surface. |
| Migrations | Alembic | Standard, integrates with SQLAlchemy Core, supports both pg and sqlite. |
| Hashing | stdlib hashlib for SHA-256, blake3 PyPI wheel for BLAKE3 | blake3 wheel embeds the SIMD-tuned Rust impl with no build-time toolchain. |
| Serialisation | cbor2 for canonical CBOR (ADR-0003); stdlib json or the jcs PyPI package for JCS | Smallest deps that satisfy ADR-0003. |
| CLI | typer (atop click) | Type-hint-driven in the same style as FastAPI; the CLI surface is derived from function signatures. |
| Tests | pytest + httpx + trio-asyncio-free pytest-asyncio | Standard. |
| Lint / format | ruff (lint + format) | One tool replaces black + isort + flake8 + pyupgrade. |
| Type checker | mypy in --strict | Pyright is acceptable for editor support; CI gate is mypy. |
| Logging | stdlib logging + structlog for structured output | No exotic deps. |
| Metrics / tracing | OpenTelemetry SDK (deferred to its own workplan) | Listed for forward-compatibility; not a v1 dep. |
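The Hashing row pairs stdlib `hashlib` with the `blake3` wheel. A minimal sketch of how one pass over a byte stream could feed both hashers is shown here; the function name `dual_digest` and the returned dictionary shape are illustrative assumptions, not something ADR-0001 specifies. The `blake3` import is treated as optional only so the sketch runs without the wheel installed.

```python
import hashlib
from typing import Iterable

try:
    import blake3  # third-party wheel; optional in this sketch only
except ImportError:  # keep the example runnable without the wheel
    blake3 = None


def dual_digest(chunks: Iterable[bytes]) -> dict[str, str]:
    """Hash a byte stream once, feeding every chunk to both hashers.

    Hypothetical helper: the name and return shape are assumptions.
    """
    sha = hashlib.sha256()
    b3 = blake3.blake3() if blake3 is not None else None
    for chunk in chunks:
        sha.update(chunk)
        if b3 is not None:
            b3.update(chunk)
    digests = {"sha256": sha.hexdigest()}
    if b3 is not None:
        digests["blake3"] = b3.hexdigest()
    return digests


if __name__ == "__main__":
    # Chunk boundaries do not affect the result, only the byte sequence does.
    print(dual_digest([b"hello ", b"world"]))
```

Streaming in chunks keeps memory flat for large artifacts, and computing both digests in the same loop avoids reading the payload twice.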
Project layout
artifact-store/
├── pyproject.toml
├── uv.lock
├── Makefile                  # thin shim: make dev / test / lint / type / migrate
├── alembic.ini
├── src/
│   └── artifactstore/
│       ├── __init__.py
│       ├── identity/         # content address, digest abstraction (ADR-0001)
│       ├── manifest/         # canonical CBOR, JCS projection (ADR-0003)
│       ├── events/           # append-only log + replayer (ADR-0002)
│       ├── retention/        # policy engine
│       ├── audit/            # audit emission as event subset
│       ├── storage/          # adapter SPI + backend registry
│       │   ├── spi.py
│       │   └── backends/
│       │       ├── local.py  # filesystem backend
│       │       └── s3.py     # placeholder, WP-0004
│       ├── dataplane/        # SPI + in-process impl (ADR-0004)
│       │   ├── spi.py
│       │   └── inproc.py
│       ├── registry/         # high-level orchestrator
│       ├── api/
│       │   └── http/         # FastAPI app
│       ├── cli/              # typer CLI (thin)
│       └── config.py
├── tests/
│   ├── unit/
│   ├── integration/
│   └── conftest.py
├── migrations/               # alembic
└── docs/
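The layout separates `storage/spi.py` from `storage/backends/local.py`. A rough sketch of what that split could look like is given here, assuming a structural `Protocol` for the SPI and SHA-256 hex digests as blob keys. The names `BlobStore`, `LocalFSStore` and `roundtrip`, and the two-character directory fan-out, are hypothetical and are not taken from the ADRs.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical adapter SPI: the surface every backend would provide."""

    def put(self, data: bytes) -> str: ...
    def get(self, digest: str) -> bytes: ...
    def exists(self, digest: str) -> bool: ...


class LocalFSStore:
    """Filesystem backend; blobs live at <root>/<first 2 hex chars>/<digest>."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _path(self, digest: str) -> Path:
        return self.root / digest[:2] / digest

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        path = self._path(digest)
        if not path.exists():  # identical content is stored only once
            path.parent.mkdir(parents=True, exist_ok=True)
            tmp = path.with_suffix(".tmp")
            tmp.write_bytes(data)
            tmp.replace(path)  # rename within one directory, so no partial blob is visible
        return digest

    def get(self, digest: str) -> bytes:
        return self._path(digest).read_bytes()

    def exists(self, digest: str) -> bool:
        return self._path(digest).exists()


def roundtrip(store: BlobStore, data: bytes) -> bytes:
    """Callers depend on the protocol only, never on a concrete backend."""
    return store.get(store.put(data))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(roundtrip(LocalFSStore(Path(d)), b"hello"))
```

Because `BlobStore` is structural, a later `s3.py` backend would satisfy it without inheriting from anything in `local.py`.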
Commands (T001 acceptance)
make dev # uvicorn with reload, sqlite backend, local FS storage
make test # pytest -q
make lint # ruff check + ruff format --check
make type # mypy --strict src tests
make migrate # alembic upgrade head
artifactstore # CLI entry point installed by uv
Consequences
Positive:
- Dependency budget is small and each dep is best-in-class for its slot.
- The same toolchain works on Linux, macOS, and CI without special cases.
- `uv.lock` is checked in; builds are reproducible.
- Every layer maps one-to-one to a docs concept (identity, manifest, events, dataplane, etc.), so the codebase remains navigable.
Negative:
- Pydantic v2 is the heaviest non-DB dep; acceptable for the OpenAPI win.
- Choosing SQLAlchemy Core over ORM costs some convenience; we accept it because explicit SQL is easier to migrate to Rust later (ADR-0004).
- mypy `--strict` is a per-PR tax; bounded by keeping the codebase small.
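To make the "explicit SQL" trade-off concrete, here is a small sketch of an append-only event table on the dev database (SQLite in WAL mode). It uses the stdlib `sqlite3` module in place of SQLAlchemy Core + aiosqlite so that it has no third-party dependencies. The table name, column names and event kinds are assumptions for illustration, not the ADR-0002 schema.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema; the real event-log schema belongs to ADR-0002.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,  -- BIGSERIAL on PostgreSQL
    kind    TEXT NOT NULL,
    payload BLOB NOT NULL                       -- BYTEA on PostgreSQL
)
"""


def open_log(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # WAL needs a file-backed database
    conn.execute(SCHEMA)
    return conn


def append(conn: sqlite3.Connection, kind: str, payload: bytes) -> int:
    """Insert one event and return its sequence number; rows are never updated."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO events (kind, payload) VALUES (?, ?)", (kind, payload)
        )
    return cur.lastrowid


def replay(conn: sqlite3.Connection, after: int = 0) -> list:
    """Return events with seq greater than `after`, in log order."""
    return conn.execute(
        "SELECT seq, kind, payload FROM events WHERE seq > ? ORDER BY seq",
        (after,),
    ).fetchall()


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "events.db")
    log = open_log(db_path)
    append(log, "artifact.created", b"\x01")
    append(log, "artifact.sealed", b"\x02")
    for row in replay(log):
        print(row)
    log.close()
```

Every statement is visible as plain SQL with positional parameters, which is the property the Core-over-ORM choice is meant to preserve.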
Revision policy
This ADR is the most likely candidate for revision once we have profile data from real ingestion. Candidates we are already watching:
- Replace `cbor2` with a Rust-backed CBOR codec if profiling shows it on the hot path.
- Replace `uvicorn` with `granian` (Rust ASGI server) if performance demands it.
- Replace SQLAlchemy Core with raw `asyncpg` + a tiny query builder if Core's abstractions show up in flame graphs.
Each replacement is its own ADR. None of them are v1 work.