# ADR-0005 — V1 Technology Stack

Status: accepted
Date: 2026-05-15
Related: ADR-0001, ADR-0002, ADR-0003, ADR-0004

## Context

WP-0001 ("Foundation") cannot start without a pinned stack. The decision
needs to balance:

- ffmpeg / VLC philosophy: minimal dependency budget, sharp boundaries,
  native code at the hot edges, plain tools.
- Python is already implied by `.gitignore` and ecosystem fit (StateHub,
  guide-board, open-cmis-tck are all Python-leaning).
- The data plane will eventually be Rust (ADR-0004); the control plane
  stays in Python and must stay approachable.

## Decision

| Concern | Choice | Rationale |
|---|---|---|
| Language (control plane) | **Python 3.12+** | Async ecosystem, type hints, matches sibling repos. 3.12 specifically: PEP 695 generics, faster CPython, `sys.monitoring`. |
| Package / project manager | **uv** | Single static binary, fast resolver, lockfile-first, replaces `pip + pip-tools + venv + pipx` in one tool. |
| Build backend | **hatchling** (via `pyproject.toml`) | Standards-track PEP 517 backend. No magic. |
| HTTP framework | **FastAPI** (Starlette + Pydantic v2) | OpenAPI generation, async-native, broad community. |
| ASGI server | **uvicorn** (dev), **gunicorn + uvicorn workers** (prod) | Plain, well-understood. |
| Database (prod) | **PostgreSQL 16+** | Source-of-truth event log (ADR-0002) wants `BIGSERIAL`, `BYTEA`, advisory locks, logical replication. |
| Database (dev/embedded) | **SQLite (WAL mode)** | Zero-dependency local. Schema stays portable as long as we use SQLAlchemy Core. |
| DB access | **SQLAlchemy 2.0 Core** + **asyncpg** (prod) / **aiosqlite** (dev) | Core, not ORM: explicit SQL, async drivers. Migrations live below the API surface. |
| Migrations | **Alembic** | Standard, integrates with SQLAlchemy Core, supports both pg and sqlite. |
| Hashing | stdlib **`hashlib`** for SHA-256, **`blake3`** PyPI wheel for BLAKE3 | The `blake3` wheel embeds the SIMD-tuned Rust impl, so no build-time toolchain is needed. |
| Serialisation | **`cbor2`** for canonical CBOR (ADR-0003); stdlib `json` or the `jcs` PyPI package for JCS | Smallest deps that satisfy ADR-0003. |
| CLI | **typer** (atop click) | Same type-hint-driven style as FastAPI; type-driven CLI surface. |
| Tests | **pytest** + **httpx** + **pytest-asyncio** (asyncio only, no `trio-asyncio`) | Standard. |
| Lint / format | **ruff** (lint + format) | One tool replaces black + isort + flake8 + pyupgrade. |
| Type checker | **mypy** in `--strict` | Pyright is acceptable for editor support; CI gate is mypy. |
| Logging | stdlib `logging` + `structlog` for structured output | No exotic deps. |
| Metrics / tracing | OpenTelemetry SDK (deferred to its own workplan) | Listed for forward-compatibility; not a v1 dep. |

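As an illustration of the Hashing row, the following is a minimal sketch of computing both digests in a single pass over a stream. The names `DualDigest` and `dual_digest` are hypothetical and not taken from ADR-0001, and the import guard is an assumption added only so the sketch runs where the `blake3` wheel is not installed.

```python
import hashlib
from typing import BinaryIO, NamedTuple, Optional

try:
    import blake3  # third-party wheel named in the Hashing row
except ImportError:  # assumption: degrade gracefully so the sketch still runs
    blake3 = None


class DualDigest(NamedTuple):
    """Hypothetical result type: hex digests of the same byte stream."""

    sha256_hex: str
    blake3_hex: Optional[str]  # None when the blake3 wheel is unavailable


def dual_digest(stream: BinaryIO, chunk_size: int = 1 << 20) -> DualDigest:
    """Read the stream once, feeding every chunk to both hashers."""
    sha = hashlib.sha256()
    b3 = blake3.blake3() if blake3 is not None else None
    while chunk := stream.read(chunk_size):
        sha.update(chunk)
        if b3 is not None:
            b3.update(chunk)
    return DualDigest(
        sha256_hex=sha.hexdigest(),
        blake3_hex=b3.hexdigest() if b3 is not None else None,
    )
```

Reading in fixed-size chunks keeps memory flat for large artifacts; whether the real identity kernel streams this way is not stated in this ADR.
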
### Project layout

```
artifact-store/
├── pyproject.toml
├── uv.lock
├── Makefile                  # thin shim: make dev / test / lint / type / migrate
├── alembic.ini
├── src/
│   └── artifactstore/
│       ├── __init__.py
│       ├── identity/         # content address, digest abstraction (ADR-0001)
│       ├── manifest/         # canonical CBOR, JCS projection (ADR-0003)
│       ├── events/           # append-only log + replayer (ADR-0002)
│       ├── retention/        # policy engine
│       ├── audit/            # audit emission as event subset
│       ├── storage/          # adapter SPI + backend registry
│       │   ├── spi.py
│       │   └── backends/
│       │       ├── local.py  # filesystem backend
│       │       └── s3.py     # placeholder, WP-0004
│       ├── dataplane/        # SPI + in-process impl (ADR-0004)
│       │   ├── spi.py
│       │   └── inproc.py
│       ├── registry/         # high-level orchestrator
│       ├── api/
│       │   └── http/         # FastAPI app
│       ├── cli/              # typer CLI (thin)
│       └── config.py
├── tests/
│   ├── unit/
│   ├── integration/
│   └── conftest.py
├── migrations/               # alembic
└── docs/
```

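The `storage/` package in the layout pairs an adapter SPI with a backend registry. One way that split could look is sketched below. Every name here (`StorageBackend`, `put`, `get`, `InMemoryBackend`, `open_backend`) is hypothetical, and the in-memory class is only a stand-in for `backends/local.py`; the actual SPI is defined elsewhere, not in this ADR.

```python
import hashlib
from typing import Callable, Dict, Protocol


class StorageBackend(Protocol):
    """Hypothetical adapter SPI: what every backend would have to provide."""

    def put(self, data: bytes) -> str: ...

    def get(self, digest: str) -> bytes: ...


class InMemoryBackend:
    """Illustrative stand-in for a real backend, keyed by SHA-256 hex digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Content-addressed: storing the same bytes twice is a no-op.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


# Hypothetical registry: backend name -> zero-argument factory.
_REGISTRY: Dict[str, Callable[[], StorageBackend]] = {"memory": InMemoryBackend}


def open_backend(name: str) -> StorageBackend:
    """Look up a backend factory by name and instantiate it."""
    return _REGISTRY[name]()
```

Using `typing.Protocol` rather than an abstract base class means backends satisfy the SPI structurally, which fits the `mypy --strict` gate without forcing inheritance; that is a design suggestion, not something this ADR decides.
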
### Commands (T001 acceptance)

```
make dev        # uvicorn with reload, sqlite backend, local FS storage
make test       # pytest -q
make lint       # ruff check + ruff format --check
make type       # mypy --strict src tests
make migrate    # alembic upgrade head
artifactstore   # CLI entry point installed by uv
```

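`make dev` runs against the SQLite backend, which the Decision table pins to WAL mode. The sketch below shows what enabling WAL and appending to a log table amounts to. It uses the stdlib `sqlite3` module purely to stay dependency-free, whereas the project itself goes through SQLAlchemy Core and `aiosqlite`. The `events` table, its columns, and both function names are assumptions for illustration, not the ADR-0002 schema.

```python
import sqlite3
from typing import Optional


def open_dev_db(path: str) -> sqlite3.Connection:
    """Open a file-backed SQLite database in WAL mode with a toy event table."""
    conn = sqlite3.connect(path)
    # WAL is persistent per database file; in-memory databases cannot use it.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # monotonically increasing
        " kind TEXT NOT NULL,"
        " payload BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def append_event(conn: sqlite3.Connection, kind: str, payload: bytes) -> Optional[int]:
    """Append one row and return its sequence number; rows are never updated."""
    cur = conn.execute(
        "INSERT INTO events (kind, payload) VALUES (?, ?)", (kind, payload)
    )
    conn.commit()
    return cur.lastrowid
```

`AUTOINCREMENT` plays roughly the role `BIGSERIAL` plays on PostgreSQL here, in that sequence numbers are not reused after deletes.
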
## Consequences

Positive:

- Dependency budget is small and each dep is best-in-class for its slot.
- The same toolchain works on Linux, macOS, and CI without special cases.
- `uv.lock` is checked in; builds are reproducible.
- Every layer maps one-to-one to a docs concept (identity, manifest,
  events, dataplane, etc.), so the codebase remains navigable.

Negative:

- Pydantic v2 is the heaviest non-DB dep; acceptable for the OpenAPI win.
- Choosing SQLAlchemy Core over ORM costs some convenience; we accept
  it because explicit SQL is easier to migrate to Rust later (ADR-0004).
- mypy `--strict` is a per-PR tax; bounded by keeping the codebase small.

## Revision policy

This ADR is the most likely candidate for revision once we have profiling
data from real ingestion. Candidates we are already watching:

- Replace `cbor2` with a Rust-backed CBOR codec if profiling shows it on
  the hot path.
- Replace `uvicorn` with `granian` (Rust ASGI server) if performance
  demands it.
- Replace `SQLAlchemy Core` with raw `asyncpg` + a tiny query builder
  if Core's abstractions show up in flame graphs.

Each replacement is its own ADR. None of them are v1 work.