---
id: ARTIFACT-STORE-WP-0004
type: workplan
title: "S3-Compatible Backend (Ceph RGW Target)"
repo: artifact-store
domain: stack
status: planned
owner: codex
topic_slug: stack
planning_priority: medium
planning_order: 4
created: "2026-05-15"
updated: "2026-05-15"
---

# ARTIFACT-STORE-WP-0004: S3-Compatible Backend

## Purpose

Add a second concrete storage backend that speaks the S3 protocol.
Validated targets: Ceph RGW (primary self-hosted production target),
MinIO (dev / CI), AWS S3 (interop check). The backend must satisfy
the storage SPI without leaking any S3-specific concepts into the
registry.

## Constraints

- `storage.spi.StorageBackend` Protocol from WP-0001 is the contract.
- No S3 vocabulary leaks into `registry.*` or `api.*`.
- The storage-backend section of `docs/ARCHITECTURE-BLUEPRINT.md`
  applies.

## Prerequisites

- WP-0001 done (SPI exists, local backend exists as a reference).

## D4.1 - Configuration Surface

```task
id: ARTIFACT-STORE-WP-0004-T001
status: todo
priority: high
state_hub_task_id: "7b980a55-2364-48c3-98ac-081629a8d2b7"
```

Acceptance:

- `s3` backend configuration accepts: `endpoint_url`, `region`,
  `bucket`, `key_prefix`, `access_key_ref`, `secret_key_ref`,
  `storage_class`, `sse` (optional), `multipart_threshold_bytes`,
  `multipart_chunk_bytes`.
- Credential references resolve from env vars or mounted files; never
  from request bodies.
- A documented Ceph RGW configuration example is checked in under
  `docs/OPERATOR.md`.

## D4.2 - S3 Backend Implementation

```task
id: ARTIFACT-STORE-WP-0004-T002
status: todo
priority: high
```

Acceptance:

- `storage.backends.s3.S3Backend` implements the SPI using `aioboto3`
  or `aiobotocore` (decision recorded in the workplan; whichever is
  better-maintained at implementation time).
- Object key layout:
  `<key_prefix>/<digest_algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>`.
- `put` uses multipart for objects above the configured threshold.
- `get` supports `Range`.
- `head`, `delete`, `health` implemented.
- `delete` is idempotent (delete-of-missing returns success).

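The object key layout above can be illustrated with a small pure function. The function name `object_key` is hypothetical, and the choices to lowercase the digest, reject non-hex input, and tolerate an empty or slash-padded prefix are assumptions rather than anything this workplan specifies.

```python
HEX_CHARS = frozenset("0123456789abcdef")


def object_key(key_prefix: str, digest_algorithm: str, hex_digest: str) -> str:
    """Build <key_prefix>/<digest_algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>."""
    hex_digest = hex_digest.lower()
    if len(hex_digest) < 4 or not set(hex_digest) <= HEX_CHARS:
        raise ValueError(f"not a usable hex digest: {hex_digest!r}")
    parts = [digest_algorithm, hex_digest[0:2], hex_digest[2:4], hex_digest]
    # Drop stray slashes so the key never starts with or doubles up on "/".
    prefix = key_prefix.strip("/")
    if prefix:
        parts.insert(0, prefix)
    return "/".join(parts)


if __name__ == "__main__":
    print(object_key("artifacts", "blake3", "abcdef0123"))
    # prints artifacts/blake3/ab/cd/abcdef0123
```

The two fan-out segments spread objects across prefixes, which keeps listings of any one prefix short.
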
## D4.3 - Backend Selection And Routing

```task
id: ARTIFACT-STORE-WP-0004-T003
status: todo
priority: medium
```

Acceptance:

- A registry can have multiple backends configured; package creation
  records which backend each file is stored in.
- Per-package backend selection rule: configurable function of
  `retention_class` + producer; default routes everything to a single
  backend.
- `storage_locations.backend_id` reflects where each object is
  actually stored.

## D4.4 - Test Strategy: MinIO In CI, RGW As Documented Manual Smoke

```task
id: ARTIFACT-STORE-WP-0004-T004
status: todo
priority: high
```

Acceptance:

- Integration tests run against MinIO via `testcontainers-python`
  (or a docker-compose fixture if testcontainers fights the WSL2
  environment).
- A documented manual procedure tests against a real Ceph RGW
  endpoint; results recorded in `docs/OPERATOR.md`.
- No CI dependency on a live Ceph or AWS account.

## D4.5 - Verification Pass

```task
id: ARTIFACT-STORE-WP-0004-T005
status: todo
priority: medium
```

Acceptance:

- `artifactstore storage verify --backend s3` re-reads every object in
  the backend, recomputes its primary digest, and emits
  `v1.storage.location_verified` events.
- Mismatches are reported as `failed` locations and surfaced via the
  health endpoint.

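The core of the verification pass is a recompute-and-compare loop, sketched below. This workplan does not name the primary digest algorithm, so the sketch uses SHA-256 from the standard library purely as a stand-in. The `VerificationResult` type and the `verified` status string are hypothetical; only `failed` comes from the acceptance criteria. Event emission and the health endpoint are left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class VerificationResult:
    expected_digest: str
    actual_digest: str
    status: str  # "verified" or "failed"


def verify_objects(objects: Iterable[tuple[str, bytes]]) -> Iterator[VerificationResult]:
    """Recompute each object's digest and compare it with the recorded one.

    `objects` yields (recorded_hex_digest, content) pairs. A real pass
    would stream content from the backend rather than hold it in memory.
    """
    for expected, content in objects:
        # SHA-256 is a stand-in for the primary digest algorithm.
        actual = hashlib.sha256(content).hexdigest()
        status = "verified" if actual == expected else "failed"
        yield VerificationResult(expected, actual, status)


if __name__ == "__main__":
    good = (hashlib.sha256(b"payload").hexdigest(), b"payload")
    bad = ("00" * 32, b"payload")
    for result in verify_objects([good, bad]):
        print(result.status)
```

Yielding results lazily lets the caller turn each one into an event as it arrives instead of waiting for the whole pass to finish.
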
## Success criteria

- The same package ingestion flow that worked against `local` in
  WP-0001 works unchanged against `s3`.
- The smoke test is switching backends by config alone, with no code
  changes in the registry or API layers.