artifact-store/workplans/ARTIFACT-STORE-WP-0004-s3-compatible-backend.md
tegwick 747afc27a6 docs+plans: reconcile blueprint with ambition, add ADRs, sequence workplans
Aligns the v1 architecture with the longer-horizon platform thesis so we can
start implementation without the schema-level inconsistencies the prior
review surfaced.

ADRs (docs/adr/0001..0006): content-addressed dual-digest storage, append-only
event log as source of truth, canonical CBOR manifests, control/data-plane
contract, v1 tech stack (Python 3.12 / uv / FastAPI / SQLAlchemy Core +
asyncpg / Alembic / cbor2 / blake3 / ruff / mypy / pytest / typer), OCI
compatibility kept reachable.

Architecture blueprint rewritten to v2: library-first (ffmpeg-shaped) module
layout, materialised-view data model over the event log, upload-session and
event-stream endpoints pinned, retrieval tiering promoted into the schema.

Roadmap added (docs/ROADMAP.md) with three phases. WP-0001 rewritten as the
Foundation plan (scaffold + kernels + local FS + minimal app). WP-0002..0005
created carrying the existing state_hub_task_ids forward semantically:
ingestion API (T004), retention lifecycle (T005), S3-compatible backend
(T006), guide-board pilot (T007). T001/T002/T003/T008 remain in WP-0001
with refined acceptance.

README and AGENTS.md refreshed to reflect the new repo shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 21:16:17 +02:00


id: ARTIFACT-STORE-WP-0004
type: workplan
title: S3-Compatible Backend (Ceph RGW Target)
repo: artifact-store
domain: stack
status: planned
owner: codex
topic_slug: stack
planning_priority: medium
planning_order: 4
created: 2026-05-15
updated: 2026-05-15

ARTIFACT-STORE-WP-0004: S3-Compatible Backend

Purpose

Add a second concrete storage backend that speaks the S3 protocol. Validated targets: Ceph RGW (primary self-hosted production target), MinIO (dev / CI), AWS S3 (interop check). The backend must satisfy the storage SPI without leaking S3-specific concepts into the registry.

Constraints

  • storage.spi.StorageBackend Protocol from WP-0001 is the contract.
  • No S3 vocabulary leaks into registry.* or api.*.
  • The storage-backend section of docs/ARCHITECTURE-BLUEPRINT.md applies.

Prerequisites

  • WP-0001 done (SPI exists, local backend exists as a reference).

D4.1 - Configuration Surface

id: ARTIFACT-STORE-WP-0004-T001
status: todo
priority: high
state_hub_task_id: "7b980a55-2364-48c3-98ac-081629a8d2b7"

Acceptance:

  • s3 backend configuration accepts: endpoint_url, region, bucket, key_prefix, access_key_ref, secret_key_ref, storage_class, sse (optional), multipart_threshold_bytes, multipart_chunk_bytes.
  • Credential references resolve from env vars or mounted files; never from request bodies.
  • Documented Ceph RGW configuration example checked in under docs/OPERATOR.md.

D4.2 - S3 Backend Implementation

id: ARTIFACT-STORE-WP-0004-T002
status: todo
priority: high

Acceptance:

  • storage.backends.s3.S3Backend implements the SPI using aioboto3 or aiobotocore (decision recorded in the workplan; whichever is better-maintained at implementation time).
  • Object key layout is <key_prefix>/<digest_algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>.
  • put uses multipart for objects above the configured threshold.
  • get supports Range.
  • head, delete, health implemented.
  • delete is idempotent (delete-of-missing returns success).
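
Two of the items above are pure functions and can be sketched without an S3 client. The helper names object_key and plan_parts are hypothetical, and the handling of an empty or slash-padded key_prefix is an assumption. The sketch reads "above the configured threshold" as strictly greater than.

```python
def object_key(key_prefix: str, digest_algorithm: str, hex_digest: str) -> str:
    """Build <key_prefix>/<digest_algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>."""
    hex_digest = hex_digest.lower()
    if len(hex_digest) < 4 or any(c not in "0123456789abcdef" for c in hex_digest):
        raise ValueError(f"not a usable hex digest: {hex_digest!r}")
    parts = [digest_algorithm, hex_digest[0:2], hex_digest[2:4], hex_digest]
    prefix = key_prefix.strip("/")  # assumption: tolerate stray slashes
    if prefix:
        parts.insert(0, prefix)
    return "/".join(parts)


def plan_parts(size: int, threshold: int, chunk: int) -> list[tuple[int, int]]:
    """Return (offset, length) pairs for a multipart upload.

    An empty list means the object is at or below the threshold and
    should go through a single PUT instead.
    """
    if chunk <= 0:
        raise ValueError("chunk size must be positive")
    if size <= threshold:
        return []
    return [(offset, min(chunk, size - offset)) for offset in range(0, size, chunk)]
```

The two-level fan-out on the first four hex characters keeps any single prefix listing small, which also helps the local FS backend if it shares the same layout.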

D4.3 - Backend Selection And Routing

id: ARTIFACT-STORE-WP-0004-T003
status: todo
priority: medium

Acceptance:

  • A registry can have multiple backends configured; package creation records which backend a file is stored in.
  • Per-package backend selection rule: configurable function of retention_class + producer; default routes everything to a single backend.
  • storage_locations.backend_id reflects the actual storage.
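
The selection rule above might be expressed as a plain callable, which keeps the default trivial. The names BackendSelector, single_backend and rule_table, the "*" wildcard, and the lookup precedence are all assumptions for illustration; only the inputs (retention_class, producer) and the single-backend default come from the acceptance list.

```python
from collections.abc import Callable, Mapping

# (retention_class, producer) -> backend_id
BackendSelector = Callable[[str, str], str]


def single_backend(backend_id: str) -> BackendSelector:
    """Default rule: route every package to one backend."""

    def select(retention_class: str, producer: str) -> str:
        return backend_id

    return select


def rule_table(
    rules: Mapping[tuple[str, str], str], default_backend_id: str
) -> BackendSelector:
    """Table-driven rule; '*' matches any value (assumed convention)."""

    def select(retention_class: str, producer: str) -> str:
        # Most specific match first, then wildcards, then the default.
        candidates = (
            (retention_class, producer),
            (retention_class, "*"),
            ("*", producer),
        )
        for key in candidates:
            if key in rules:
                return rules[key]
        return default_backend_id

    return select
```

Whatever the selector returns is what would be written to storage_locations.backend_id at package creation, so later reads never need to re-evaluate the rule.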

D4.4 - Test Strategy: MinIO In CI, RGW As Documented Manual Smoke

id: ARTIFACT-STORE-WP-0004-T004
status: todo
priority: high

Acceptance:

  • Integration tests run against MinIO via testcontainers-python (or a docker-compose fixture if testcontainers fights the WSL2 environment).
  • A documented manual procedure tests against a real Ceph RGW endpoint; results recorded in docs/OPERATOR.md.
  • No CI dependency on a live Ceph or AWS account.

D4.5 - Verification Pass

id: ARTIFACT-STORE-WP-0004-T005
status: todo
priority: medium

Acceptance:

  • artifactstore storage verify --backend s3 re-reads every object in the backend, recomputes its primary digest, and emits v1.storage.location_verified events.
  • Mismatches are reported as failed locations and surfaced via the health endpoint.
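
The core of the verification pass can be sketched independently of S3. The function name verify_objects, the input tuple shape, and the event payload fields are assumptions; only the event type v1.storage.location_verified comes from the acceptance list. SHA-256 from the standard library stands in for the primary digest, which this workplan does not name.

```python
import hashlib
from collections.abc import Iterable, Iterator


def verify_objects(
    objects: Iterable[tuple[str, str, Iterable[bytes]]],
) -> Iterator[dict]:
    """Yield one event per object.

    Each input item is (key, expected_hex_digest, chunks), where chunks is
    the object body as an iterable of byte strings, so large objects are
    never held in memory whole.
    """
    for key, expected_hex, chunks in objects:
        hasher = hashlib.sha256()  # stand-in for the primary digest
        for chunk in chunks:
            hasher.update(chunk)
        actual_hex = hasher.hexdigest()
        yield {
            "type": "v1.storage.location_verified",
            "key": key,
            "ok": actual_hex == expected_hex.lower(),
            "expected": expected_hex.lower(),
            "actual": actual_hex,
        }


def failed_locations(events: Iterable[dict]) -> list[str]:
    """Collect keys whose recomputed digest did not match."""
    return [event["key"] for event in events if not event["ok"]]
```

A health endpoint could then report the output of failed_locations from the most recent run; how that state is persisted is outside this sketch.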

Success criteria

  • The same package ingestion flow that worked against local in WP-0001 works unchanged against s3.
  • The smoke test is switching backends by config alone, with no code changes in the registry or API layers.