generated from coulomb/repo-seed
docs+plans: reconcile blueprint with ambition, add ADRs, sequence workplans
Aligns the v1 architecture with the longer-horizon platform thesis so we can start implementation without the schema-level inconsistencies the prior review surfaced.

ADRs (docs/adr/0001..0006): content-addressed dual-digest storage, append-only event log as source of truth, canonical CBOR manifests, control/data-plane contract, v1 tech stack (Python 3.12 / uv / FastAPI / SQLAlchemy Core + asyncpg / Alembic / cbor2 / blake3 / ruff / mypy / pytest / typer), OCI compatibility kept reachable.

Architecture blueprint rewritten to v2: library-first (ffmpeg-shaped) module layout, materialised-view data model over the event log, upload-session and event-stream endpoints pinned, retrieval tiering promoted into the schema.

Roadmap added (docs/ROADMAP.md) with three phases.

WP-0001 rewritten as the Foundation plan (scaffold + kernels + local FS + minimal app). WP-0002..0005 created carrying the existing state_hub_task_ids forward semantically: ingestion API (T004), retention lifecycle (T005), S3-compatible backend (T006), guide-board pilot (T007). T001/T002/T003/T008 remain in WP-0001 with refined acceptance.

README and AGENTS.md refreshed to reflect the new repo shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: ARTIFACT-STORE-WP-0001
|
||||
type: workplan
|
||||
title: "Artifact Store Service Baseline"
|
||||
title: "Foundation: Scaffold, Core Kernels, Local FS Backend"
|
||||
repo: artifact-store
|
||||
domain: stack
|
||||
status: active
|
||||
@@ -14,51 +14,53 @@ updated: "2026-05-15"
|
||||
state_hub_workstream_id: "aebf996c-8721-4e8c-9e56-61d5e4bf8dcb"
|
||||
---
|
||||
|
||||
# ARTIFACT-STORE-WP-0001: Artifact Store Service Baseline
|
||||
# ARTIFACT-STORE-WP-0001: Foundation — Scaffold, Core Kernels, Local FS Backend
|
||||
|
||||
## Purpose
|
||||
|
||||
Implement the first usable artifact registry and storage gateway. The service
|
||||
should preserve artifact packages, index their metadata, delegate bytes to a
|
||||
configured storage backend, apply default retention rules, and expose stable
|
||||
package identifiers that Statehub and producer repositories can link to.
|
||||
Stand up the smallest credible `artifact-store` core. By the end of
|
||||
this workplan, the library can ingest a directory of files into a
|
||||
package, compute dual digests, write canonical-CBOR manifests, persist
|
||||
state through the append-only event log, store bytes on local
|
||||
filesystem, and replay materialised views from the event log. No HTTP
|
||||
API yet (that lands in WP-0002); a `/health` endpoint exists so that
|
||||
the dev loop has something to hit.
|
||||
|
||||
The first producer target is a guide-board assessment run, including OpenCMIS TCK
|
||||
reports and raw assessment artifacts.
|
||||
The shape is **library-first** (ffmpeg-style). HTTP server and CLI are
|
||||
explicitly thin consumers of `artifactstore.registry`.
|
||||
|
||||
## Background
|
||||
## Constraints (must satisfy)
|
||||
|
||||
Guide-board can already produce self-contained run directories with retention
|
||||
summaries, assessment packages, raw artifacts, scorecards, and log reviews. Those
|
||||
directories should not live only in `/tmp`, and committing raw evidence into
|
||||
producer repositories is the wrong long-term shape.
|
||||
|
||||
`artifact-store` becomes the shared preservation layer:
|
||||
|
||||
- producers generate files,
|
||||
- artifact-store registers and stores them,
|
||||
- Statehub records the work outcome and links to the registry package,
|
||||
- storage backends handle durable bytes.
|
||||
|
||||
Ceph is the likely self-hosted production backend through its S3-compatible RGW
|
||||
interface, but the service must keep the backend interface generic.
|
||||
|
||||
## Target Architecture
|
||||
|
||||
```text
|
||||
producer package
|
||||
-> registry API
|
||||
-> metadata database
|
||||
-> retention policy engine
|
||||
-> storage adapter
|
||||
-> local filesystem or S3-compatible object storage
|
||||
```
|
||||
- ADR-0001 — content-addressed storage with dual digest.
|
||||
- ADR-0002 — append-only event log as source of truth.
|
||||
- ADR-0003 — manifest canonicalisation = canonical CBOR.
|
||||
- ADR-0004 — control plane / data plane SPI named.
|
||||
- ADR-0005 — v1 technology stack pinned (Python 3.12, uv, FastAPI,
|
||||
SQLAlchemy Core, asyncpg, alembic, cbor2, blake3, ruff, mypy, pytest).
|
||||
- ADR-0006 — OCI compatibility kept reachable.
|
||||
- `docs/ARCHITECTURE-BLUEPRINT.md` data model and module layout.
|
||||
|
||||
## Boundary
|
||||
|
||||
This workplan owns the first service implementation and API contract. It does
|
||||
not need to build a UI, implement cold-storage restore tiers, replace Statehub,
|
||||
or provide formal records-management certification.
|
||||
This workplan builds the library and a minimal `/health` endpoint. It
|
||||
does NOT implement: package CRUD HTTP API (WP-0002), retention rules
|
||||
beyond the seed (WP-0003), S3-compatible backend (WP-0004), guide-board
|
||||
producer wiring (WP-0005), GC of unreferenced bytes (WP-0006).
|
||||
|
||||
## Target architecture (this workplan)
|
||||
|
||||
```text
|
||||
artifactstore (library)
|
||||
identity ──┐
|
||||
manifest ──┼──> registry (orchestrator) ──> events (WAL + views)
|
||||
events ───┘ │
|
||||
retention (seed only) └──> dataplane.spi ──> dataplane.inproc ──> storage.spi ──> storage.backends.local
|
||||
audit (view) └──> filesystem
|
||||
storage.spi
|
||||
dataplane.spi + inproc
|
||||
api.http (just /health)
|
||||
cli (just `artifactstore version`, `artifactstore migrate`, `artifactstore replay`)
|
||||
```
|
||||
|
||||
## D1.1 - Service Scaffold And Repository Identity
|
||||
|
||||
@@ -71,14 +73,71 @@ state_hub_task_id: "84209430-ec3b-4c5e-924e-019c25434230"
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Replace the seed README with artifact-store service instructions.
|
||||
- Add a Python service scaffold with a clear package/module layout.
|
||||
- Provide a local development command.
|
||||
- Provide a test command.
|
||||
- Keep generated artifact bytes and local databases ignored by git.
|
||||
- Document required environment variables.
|
||||
- `pyproject.toml` with `hatchling` build backend, pinned dependencies
|
||||
per ADR-0005.
|
||||
- `uv.lock` committed.
|
||||
- `Makefile` exposes: `make dev`, `make test`, `make lint`, `make
|
||||
type`, `make migrate`. Each target is a thin shim, no logic inline.
|
||||
- `src/artifactstore/` package skeleton matches ADR-0005's layout
|
||||
(empty `__init__.py` and one placeholder module per top-level
|
||||
concern: `identity`, `manifest`, `events`, `retention`, `audit`,
|
||||
`storage`, `dataplane`, `registry`, `api/http`, `cli`, `config`).
|
||||
- `tests/{unit,integration}/conftest.py` in place.
|
||||
- `.env.example` documents required environment variables:
|
||||
`ARTIFACTSTORE_DATABASE_URL`, `ARTIFACTSTORE_STORAGE_LOCAL_ROOT`,
|
||||
`ARTIFACTSTORE_LOG_LEVEL`.
|
||||
- CI-equivalent local commands: `make lint && make type && make test`
|
||||
pass on a clean checkout.
|
||||
- `README.md` replaces the seed README: install with `uv sync`, run
|
||||
with `make dev`, test with `make test`, links to ADRs and blueprint.
|
||||
|
||||
## D1.2 - Registry Data Model
|
||||
## D1.2 - Digest Abstraction And Content Address
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T009
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `identity.Digest` value type with `algorithm: str` and `hex: str`,
|
||||
immutable, hashable.
|
||||
- `identity.ContentAddress` — string-form `<algorithm>:<hex>` with
|
||||
validating parser and emitter.
|
||||
- `identity.digest_stream(reader) -> {primary: Digest, sha256: Digest}` —
|
||||
single-pass dual-hash over an `AsyncIterator[bytes]`. Default primary
|
||||
algorithm: `blake3`.
|
||||
- Algorithm registry with `blake3` and `sha256` registered at import.
|
||||
- Property test: digest over random byte sequences round-trips through
|
||||
serialisation; `sha256` matches `hashlib.sha256(...).hexdigest()`;
|
||||
`blake3` matches `blake3.blake3(...).hexdigest()`.
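
A minimal sketch of the single-pass dual-hash idea described above, not the project's implementation. It uses only the standard library, so `blake2b` stands in for `blake3` (an assumption made to keep the example dependency-free); the helper names `parse_content_address` and `chunks` are hypothetical.

```python
import asyncio
import hashlib
from dataclasses import dataclass
from typing import AsyncIterator, Iterable


@dataclass(frozen=True)  # frozen makes the value immutable and hashable
class Digest:
    algorithm: str
    hex: str

    def content_address(self) -> str:
        # String form "<algorithm>:<hex>"
        return f"{self.algorithm}:{self.hex}"


def parse_content_address(text: str) -> Digest:
    # Hypothetical validating parser for "<algorithm>:<hex>"
    algorithm, sep, hex_part = text.partition(":")
    if not sep or not algorithm or not hex_part:
        raise ValueError(f"not a content address: {text!r}")
    if any(c not in "0123456789abcdef" for c in hex_part):
        raise ValueError(f"non-hex digest in: {text!r}")
    return Digest(algorithm, hex_part)


async def digest_stream(
    reader: AsyncIterator[bytes], primary: str = "blake2b"
) -> dict[str, Digest]:
    # Assumption: blake2b replaces blake3 here so the sketch needs no extra package.
    primary_hasher = hashlib.new(primary)
    sha256_hasher = hashlib.sha256()
    async for chunk in reader:
        # Each chunk is fed to both hashers, so the stream is read exactly once.
        primary_hasher.update(chunk)
        sha256_hasher.update(chunk)
    return {
        "primary": Digest(primary, primary_hasher.hexdigest()),
        "sha256": Digest("sha256", sha256_hasher.hexdigest()),
    }


async def chunks(parts: Iterable[bytes]) -> AsyncIterator[bytes]:
    # Tiny helper that turns an in-memory list into an async byte stream.
    for part in parts:
        yield part


result = asyncio.run(digest_stream(chunks([b"ab", b"c"])))
```

Hashing chunk by chunk gives the same digests as hashing the concatenated bytes, which is what lets the data plane compute both digests while streaming to the backend.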
|
||||
|
||||
## D1.3 - Manifest Codec (Canonical CBOR + JCS Projection)
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T010
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `manifest.Manifest` dataclass with the v1 fields enumerated in the
|
||||
blueprint (`manifest_version=1`, package, files, storage_receipts,
|
||||
retention_summary, provenance).
|
||||
- `manifest.codec.encode(m) -> bytes` produces canonical CBOR
|
||||
(RFC 8949 §4.2.1): definite-length, shortest-form integers,
|
||||
sorted map keys.
|
||||
- `manifest.codec.decode(b) -> Manifest`.
|
||||
- `manifest.projection.jcs(m) -> bytes` produces RFC 8785 canonical
|
||||
JSON.
|
||||
- Property test: `decode(encode(m)) == m` for randomly-generated
|
||||
manifests; `encode(decode(jcs_to_cbor(jcs(m)))) == encode(m)`.
|
||||
- Manifest digest helper: `manifest_digest(m) -> ContentAddress` using
|
||||
BLAKE3 over the canonical CBOR bytes.
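
To make the three canonical-encoding rules concrete, here is a hand-rolled sketch of a deterministic CBOR encoder for a small subset of types (integers, byte strings, text, lists, maps). It is illustrative only: the workplan pins `cbor2` for the real codec, and the function names below are hypothetical.

```python
import struct


def _head(major: int, n: int) -> bytes:
    # Shortest-form argument encoding: use the smallest width that fits n.
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 0x100000000:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def encode(value) -> bytes:
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        # Major type 0 for non-negative, major type 1 for negative integers.
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(value, list):
        # Definite length: the item count goes in the head, no break marker.
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        # Sort entries by the bytewise order of their encoded keys.
        items = sorted((encode(k), encode(v)) for k, v in value.items())
        return _head(5, len(items)) + b"".join(k + v for k, v in items)
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Because key order and integer width are fixed by the rules rather than by insertion order or library defaults, two logically equal manifests encode to the same bytes and therefore the same digest.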
|
||||
|
||||
## D1.4 - Registry Data Model And Migrations
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T002
|
||||
@@ -89,16 +148,44 @@ state_hub_task_id: "e5249a39-46a2-4b56-813e-0339c52cd14e"
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Define persistent models for artifact packages, files, storage locations,
|
||||
retention rules, retention events, and audit events.
|
||||
- Store package metadata as structured JSON while keeping core query fields
|
||||
explicit.
|
||||
- Record package lifecycle status: created, uploading, finalized, deleted, and
|
||||
failed.
|
||||
- Record file `sha256`, size, media type, and logical relative path.
|
||||
- Add migrations or a reproducible schema initialization path.
|
||||
- Alembic configured with `migrations/` directory; `alembic upgrade
|
||||
head` works against both SQLite (dev) and PostgreSQL (prod).
|
||||
- `events`, `artifact_packages`, `artifact_files`, `storage_locations`,
|
||||
`retention_classes`, `retention_state`, `metadata_schemas` tables
|
||||
match the blueprint schema.
|
||||
- Seed migration populates `retention_classes` with the five v1 entries.
|
||||
- A `make migrate` and `make migrate-fresh` target work end-to-end on
|
||||
a clean DB.
|
||||
- All schema columns required by ADR-0001 (`digest_algorithm`,
|
||||
`digest_primary`, `digest_sha256`, `content_address`), ADR-0002
|
||||
(full `events` table), and the blueprint's `retrieval_tier` and
|
||||
`restore_status` are present.
|
||||
|
||||
## D1.3 - Local Filesystem Storage Backend
|
||||
## D1.5 - Event Log Persistence And Replay
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T011
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `events.write(transaction, Event)` writes one row in the given DB
|
||||
transaction. Sequence numbers are assigned by the DB
|
||||
(`BIGSERIAL`) and are guaranteed monotonic and gapless within a
|
||||
registry instance.
|
||||
- `events.tail(since_sequence) -> AsyncIterator[Event]` long-polls
|
||||
the table (notify-style on PostgreSQL via `LISTEN/NOTIFY`,
|
||||
poll-style on SQLite).
|
||||
- `events.replay(into=ViewWriter)` rebuilds all materialised view
|
||||
tables from `events` deterministically.
|
||||
- Test: ingesting a fixed sequence of events, then rebuilding the
|
||||
views from scratch, yields byte-identical materialised state.
|
||||
- Event payloads use canonical CBOR (`manifest.codec`) so the same
|
||||
bytes flow through registry → DB → tail consumer without re-encoding.
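
The replay requirement can be pictured as a pure fold over the event log. The sketch below is an assumption-laden toy, not the `events.replay` API: the `Event` shape, the in-memory view, and the `v1.package.created` type name are hypothetical (only `v1.package.finalized` appears in this workplan).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    sequence: int
    type: str
    package_id: str


def replay(events: Iterable[Event]) -> dict[str, str]:
    # Rebuild a package-status view from scratch; no state survives between calls.
    view: dict[str, str] = {}
    # Apply strictly in sequence order so the result does not depend on input order.
    for event in sorted(events, key=lambda e: e.sequence):
        if event.type == "v1.package.created":  # hypothetical event type
            view[event.package_id] = "created"
        elif event.type == "v1.package.finalized":
            view[event.package_id] = "finalized"
    return view


log = [
    Event(1, "v1.package.created", "pkg-a"),
    Event(2, "v1.package.created", "pkg-b"),
    Event(3, "v1.package.finalized", "pkg-a"),
]
first = replay(log)
second = replay(reversed(log))
```

The determinism test in the acceptance list amounts to checking that two independent rebuilds such as `first` and `second` are identical.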
|
||||
|
||||
## D1.6 - Storage Adapter SPI And Local Filesystem Backend
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T003
|
||||
@@ -109,90 +196,81 @@ state_hub_task_id: "68f9a752-0012-4cc1-8768-ec3f75295e7a"
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Implement a storage adapter interface.
|
||||
- Implement a local filesystem backend for development and tests.
|
||||
- Store objects under deterministic package/file keys.
|
||||
- Prevent path traversal and accidental writes outside the configured storage
|
||||
root.
|
||||
- Add backend health reporting.
|
||||
- Add tests for put, get, head, and delete operations.
|
||||
- `storage.spi.StorageBackend` Protocol matches the blueprint.
|
||||
- `storage.backends.local.LocalBackend` implements the SPI:
|
||||
- Object key layout `<root>/<algo>/<hex[0:2]>/<hex[2:4]>/<hex>`.
|
||||
- Atomic write via `fsync(tmpfile) + rename`.
|
||||
- Path traversal rejected at the SPI boundary.
|
||||
- `health()` returns disk usage and root accessibility.
|
||||
- Backend registry resolves by `backend_id` string (per ADR-0004).
|
||||
- Unit tests cover: put, get, head, delete, double-put idempotency,
|
||||
delete-of-missing, range read.
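
A minimal sketch of the key layout and atomic-write bullets above, assuming a POSIX-like filesystem. The function names `object_path` and `atomic_put` are hypothetical and are not the `LocalBackend` API; durability of the parent directory entry (an extra directory fsync) is left out for brevity.

```python
import contextlib
import os
import re
import tempfile
from pathlib import Path

_ALGO = re.compile(r"[a-z0-9]+")
_HEX = re.compile(r"[0-9a-f]{4,}")


def object_path(root: Path, algorithm: str, hex_digest: str) -> Path:
    # Reject anything that is not a plain algorithm name or lowercase hex,
    # which also rules out "..", "/" and other traversal attempts.
    if not _ALGO.fullmatch(algorithm) or not _HEX.fullmatch(hex_digest):
        raise ValueError("invalid algorithm or digest")
    # Layout: <root>/<algo>/<hex[0:2]>/<hex[2:4]>/<hex>
    return root / algorithm / hex_digest[0:2] / hex_digest[2:4] / hex_digest


def atomic_put(root: Path, algorithm: str, hex_digest: str, data: bytes) -> Path:
    target = object_path(root, algorithm, hex_digest)
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        return target  # content-addressed, so a second put is a no-op
    # Write to a temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # force bytes to disk before publishing
        os.replace(tmp_name, target)  # atomic rename into the final key
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    return target
```

Readers never observe a half-written object because the final path only appears once the rename succeeds.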
|
||||
|
||||
## D1.4 - Package Ingestion API
|
||||
## D1.7 - Data Plane SPI And In-Process Implementation
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T004
|
||||
id: ARTIFACT-STORE-WP-0001-T012
|
||||
status: todo
|
||||
priority: high
|
||||
state_hub_task_id: "e3879111-4be9-4731-8aea-15abb874f960"
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Add endpoints to create a package, upload files, finalize a package, retrieve
|
||||
package metadata, list packages, and download files.
|
||||
- Compute file hashes server-side during ingestion.
|
||||
- Reject duplicate logical paths within one package unless explicitly replacing
|
||||
a non-finalized file.
|
||||
- Produce a package manifest after finalization.
|
||||
- Add API tests covering successful ingestion and validation failures.
|
||||
- `dataplane.spi.DataPlane` Protocol matches ADR-0004.
|
||||
- `dataplane.inproc.InProcessDataPlane` implements all five operations
|
||||
on top of a configured `StorageBackend`.
|
||||
- `ingest_stream` computes both digests in a single pass, writes to
|
||||
the backend keyed by the primary content address, and returns an
|
||||
`IngestResult` containing both digests, size, and the
|
||||
`StorageReceipt`.
|
||||
- `serve_object` and `verify_object` re-read bytes through the
|
||||
backend; `verify_object` re-digests and returns mismatches if any.
|
||||
- Lint rule (or test): no code outside `dataplane.*` imports
|
||||
`storage.backends.*` directly.
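
One way the import-boundary rule could be enforced as a test is a small `ast` scan. This is a sketch under assumptions: the full module prefix `artifactstore.` and the decision to exempt the backends package itself are not stated in the workplan, and `forbidden_imports` is a hypothetical helper.

```python
import ast

FORBIDDEN = "artifactstore.storage.backends"
# Assumption: the backends package may import its own modules.
ALLOWED_IMPORTERS = ("artifactstore.dataplane", FORBIDDEN)


def _under(name: str, prefix: str) -> bool:
    return name == prefix or name.startswith(prefix + ".")


def forbidden_imports(module_name: str, source: str) -> list[str]:
    # Return the backend modules that `source` imports in violation of the rule.
    if any(_under(module_name, prefix) for prefix in ALLOWED_IMPORTERS):
        return []
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Cover both "from pkg.backends import x" and "from pkg import backends".
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue  # relative imports are not resolved in this sketch
        found.update(n for n in names if _under(n, FORBIDDEN))
    return sorted(found)
```

A test suite could run this over every file under `src/artifactstore/` and assert that the result is empty for each module.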
|
||||
|
||||
## D1.5 - Retention Baseline
|
||||
## D1.8 - Registry Orchestrator (Library Surface)
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T005
|
||||
id: ARTIFACT-STORE-WP-0001-T013
|
||||
status: todo
|
||||
priority: high
|
||||
state_hub_task_id: "2d6cbd83-c348-45ad-a223-7870a3412225"
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Seed default retention classes for transient, raw-evidence, summary-evidence,
|
||||
release-evidence, and permanent-record.
|
||||
- Apply a default `expires_at` when a package is created or finalized.
|
||||
- Add endpoints to extend retention and apply or release holds.
|
||||
- Record retention changes as retention events and audit events.
|
||||
- Expose deletion eligibility without deleting bytes automatically in the first
|
||||
implementation.
|
||||
- `registry.Registry` exposes: `create_package`, `ingest_file`,
|
||||
`finalize_package`, `get_manifest_bytes` (CBOR + JCS), `get_file`,
|
||||
`tail_events`, plus stubs for the retention operations that lighten
|
||||
WP-0003.
|
||||
- Each mutating operation is one DB transaction that writes events
|
||||
AND updates materialised views.
|
||||
- Finalisation writes one `v1.package.finalized` event whose payload
|
||||
*is* the canonical CBOR manifest, and stamps `manifest_digest` on
|
||||
`artifact_packages`.
|
||||
- Duplicate `relative_path` within one not-yet-finalised package is
|
||||
rejected unless an explicit replace is requested.
|
||||
- Integration test: end-to-end ingest of a 3-file package against
|
||||
local backend → finalize → read manifest → verify digests
|
||||
→ tail events → replay rebuilds identical state.
|
||||
|
||||
## D1.6 - S3-Compatible Backend Design Hook
|
||||
## D1.9 - Minimal HTTP App And CLI
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T006
|
||||
id: ARTIFACT-STORE-WP-0001-T014
|
||||
status: todo
|
||||
priority: medium
|
||||
state_hub_task_id: "7b980a55-2364-48c3-98ac-081629a8d2b7"
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Define configuration fields for an S3-compatible backend.
|
||||
- Keep the adapter contract compatible with Ceph RGW.
|
||||
- Add an implementation stub or feature-flagged backend if dependencies are not
|
||||
ready.
|
||||
- Document expected Ceph/S3 configuration without requiring a live Ceph service
|
||||
for baseline tests.
|
||||
- `api.http.app` is a FastAPI app with one route: `GET /health`
|
||||
reporting registry liveness, DB connectivity, and backend health.
|
||||
- `cli` exposes `artifactstore version`, `artifactstore migrate`,
|
||||
`artifactstore replay`, `artifactstore health`.
|
||||
- `make dev` starts the API on `127.0.0.1:8000` with SQLite +
|
||||
local FS backend by default.
|
||||
|
||||
## D1.7 - Guide-Board Pilot Ingestion
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T007
|
||||
status: todo
|
||||
priority: high
|
||||
state_hub_task_id: "eb822821-353c-4cd2-95bf-acb2f084b7ea"
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Provide a CLI helper or documented curl flow to register a guide-board run
|
||||
directory as one package.
|
||||
- Preserve guide-board run metadata: run id, target profile, assessment profile,
|
||||
evidence result counts, finding counts, source commits, and report paths.
|
||||
- Ingest the CMIS pilot run shape, including scorecard and log-review reports.
|
||||
- Return a package id suitable for recording in Statehub.
|
||||
- Add a fixture-based test that does not require the real OpenCMIS TCK.
|
||||
|
||||
## D1.8 - Operator Documentation And Handoff
|
||||
## D1.10 - Operator Documentation And ADR Cross-Linking
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0001-T008
|
||||
@@ -203,27 +281,33 @@ state_hub_task_id: "9b60036c-61f2-4c22-ad31-7213473d42d0"
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Document local run, test, and package ingestion commands.
|
||||
- Document retention behavior and extension flow.
|
||||
- Document the boundary between artifact-store and Statehub.
|
||||
- Include a dev-agent handoff section listing the first implementation order.
|
||||
- Keep architecture docs aligned with the implemented API.
|
||||
- `README.md` updated with current run / test / migrate commands.
|
||||
- `AGENTS.md` "Current Repo Shape" section reflects the scaffold.
|
||||
- A `docs/OPERATOR.md` page documents environment variables, local
|
||||
vs PostgreSQL setup, replay command, and a smoke-test recipe.
|
||||
- Every ADR is cross-linked from at least one of: blueprint, this
|
||||
workplan, or `OPERATOR.md`.
|
||||
|
||||
## Suggested Implementation Order
|
||||
## Suggested implementation order
|
||||
|
||||
1. Service scaffold, test harness, and README.
|
||||
2. Metadata models and local database setup.
|
||||
3. Local filesystem storage adapter.
|
||||
4. Package create/upload/finalize/download API.
|
||||
5. Retention defaults, extension, hold, and audit events.
|
||||
6. Guide-board run ingestion helper.
|
||||
7. S3-compatible backend configuration and Ceph notes.
|
||||
1. T001 — scaffold and tooling (no other task can start without this).
|
||||
2. T009 — digest abstraction (unblocks T010, T012).
|
||||
3. T010 — manifest codec (unblocks T013).
|
||||
4. T002 — schema and migrations (unblocks T011, T013).
|
||||
5. T011 — event log + replay.
|
||||
6. T003 — storage SPI + local backend.
|
||||
7. T012 — data plane SPI + in-process impl.
|
||||
8. T013 — registry orchestrator.
|
||||
9. T014 — minimal HTTP app and CLI.
|
||||
10. T008 — docs.
|
||||
|
||||
## First Pilot Success Criteria
|
||||
## Success criteria
|
||||
|
||||
- A completed guide-board CMIS run can be ingested from a local directory.
|
||||
- The package manifest lists every stored file with SHA-256 and size.
|
||||
- The registry returns a stable package id.
|
||||
- Files can be downloaded through the service.
|
||||
- Default retention is visible and can be extended.
|
||||
- Statehub can record the package id and summary without storing artifact bytes.
|
||||
- `make dev && make test` round-trips on a clean checkout.
|
||||
- A scripted integration test ingests a directory of fixture files,
|
||||
finalises the package, reads the manifest, downloads each file, and
|
||||
verifies digests end-to-end against the local backend.
|
||||
- Replaying events from sequence 1 reproduces the materialised view
|
||||
state byte-for-byte.
|
||||
- The library can be imported and exercised without an HTTP server
|
||||
running (embedding test).
|
||||
|
||||
150
workplans/ARTIFACT-STORE-WP-0002-ingestion-api.md
Normal file
@@ -0,0 +1,150 @@
|
||||
---
|
||||
id: ARTIFACT-STORE-WP-0002
|
||||
type: workplan
|
||||
title: "Ingestion API And Manifest Surface"
|
||||
repo: artifact-store
|
||||
domain: stack
|
||||
status: planned
|
||||
owner: codex
|
||||
topic_slug: stack
|
||||
planning_priority: high
|
||||
planning_order: 2
|
||||
created: "2026-05-15"
|
||||
updated: "2026-05-15"
|
||||
---
|
||||
|
||||
# ARTIFACT-STORE-WP-0002: Ingestion API And Manifest Surface
|
||||
|
||||
## Purpose
|
||||
|
||||
Expose the WP-0001 library as a complete HTTP API. Producers can create
|
||||
packages, ingest files (single-shot or via the upload-session resource
|
||||
shape), finalise to produce a manifest, list and search packages,
|
||||
download files, and tail the event stream.
|
||||
|
||||
## Constraints
|
||||
|
||||
- ADR-0001, ADR-0002, ADR-0003, ADR-0004, ADR-0005, ADR-0006.
|
||||
- `docs/ARCHITECTURE-BLUEPRINT.md` API shape section.
|
||||
- All handlers must be thin: translate transport → `registry.*` calls.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- WP-0001 done (library is functional against local backend).
|
||||
|
||||
## D2.1 - Package CRUD Endpoints
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0002-T001
|
||||
status: todo
|
||||
priority: high
|
||||
state_hub_task_id: "e3879111-4be9-4731-8aea-15abb874f960"
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `POST /packages`, `GET /packages` (filterable by producer / subject /
|
||||
retention_class / metadata key), `GET /packages/{id}`,
|
||||
`POST /packages/{id}/files` (single-shot multipart),
|
||||
`POST /packages/{id}/finalize`.
|
||||
- `GET /packages/{id}/manifest` (`Accept: application/cbor`) and
|
||||
`GET /packages/{id}/manifest.json` (JCS projection).
|
||||
- Validation errors return RFC 7807 problem documents.
|
||||
- OpenAPI is generated automatically (FastAPI default) and served at
|
||||
`/openapi.json` + `/docs`.
|
||||
|
||||
## D2.2 - File Download And Range Reads
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0002-T002
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `GET /files/{file_id}` returns metadata.
|
||||
- `GET /files/{file_id}/download` streams bytes; supports `Range`
|
||||
request headers (single contiguous range; multi-range is out of
|
||||
scope for v1).
|
||||
- ETag is the file's primary content address; `If-None-Match` returns
|
||||
`304`.
|
||||
- Streaming uses `AsyncIterator[bytes]` end-to-end; no full-file
|
||||
buffering.
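
The single-range restriction above can be sketched as a small parser that a download handler might call before streaming. The name `parse_single_range` is hypothetical, and a handler would still need to map the `ValueError` cases to the appropriate HTTP status.

```python
def parse_single_range(header: str, size: int) -> tuple[int, int]:
    """Return inclusive (start, end) byte offsets for one contiguous range."""
    unit, sep, spec = header.strip().partition("=")
    if not sep or unit.strip().lower() != "bytes":
        raise ValueError("only the 'bytes' range unit is supported")
    if "," in spec:
        raise ValueError("multi-range requests are out of scope")
    first, dash, last = spec.strip().partition("-")
    if not dash:
        raise ValueError("malformed range")
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the object.
        if not last.isdigit() or int(last) == 0 or size == 0:
            raise ValueError("unsatisfiable suffix range")
        return max(size - int(last), 0), size - 1
    if not first.isdigit():
        raise ValueError("malformed range start")
    start = int(first)
    if start >= size:
        raise ValueError("range starts past the end of the object")
    if last == "":
        return start, size - 1  # open-ended form "bytes=N-"
    if not last.isdigit() or int(last) < start:
        raise ValueError("malformed range end")
    # An end past the object is clamped to the final byte.
    return start, min(int(last), size - 1)
```

The returned pair can be handed to the backend's range read so that only the requested bytes are ever pulled through the `AsyncIterator[bytes]` stream.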
|
||||
|
||||
## D2.3 - Upload Session Resource (Wire Shape Pinned)
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0002-T003
|
||||
status: todo
|
||||
priority: medium
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `POST /uploads` opens a session, returns an upload id and content
|
||||
upload URL.
|
||||
- `PATCH /uploads/{upload_id}` accepts a body with `Content-Range`;
|
||||
v1 implementation may accept the whole body in one call.
|
||||
- `POST /uploads/{upload_id}/complete` promotes the upload into a
|
||||
file under a given package id and relative path.
|
||||
- Implementation is allowed to be single-shot internally; the wire
|
||||
shape and resource lifecycle must be the final one (per
|
||||
PLATFORM-AMBITION A6).
|
||||
|
||||
## D2.4 - Event Stream Long-Poll
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0002-T004
|
||||
status: todo
|
||||
priority: medium
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `GET /events?since=<sequence>&limit=N` returns events in order with
|
||||
a long-poll wait when the tail is reached.
|
||||
- Events are CBOR by default; `Accept: application/json` returns the
|
||||
JCS projection of each event payload.
|
||||
- Test: a consumer that tails from sequence 1 never misses an event
|
||||
produced during the test.
|
||||
|
||||
## D2.5 - Auth Scaffolding (Shared-Secret Bearer)
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0002-T005
|
||||
status: todo
|
||||
priority: medium
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- Bearer token auth on all mutating endpoints; configurable per-tenant
|
||||
token list via env / config file.
|
||||
- Read endpoints are also gated by default; an explicit
|
||||
`ARTIFACTSTORE_ANON_READ=true` opt-in for dev.
|
||||
- Health endpoint remains anonymous.
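
A minimal sketch of the shared-secret check, independent of FastAPI. It assumes the configured tokens are already loaded into a list; `bearer_token_is_valid` is a hypothetical helper name.

```python
import hmac
from typing import Iterable, Optional


def bearer_token_is_valid(
    authorization: Optional[str], allowed_tokens: Iterable[str]
) -> bool:
    # Expect a header of the form "Bearer <token>".
    if not authorization:
        return False
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return False
    candidate = token.strip().encode("utf-8")
    valid = False
    for allowed in allowed_tokens:
        # compare_digest runs in constant time for equal-length inputs,
        # which avoids leaking how many leading characters matched.
        if hmac.compare_digest(candidate, allowed.encode("utf-8")):
            valid = True
    return valid
```

Checking every configured token rather than returning on the first match keeps the timing roughly independent of which token was presented.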
|
||||
|
||||
## D2.6 - Integration Tests Through The Full HTTP Surface
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0002-T006
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- httpx-based test suite exercises every endpoint.
|
||||
- A scripted test ingests a 50-file package, finalises it, downloads
|
||||
every file, verifies digests, and tails events.
|
||||
- A property-based test fuzzes the upload session lifecycle.
|
||||
|
||||
## Success criteria
|
||||
|
||||
- A producer can run the full ingest-and-retrieve flow against
|
||||
`make dev` with curl.
|
||||
- All blueprint endpoints in the v1 native surface are implemented.
|
||||
- The CLI gains `artifactstore push <dir>` and
|
||||
`artifactstore manifest <package_id>` subcommands as thin clients
|
||||
over the HTTP API.
|
||||
132
workplans/ARTIFACT-STORE-WP-0003-retention-lifecycle.md
Normal file
@@ -0,0 +1,132 @@
|
||||
---
|
||||
id: ARTIFACT-STORE-WP-0003
|
||||
type: workplan
|
||||
title: "Retention Lifecycle: Defaults, Extensions, Holds, Deletion Eligibility"
|
||||
repo: artifact-store
|
||||
domain: stack
|
||||
status: planned
|
||||
owner: codex
|
||||
topic_slug: stack
|
||||
planning_priority: high
|
||||
planning_order: 3
|
||||
created: "2026-05-15"
|
||||
updated: "2026-05-15"
|
||||
---
|
||||
|
||||
# ARTIFACT-STORE-WP-0003: Retention Lifecycle
|
||||
|
||||
## Purpose
|
||||
|
||||
Implement the retention engine. By the end of this workplan, every
|
||||
package has a computed `expires_at`, operators can extend retention or
|
||||
apply / release holds, and the system can mark expired packages as
|
||||
eligible for deletion — without actually deleting bytes (GC is
|
||||
WP-0006).
|
||||
|
||||
## Constraints
|
||||
|
||||
- ADR-0002 (every retention change is an event).
|
||||
- `docs/ARCHITECTURE-BLUEPRINT.md` retention sections.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- WP-0001 done (`retention_classes` seeded, `retention_state` view
|
||||
exists).
|
||||
- WP-0002 done (HTTP surface exists to attach the new endpoints to).
|
||||
|
||||
## D3.1 - Default Retention Application
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0003-T001
|
||||
status: todo
|
||||
priority: high
|
||||
state_hub_task_id: "2d6cbd83-c348-45ad-a223-7870a3412225"
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- On `POST /packages`, the requested `retention_class` is validated
|
||||
and the `v1.retention.default_applied` event is written with the
|
||||
computed `expires_at`.
|
||||
- Default durations per class are operator-configurable via a
|
||||
config file (TOML); the file path is documented in `OPERATOR.md`.
|
||||
- `permanent-record` packages have `expires_at = NULL` and
|
||||
`eligible_for_deletion = false`.
|
||||
|
||||
## D3.2 - Retention Extensions
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0003-T002
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `POST /packages/{id}/retention/extensions` accepts
|
||||
`{new_expires_at, reason}`. The new value must be strictly later
|
||||
than the current; reason is mandatory.
|
||||
- Each extension writes a `v1.retention.extended` event;
|
||||
`retention_state.current_expires_at` is updated in the same
|
||||
transaction.
|
||||
- A package's full extension history is recoverable from `events`.
|
||||
|
||||
## D3.3 - Holds (Apply And Release)
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0003-T003
|
||||
status: todo
|
||||
priority: high
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `POST /packages/{id}/retention/holds` records a hold with a reason
|
||||
and actor; emits `v1.retention.hold_applied`.
|
||||
- A package with at least one active hold is never
|
||||
`eligible_for_deletion` regardless of `expires_at`.
|
||||
- `POST /packages/{id}/retention/holds/{hold_id}/release` requires a
|
||||
reason; emits `v1.retention.hold_released`.
|
||||
- Test: hold applied → expiry passes → eligibility stays `false`;
|
||||
hold released → eligibility flips to `true`.
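
The eligibility rule that the hold test exercises can be written as a pure predicate. This is a sketch: the function signature is hypothetical, and treating "has passed" as strictly later than `expires_at` is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def eligible_for_deletion(
    expires_at: Optional[datetime], active_holds: int, now: datetime
) -> bool:
    if expires_at is None:
        return False  # permanent-record packages never expire
    if active_holds > 0:
        return False  # any active hold blocks eligibility regardless of expiry
    return now > expires_at  # assumption: "passed" means strictly after


expiry = datetime(2026, 6, 1, tzinfo=timezone.utc)
after_expiry = expiry + timedelta(days=1)

held = eligible_for_deletion(expiry, active_holds=1, now=after_expiry)
released = eligible_for_deletion(expiry, active_holds=0, now=after_expiry)
```

Keeping the rule free of I/O makes it easy for both the sweeper and the `retention_state` view writer to share one definition.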
|
||||
|
||||
## D3.4 - Deletion Eligibility Sweeper
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0003-T004
|
||||
status: todo
|
||||
priority: medium
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- A scheduled task (cron-style configurable interval; default 1 hour)
|
||||
scans packages whose `expires_at` has passed and no active hold
|
||||
exists, and emits `v1.retention.deletion_eligible` events.
|
||||
- The sweeper is idempotent: events are emitted at most once per
|
||||
package per eligibility transition.
|
||||
- The sweeper is invokable as a CLI subcommand for tests:
|
||||
`artifactstore retention sweep`.
|
||||
|
||||
## D3.5 - Audit Surface For Retention
|
||||
|
||||
```task
|
||||
id: ARTIFACT-STORE-WP-0003-T005
|
||||
status: todo
|
||||
priority: medium
|
||||
```
|
||||
|
||||
Acceptance:
|
||||
|
||||
- `GET /packages/{id}/retention/history` returns the ordered list of
|
||||
retention events for a package.
|
||||
- The default response is the JCS projection; CBOR is available via
|
||||
`Accept: application/cbor`.
|
||||
|
||||
## Success criteria
|
||||
|
||||
- A guide-board run can be ingested, given `release-evidence`, later
|
||||
extended once, held for a quarter, released, swept, and marked
|
||||
eligible — all visible through both `retention_state` and the
|
||||
event log.
|
||||
- No bytes are deleted by this workplan; that is WP-0006.
|
||||
131
workplans/ARTIFACT-STORE-WP-0004-s3-compatible-backend.md
Normal file
@@ -0,0 +1,131 @@
|
||||
---
|
||||
id: ARTIFACT-STORE-WP-0004
|
||||
type: workplan
|
||||
title: "S3-Compatible Backend (Ceph RGW Target)"
|
||||
repo: artifact-store
|
||||
domain: stack
|
||||
status: planned
|
||||
owner: codex
|
||||
topic_slug: stack
|
||||
planning_priority: medium
|
||||
planning_order: 4
|
||||
created: "2026-05-15"
|
||||
updated: "2026-05-15"
|
||||
---
|
||||
|
||||
# ARTIFACT-STORE-WP-0004: S3-Compatible Backend

## Purpose

Add a second concrete storage backend that speaks the S3 protocol.
Validated targets: Ceph RGW (primary self-hosted production target),
MinIO (dev / CI), AWS S3 (interop check). The backend must satisfy
the storage SPI without any leaks of S3-specific concepts into the
registry.

## Constraints

- `storage.spi.StorageBackend` Protocol from WP-0001 is the contract.
- No S3 vocabulary leaks into `registry.*` or `api.*`.
- `docs/ARCHITECTURE-BLUEPRINT.md` storage-backend section.

## Prerequisites

- WP-0001 done (SPI exists, local backend exists as a reference).

## D4.1 - Configuration Surface

```task
id: ARTIFACT-STORE-WP-0004-T001
status: todo
priority: high
state_hub_task_id: "7b980a55-2364-48c3-98ac-081629a8d2b7"
```

Acceptance:

- `s3` backend configuration accepts: `endpoint_url`, `region`,
`bucket`, `key_prefix`, `access_key_ref`, `secret_key_ref`,
`storage_class`, `sse` (optional), `multipart_threshold_bytes`,
`multipart_chunk_bytes`.
- Credential references resolve from env vars or mounted files; never
from request bodies.
- Documented Ceph RGW configuration example checked in under
`docs/OPERATOR.md`.

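A minimal sketch of what this configuration surface could look like, assuming a frozen dataclass and a hypothetical `env:<NAME>` / `file:<path>` syntax for credential references. The field names come from the acceptance list above; the class name, the default values, the reference syntax, and `resolve_credential_ref` are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class S3BackendConfig:
    endpoint_url: str
    region: str
    bucket: str
    key_prefix: str
    access_key_ref: str   # a reference, never the secret itself
    secret_key_ref: str
    storage_class: str = "STANDARD"                    # assumed default
    sse: Optional[str] = None                          # optional per the plan
    multipart_threshold_bytes: int = 64 * 1024 * 1024  # assumed default
    multipart_chunk_bytes: int = 16 * 1024 * 1024      # assumed default


def resolve_credential_ref(ref: str) -> str:
    """Resolve 'env:NAME' or 'file:/path' (hypothetical syntax) to a secret."""
    scheme, sep, target = ref.partition(":")
    if not sep or not target:
        raise ValueError(f"malformed credential reference: {ref!r}")
    if scheme == "env":
        try:
            return os.environ[target]
        except KeyError:
            raise ValueError(f"environment variable {target!r} is not set") from None
    if scheme == "file":
        # Mounted secret files commonly end with a newline; strip it.
        return Path(target).read_text(encoding="utf-8").strip()
    raise ValueError(f"unsupported credential reference scheme: {scheme!r}")
```

Because only references are stored in the config object, the secret values never need to appear in request bodies or in serialised configuration.
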
## D4.2 - S3 Backend Implementation

```task
id: ARTIFACT-STORE-WP-0004-T002
status: todo
priority: high
```

Acceptance:

- `storage.backends.s3.S3Backend` implements the SPI using `aioboto3`
or `aiobotocore` (decision recorded in the workplan; whichever is
better-maintained at implementation time).
- Object key layout
`<key_prefix>/<digest_algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>`.
- `put` uses multipart for objects above the configured threshold.
- `get` supports `Range`.
- `head`, `delete`, `health` implemented.
- `delete` is idempotent (delete-of-missing returns success).

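The object key layout above can be sketched as a pure function. This assumes digests are passed in the `<algorithm>:<hex>` form used elsewhere in these workplans (for example `blake3:<hex>`); the function name `object_key`, the lower-casing of the hex string, and the validation rules are illustrative choices rather than part of the plan.

```python
def object_key(key_prefix: str, digest: str) -> str:
    """Build '<key_prefix>/<algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>'."""
    algorithm, sep, hex_digest = digest.partition(":")
    if not sep or not algorithm:
        raise ValueError(f"expected '<algorithm>:<hex>', got {digest!r}")
    hex_digest = hex_digest.lower()  # assumption: keys use lowercase hex
    if len(hex_digest) < 4 or any(c not in "0123456789abcdef" for c in hex_digest):
        raise ValueError(f"digest is not a hex string of at least 4 chars: {digest!r}")
    parts = [algorithm, hex_digest[0:2], hex_digest[2:4], hex_digest]
    prefix = key_prefix.strip("/")
    if prefix:
        parts.insert(0, prefix)  # an empty prefix yields no leading slash
    return "/".join(parts)
```

The two fan-out levels keep any single key prefix from accumulating an unbounded number of objects, which matters for backends that list or shard by prefix.
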
## D4.3 - Backend Selection And Routing

```task
id: ARTIFACT-STORE-WP-0004-T003
status: todo
priority: medium
```

Acceptance:

- A registry can have multiple backends configured; package creation
records which backend a file is stored in.
- Per-package backend selection rule: configurable function of
`retention_class` + producer; default routes everything to a single
backend.
- `storage_locations.backend_id` reflects the actual storage.

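One way to read the selection rule above is as an ordered list of match rules with a single default. The sketch below is a hypothetical shape, not the planned API: `RoutingRule`, `BackendRouter`, and first-match-wins ordering are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RoutingRule:
    backend_id: str
    retention_class: Optional[str] = None  # None matches any retention class
    producer: Optional[str] = None         # None matches any producer

    def matches(self, retention_class: str, producer: str) -> bool:
        return (
            self.retention_class in (None, retention_class)
            and self.producer in (None, producer)
        )


@dataclass
class BackendRouter:
    default_backend_id: str
    rules: List[RoutingRule] = field(default_factory=list)

    def select(self, retention_class: str, producer: str) -> str:
        """Return the backend id to record for a new package (first match wins)."""
        for rule in self.rules:
            if rule.matches(retention_class, producer):
                return rule.backend_id
        # With no rules configured, everything routes to a single backend.
        return self.default_backend_id
```

For example, `BackendRouter("local", [RoutingRule("s3", retention_class="release-evidence")])` would send `release-evidence` packages to `s3` and everything else to `local`; the returned id is what would be written to `storage_locations.backend_id`.
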
## D4.4 - Test Strategy: MinIO In CI, RGW As Documented Manual Smoke

```task
id: ARTIFACT-STORE-WP-0004-T004
status: todo
priority: high
```

Acceptance:

- Integration tests run against MinIO via `testcontainers-python`
(or a docker-compose fixture if testcontainers fights the WSL2
environment).
- A documented manual procedure tests against a real Ceph RGW
endpoint; results recorded in `docs/OPERATOR.md`.
- No CI dependency on a live Ceph or AWS account.

## D4.5 - Verification Pass

```task
id: ARTIFACT-STORE-WP-0004-T005
status: todo
priority: medium
```

Acceptance:

- `artifactstore storage verify --backend s3` re-reads every object in
the backend, recomputes its primary digest, and emits
`v1.storage.location_verified` events.
- Mismatches are reported as `failed` locations and surfaced via the
health endpoint.

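The verification loop can be sketched independently of any backend. In this sketch `hashlib.sha256` stands in for the primary digest (the real algorithm may differ, for example BLAKE3 via a third-party package), and the shape of the location records, the `read_object` callable, and the `status` field on the event are assumptions for illustration.

```python
import hashlib
from typing import Any, Callable, Dict, Iterable, Iterator


def verify_locations(
    locations: Iterable[Dict[str, str]],
    read_object: Callable[[str], Iterable[bytes]],
    hash_factory: Callable[[], Any] = hashlib.sha256,  # stand-in digest
) -> Iterator[Dict[str, str]]:
    """Re-read each object, recompute its digest, and yield one event each."""
    for location in locations:
        _algorithm, _, expected_hex = location["digest"].partition(":")
        hasher = hash_factory()
        # Stream in chunks so large objects are never held fully in memory.
        for chunk in read_object(location["key"]):
            hasher.update(chunk)
        actual_hex = hasher.hexdigest()
        yield {
            "type": "v1.storage.location_verified",
            "key": location["key"],
            # Assumption: mismatches reuse the event type with a status field.
            "status": "verified" if actual_hex == expected_hex else "failed",
            "actual_digest": actual_hex,
        }
```

A health endpoint could then count events whose `status` is `failed` without needing to know which backend produced them.
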
## Success criteria

- The same package ingestion flow that worked against `local` in
WP-0001 works unchanged against `s3`.
- The smoke test is switching backends by config, with no code changes
in the registry or API layers.
146
workplans/ARTIFACT-STORE-WP-0005-guide-board-pilot.md
Normal file
@@ -0,0 +1,146 @@
---
id: ARTIFACT-STORE-WP-0005
type: workplan
title: "Guide-Board Pilot Ingestion"
repo: artifact-store
domain: stack
status: planned
owner: codex
topic_slug: stack
planning_priority: high
planning_order: 5
created: "2026-05-15"
updated: "2026-05-15"
---

# ARTIFACT-STORE-WP-0005: Guide-Board Pilot Ingestion

## Purpose

Wire the first real producer end-to-end. A guide-board CMIS
assessment run directory is registered as one artifact package, its
files are stored through a configured backend, retention is applied,
and Statehub records a stable package id and summary without storing
bytes itself. This is the pilot success criterion in INTENT.md.

## Constraints

- WP-0001 through WP-0003 must be done; WP-0004 is required only for
the production target (see Prerequisites).
- `docs/ARCHITECTURE-BLUEPRINT.md` guide-board manifest fields.
- No guide-board-specific code lives in `artifactstore.registry`;
pilot-specific glue lives in `artifactstore.pilots.guide_board` or
in a separate small package.

## Prerequisites

- WP-0001, WP-0002, WP-0003 done. WP-0004 only required for the
production target; local FS is sufficient for the pilot test.

## D5.1 - Pilot Metadata Schema Registration

```task
id: ARTIFACT-STORE-WP-0005-T001
status: todo
priority: high
state_hub_task_id: "eb822821-353c-4cd2-95bf-acb2f084b7ea"
```

Acceptance:

- A JSON Schema for `guide-board.run.v1` package metadata is checked
in under `schemas/guide-board.run.v1.json`.
- A bootstrap script registers it via `POST /metadata-schemas`
(an endpoint added in this workplan).
- Required keys: `run_id`, `target_profile_ref`,
`assessment_profile_ref`, `result_status`, `source_commits`
(object of slug → SHA), `report_paths`, `evidence_counts`,
`finding_counts`.

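As a rough illustration of the required-key list above, the schema skeleton can be expressed as a Python dict together with a stdlib-only check. Only the required key names and the "slug → SHA" object shape of `source_commits` come from the plan; the `$id` value, the omission of the other property types, and the helper `missing_required_keys` are assumptions, and a real implementation would more likely hand the checked-in JSON file to a full JSON Schema validator.

```python
from typing import Any, Dict, List

# Skeleton of the schema; property types other than source_commits are
# intentionally left out because the workplan does not specify them.
GUIDE_BOARD_RUN_V1: Dict[str, Any] = {
    "$id": "guide-board.run.v1",  # assumed identifier
    "type": "object",
    "required": [
        "run_id",
        "target_profile_ref",
        "assessment_profile_ref",
        "result_status",
        "source_commits",
        "report_paths",
        "evidence_counts",
        "finding_counts",
    ],
    "properties": {
        # Object mapping a repository slug to a commit SHA.
        "source_commits": {
            "type": "object",
            "additionalProperties": {"type": "string"},
        },
    },
}


def missing_required_keys(
    metadata: Dict[str, Any],
    schema: Dict[str, Any] = GUIDE_BOARD_RUN_V1,
) -> List[str]:
    """Return required keys absent from metadata, in schema order."""
    return [key for key in schema["required"] if key not in metadata]
```

An ingest helper could call `missing_required_keys` before upload to fail fast with a readable message, leaving full validation to the server.
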
## D5.2 - Pilot Ingest Helper (CLI + Library Function)

```task
id: ARTIFACT-STORE-WP-0005-T002
status: todo
priority: high
```

Acceptance:

- `artifactstore guide-board ingest <run-dir>` walks a guide-board
run directory, builds the package metadata from `run.json` and
`retention-summary.json`, uploads every file declared in the
assessment package manifest (and the manifest itself), and
finalises the package.
- Library entry point `pilots.guide_board.ingest_run(path, ...)`
exposes the same behaviour for embedding.
- Output: the package id (UUID) and the package manifest digest
(`blake3:<hex>`).

## D5.3 - Fixture-Based Test

```task
id: ARTIFACT-STORE-WP-0005-T003
status: todo
priority: high
```

Acceptance:

- A trimmed-down guide-board run fixture (under 1 MB total) lives in
`tests/fixtures/guide-board/` with realistic file shapes:
`run.json`, `retention-summary.json`,
`reports/assessment-package.json`, `reports/report.md`, one
scorecard, one log-review summary, and a couple of raw artifact
files.
- The test runs the CLI / library helper end-to-end against an
in-memory SQLite + tempdir local backend, then verifies:
  1. package id returned,
  2. manifest digest stable across two runs of the same fixture,
  3. every file downloadable with correct bytes,
  4. retention class applied as configured.

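The "manifest digest stable across two runs" check depends on the manifest being built and serialised deterministically. The sketch below shows the idea with stand-ins only: it walks the whole directory instead of reading the assessment package manifest, uses sorted compact JSON where the plan calls for canonical CBOR, and uses SHA-256 where the plan calls for BLAKE3. The function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Dict


def build_manifest(run_dir: Path) -> Dict[str, Any]:
    """List every file under run_dir with size and digest, in sorted order."""
    paths = sorted(
        (p for p in run_dir.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(run_dir).as_posix(),
    )
    files = [
        {
            "path": p.relative_to(run_dir).as_posix(),
            "size": p.stat().st_size,
            "digest": "sha256:" + hashlib.sha256(p.read_bytes()).hexdigest(),
        }
        for p in paths
    ]
    return {"files": files}


def manifest_digest(manifest: Dict[str, Any]) -> str:
    """Hash a deterministic encoding: sorted keys, no insignificant spaces."""
    encoded = json.dumps(
        manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()
```

Note that the manifest deliberately excludes timestamps and absolute paths; including either would make the digest differ between two ingests of the same fixture.
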
## D5.4 - Statehub Linkage Recipe

```task
id: ARTIFACT-STORE-WP-0005-T004
status: todo
priority: medium
```

Acceptance:

- `docs/OPERATOR.md` (or a new `docs/pilots/guide-board.md`)
documents the exact `POST /progress/` or `record_decision` call
shape Statehub clients should use to link a guide-board run to
its artifact-store package id and manifest digest.
- A reference Statehub client snippet is checked in, parameterised
by env vars.

## D5.5 - Operator Smoke Procedure For The Real Producer

```task
id: ARTIFACT-STORE-WP-0005-T005
status: todo
priority: medium
```

Acceptance:

- A documented procedure ingests a real (non-fixture) guide-board run
produced from `~/guide-board` / `~/open-cmis-tck`.
- Procedure includes: starting `make dev`, registering the schema,
running the ingest CLI, verifying the manifest, and
recording the package id in Statehub.
- Procedure runs end-to-end on a developer workstation under 5
minutes.

## Success criteria

- A real guide-board CMIS run is ingested with one CLI invocation.
- The package manifest lists every stored file with both digests and
the canonical CBOR digest of the manifest itself.
- Statehub records the package id and summary; no artifact bytes
live in Statehub.
- Retention can be extended on the package without touching bytes.
- The pilot path validates the storage adapter swap: the same
command works against `local` and against `s3` (if WP-0004 done).