Implement HTTP ingestion and retention lifecycle

2026-05-16 23:10:21 +02:00
parent 2173f702c1
commit c33baa3635
15 changed files with 2478 additions and 69 deletions


@@ -1,12 +1,10 @@
# Operator Guide
Status: v0.1 (WP-0001 baseline)
Status: v0.1 (WP-0003 baseline)
Updated: 2026-05-16
This guide is the user manual for running `artifact-store` v0.1 — the
library + CLI + minimal HTTP app that landed in WP-0001. Ingest, finalize,
and retrieve workflows go through the Python library today; the HTTP
upload API arrives in WP-0002.
library, CLI, HTTP ingestion API, manifest surface, and retention lifecycle.
For architectural background see
[ARCHITECTURE-BLUEPRINT.md](ARCHITECTURE-BLUEPRINT.md), the ADRs under
@@ -50,9 +48,42 @@ All settings are prefixed with ``ARTIFACTSTORE_`` and read by
| `ARTIFACTSTORE_DATABASE_URL` | `sqlite+aiosqlite:///./var/artifactstore.db` | SQLAlchemy async URL. Alembic translates `+aiosqlite` and `+asyncpg` to their sync drivers at migrate-time. |
| `ARTIFACTSTORE_STORAGE_LOCAL_ROOT`| `./var/storage` | Root directory for the local filesystem storage backend. Created on first use. |
| `ARTIFACTSTORE_LOG_LEVEL` | `INFO` | Python logging level (`DEBUG` / `INFO` / `WARNING` / `ERROR`). |
| `ARTIFACTSTORE_AUTH_TOKENS` | empty | Comma- or newline-separated shared-secret bearer tokens for the HTTP API. |
| `ARTIFACTSTORE_ANON_READ` | `false` | Set `true` only for local demos where read endpoints may be anonymous. |
| `ARTIFACTSTORE_API_URL` | `http://127.0.0.1:8000` | Default API base URL used by HTTP-backed CLI commands. |
| `ARTIFACTSTORE_API_TOKEN` | empty | Default bearer token used by HTTP-backed CLI commands. |
| `ARTIFACTSTORE_RETENTION_CONFIG_PATH` | empty | Optional TOML file overriding retention-class default durations. |
| `ARTIFACTSTORE_RETENTION_SWEEP_INTERVAL_SECONDS` | `3600` | Default interval for external schedulers that invoke the retention sweeper. |
See [`.env.example`](../.env.example) for the canonical template.
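The "comma- or newline-separated" format of `ARTIFACTSTORE_AUTH_TOKENS` can be illustrated with a small sketch. The helper name `split_tokens` is hypothetical and this is not the project's actual settings parser; it only shows one plausible way to read such a value.

```python
import re


def split_tokens(raw: str) -> list[str]:
    """Split a comma- or newline-separated string and drop blank entries.

    Hypothetical helper for illustration; not part of artifact-store.
    """
    parts = re.split(r"[,\n]", raw)
    return [part.strip() for part in parts if part.strip()]


# Example value mixing both separators, with a trailing newline.
tokens = split_tokens("alpha-token, beta-token\ngamma-token\n")
print(tokens)
```

Stripping each entry means stray spaces or Windows line endings around a token do not become part of the secret.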
### Retention policy TOML
By default, retention durations come from the seeded `retention_classes`
rows. Operators can override the default duration per class with
`ARTIFACTSTORE_RETENTION_CONFIG_PATH`:
```toml
[retention_classes.transient]
default_duration_seconds = 86400
[retention_classes."raw-evidence"]
default_duration_seconds = 7776000
[retention_classes."summary-evidence"]
default_duration_seconds = 31536000
[retention_classes."release-evidence"]
default_duration_seconds = 220752000
[retention_classes."permanent-record"]
# Omit default_duration_seconds for no expiry.
```
Run `artifactstore retention sweep` from cron or another scheduler to mark
expired, unheld packages eligible for deletion. The sweep only records
eligibility; it never deletes bytes.
## Database backends
### SQLite (development default)
@@ -113,21 +144,27 @@ lands in WP-0004.
| `artifactstore migrate` | Run `alembic upgrade head` against the configured database. |
| `artifactstore replay` | Truncate every materialised view and rebuild it from the event log; prints the highest sequence applied. |
| `artifactstore health` | JSON liveness summary (db, backend, status). Same payload as the HTTP `/health` endpoint. |
| `artifactstore push <dir>` | Push a directory through the HTTP API and finalize the package. |
| `artifactstore manifest <package_id>` | Fetch the JSON manifest projection through the HTTP API. |
| `artifactstore retention sweep` | Run one deletion-eligibility sweep against the configured DB. |
The CLI is a thin client over `artifactstore.registry.Registry`
(see [ADR-0005](adr/0005-v1-tech-stack.md)).
## HTTP reference (v0.1)
| Route | Purpose |
|----------------|---------|
| `GET /` | Service banner (scaffold marker). |
| `GET /health` | Liveness summary. Returns ``{status, db, backend, version}``. `status` is `ok` only when both the DB probe (`SELECT 1`) and the backend `health()` succeed. |
| `GET /docs` | FastAPI's interactive OpenAPI docs (`/openapi.json` underneath). |
| Route family | Purpose |
|-----------------------|---------|
| `GET /`, `GET /health` | Anonymous service banner and liveness summary. |
| `GET /docs`, `GET /openapi.json` | FastAPI's interactive OpenAPI docs and generated schema. |
| `/packages...` | Create, list, inspect, upload files to, finalize, and retrieve manifests for packages. |
| `/files...` | File metadata and byte downloads, including single-range reads. |
| `/uploads...` | Upload-session wire shape for whole-body v1 uploads. |
| `/packages/{id}/retention...` | Extend retention, apply/release holds, and read retention history. |
| `GET /events` | Long-poll event feed, CBOR by default or JSON with `Accept: application/json`. |
Package CRUD, file upload/download, manifest retrieval, retention controls,
and the event stream all land in WP-0002 through WP-0003. Today they are reachable
via the Python library.
All non-health routes require a bearer token unless
`ARTIFACTSTORE_ANON_READ=true` is set for read endpoints.
## End-to-end smoke test (Python library)