Operator Guide

Status: v0.1 (WP-0003 baseline)
Updated: 2026-05-16

This guide is the user manual for running artifact-store v0.1 — the library, CLI, HTTP ingestion API, manifest surface, and retention lifecycle.

For architectural background see ARCHITECTURE-BLUEPRINT.md, the ADRs under adr/, and the ROADMAP.

Prerequisites

  • Python 3.12 or 3.13
  • uv on the PATH (one static binary)
  • A POSIX-ish shell (Linux, macOS, WSL2)

The pinned tech stack is documented in ADR-0005.

Quick start

uv sync --all-extras       # install deps; produces .venv/ and uv.lock
cp .env.example .env       # optional — the defaults work out of the box

make migrate-fresh         # creates ./var/artifactstore.db and applies migrations
make dev                   # uvicorn on 127.0.0.1:8000

In another terminal:

curl -s http://127.0.0.1:8000/health | python3 -m json.tool
artifactstore health

Both should report status: ok.

Environment variables

All settings are prefixed with ARTIFACTSTORE_ and read by pydantic-settings from the environment and (optionally) ./.env.

| Variable | Default | Purpose |
| --- | --- | --- |
| ARTIFACTSTORE_DATABASE_URL | sqlite+aiosqlite:///./var/artifactstore.db | SQLAlchemy async URL. Alembic translates +aiosqlite and +asyncpg to their sync drivers at migrate-time. |
| ARTIFACTSTORE_STORAGE_LOCAL_ROOT | ./var/storage | Root directory for the local filesystem storage backend. Created on first use. |
| ARTIFACTSTORE_LOG_LEVEL | INFO | Python logging level (DEBUG / INFO / WARNING / ERROR). |
| ARTIFACTSTORE_AUTH_TOKENS | empty | Comma- or newline-separated shared-secret bearer tokens for the HTTP API. |
| ARTIFACTSTORE_ANON_READ | false | Set true only for local demos where read endpoints may be anonymous. |
| ARTIFACTSTORE_API_URL | http://127.0.0.1:8000 | Default API base URL used by HTTP-backed CLI commands. |
| ARTIFACTSTORE_API_TOKEN | empty | Default bearer token used by HTTP-backed CLI commands. |
| ARTIFACTSTORE_RETENTION_CONFIG_PATH | empty | Optional TOML file overriding retention-class default durations. |
| ARTIFACTSTORE_RETENTION_SWEEP_INTERVAL_SECONDS | 3600 | Default interval for external schedulers that invoke the retention sweeper. |

See .env.example for the canonical template.
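ARTIFACTSTORE_AUTH_TOKENS accepts several tokens in one value, separated by commas or newlines. A minimal sketch of that splitting follows, assuming surrounding whitespace is ignored and empty entries are dropped. The function name parse_auth_tokens is hypothetical, and artifact-store's own parser may behave differently.

```python
import re


def parse_auth_tokens(raw: str) -> list[str]:
    """Split a comma- or newline-separated token string into clean tokens.

    Hypothetical helper for illustration; not part of artifact-store's API.
    """
    # Treat any run of commas and newline characters as one separator.
    parts = re.split(r"[,\r\n]+", raw)
    # Strip whitespace and drop entries that end up empty.
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    print(parse_auth_tokens("alpha, beta\ngamma,\n"))
```

An empty value yields an empty list, which matches the documented default of no configured tokens.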

Retention policy TOML

By default, retention durations come from the seeded retention_classes rows. Operators can override the default duration per class with ARTIFACTSTORE_RETENTION_CONFIG_PATH:

[retention_classes.transient]
default_duration_seconds = 86400

[retention_classes."raw-evidence"]
default_duration_seconds = 7776000

[retention_classes."summary-evidence"]
default_duration_seconds = 31536000

[retention_classes."release-evidence"]
default_duration_seconds = 220752000

[retention_classes."permanent-record"]
# Omit default_duration_seconds for no expiry.

Run artifactstore retention sweep from cron or another scheduler to mark expired, unheld packages as eligible for deletion. The sweep only records eligibility; it never deletes bytes.

Database backends

SQLite (development default)

Zero-config. The database file lives at ./var/artifactstore.db by default and is gitignored.

make migrate-fresh    # drop and re-create
make migrate          # idempotent: apply pending migrations

PostgreSQL 16+ (shared deployments)

Install the optional postgres extra, which pulls in psycopg[binary] for Alembic's sync driver. The quick-start uv sync --all-extras already includes it; to install this extra on its own:

uv sync --extra postgres

Set the URL with the async driver; Alembic switches to +psycopg for migrations automatically:

export ARTIFACTSTORE_DATABASE_URL=postgresql+asyncpg://artifactstore:secret@db.internal:5432/artifactstore
make migrate

The schema is identical to the SQLite schema; per ADR-0002, the events table drives all materialised views.
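The driver switch that happens at migrate-time can be pictured as a rewrite of the URL scheme. The sketch below is not the project's migration code: to_sync_url and the SYNC_DRIVERS table are hypothetical, and mapping +aiosqlite to the plain sqlite scheme is an assumption (the guide only names +psycopg explicitly).

```python
# Async driver scheme -> sync driver scheme used for migrations (assumed mapping).
SYNC_DRIVERS = {
    "sqlite+aiosqlite": "sqlite",
    "postgresql+asyncpg": "postgresql+psycopg",
}


def to_sync_url(url: str) -> str:
    """Swap an async SQLAlchemy driver for its sync counterpart, if known."""
    scheme, sep, rest = url.partition("://")
    # Unknown schemes pass through unchanged.
    return SYNC_DRIVERS.get(scheme, scheme) + sep + rest


if __name__ == "__main__":
    print(to_sync_url("sqlite+aiosqlite:///./var/artifactstore.db"))
    print(to_sync_url("postgresql+asyncpg://artifactstore:secret@db.internal:5432/artifactstore"))
```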

Storage backends

The storage adapter SPI is documented in ADR-0001 and ADR-0004.

Local filesystem (default)

Objects are addressed by content (blake3:<hex>) and laid out as

<root>/<algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>

with atomic writes (tmpfile + fsync + rename). The S3-compatible backend lands in WP-0004.
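The layout and the write sequence above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the backend's actual code: object_path and atomic_write are hypothetical names, and the digest is taken as given rather than computed, since BLAKE3 is not in the Python standard library.

```python
import os
import tempfile
from pathlib import Path


def object_path(root: Path, address: str) -> Path:
    """Map 'blake3:<hex>' to <root>/<algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>."""
    algorithm, _, digest = address.partition(":")
    if not algorithm or len(digest) < 4:
        raise ValueError(f"malformed content address: {address!r}")
    return root / algorithm / digest[0:2] / digest[2:4] / digest


def atomic_write(target: Path, data: bytes) -> None:
    """Write via temp file + fsync + rename so readers never see partial data."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the bytes to disk before the rename
        os.replace(tmp_name, target)  # atomic on POSIX within one filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = object_path(Path(root), "blake3:abcdef0123")
        atomic_write(path, b"hello world")
        print(path.relative_to(root))
```

Because the path is derived from the content digest, writing the same bytes twice targets the same file, so a repeated write replaces the object with identical content.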

CLI reference

artifactstore --help lists every subcommand. The v0.1 set:

| Command | Purpose |
| --- | --- |
| artifactstore version | Print the package version and exit. |
| artifactstore migrate | Run alembic upgrade head against the configured database. |
| artifactstore replay | Truncate every materialised view and rebuild it from the event log; prints the highest sequence applied. |
| artifactstore health | JSON liveness summary (db, backend, status). Same payload as the HTTP /health endpoint. |
| artifactstore push <dir> | Push a directory through the HTTP API and finalize the package. |
| artifactstore manifest <package_id> | Fetch the JSON manifest projection through the HTTP API. |
| artifactstore retention sweep | Run one deletion-eligibility sweep against the configured DB. |

The CLI is a thin client over artifactstore.registry.Registry (see ADR-0005).

HTTP reference (v0.1)

| Route family | Purpose |
| --- | --- |
| GET /, GET /health | Anonymous service banner and liveness summary. |
| GET /docs, GET /openapi.json | FastAPI's interactive OpenAPI docs and generated schema. |
| /packages... | Create, list, inspect, upload files to, finalize, and retrieve manifests for packages. |
| /files... | File metadata and byte downloads, including single-range reads. |
| /uploads... | Upload-session wire shape for whole-body v1 uploads. |
| /packages/{id}/retention... | Extend retention, apply/release holds, and read retention history. |
| GET /events | Long-poll event feed, CBOR by default or JSON with Accept: application/json. |

Apart from the anonymous GET / and GET /health, every route requires a bearer token. Setting ARTIFACTSTORE_ANON_READ=true lifts that requirement for read endpoints only.
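A client-side sketch of an authenticated call, using only the standard library. It assumes the token travels in the conventional Authorization: Bearer header and that GET /packages is the listing route implied by the table above; neither detail is spelled out in this guide, and authed_request is a hypothetical helper.

```python
import urllib.request


def authed_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a GET request carrying a bearer token (assumed header scheme)."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # The token is a placeholder; use one of the values in ARTIFACTSTORE_AUTH_TOKENS.
    req = authed_request("http://127.0.0.1:8000", "/packages", "example-token")
    print(req.full_url, req.get_header("Authorization"))
```

Passing the returned request to urllib.request.urlopen sends it; the example stops short of that so it can run without a server.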

End-to-end smoke test (Python library)

This script exercises every layer (identity, manifest, events, dataplane, storage, registry, replay) end-to-end against the default SQLite + local FS configuration.

import asyncio
from collections.abc import AsyncIterator
from artifactstore.app import build_registry
from artifactstore.manifest import decode as manifest_decode


async def chunks(data: bytes) -> AsyncIterator[bytes]:
    yield data


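async def main() -> None:
    # Build the registry from the current configuration.
    registry = build_registry()
    try:
        # Open a new package; retention_class must name a seeded retention class.
        pkg = await registry.create_package(
            name="smoke-test",
            producer="ops",
            subject="example.org",
            retention_class="raw-evidence",
            actor="ops",
            metadata={"smoke": True},
        )
        # Stream one small file into the package.
        await registry.ingest_file(
            pkg, relative_path="hello.txt", media_type="text/plain",
            stream=chunks(b"hello world"), actor="ops",
        )
        # Finalize the package; the result is printed below as the manifest digest.
        manifest_addr = await registry.finalize_package(pkg, actor="ops")
        # Fetch the manifest as CBOR and decode it to check the file list.
        cbor = await registry.get_manifest_bytes(pkg, format="cbor")
        manifest = manifest_decode(cbor)
        print("package:", pkg)
        print("manifest digest:", manifest_addr)
        print("files in manifest:", [f.relative_path for f in manifest.files])
    finally:
        # Dispose of the registry even if a step above fails.
        await registry.dispose()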


asyncio.run(main())

Prerequisite: run make migrate-fresh first so that the schema and the retention class seeds exist.

Replay / disaster recovery

Every state-changing operation writes one row to events and updates the materialised views in the same transaction (ADR-0002). If the materialised views are lost or corrupted, rebuild them from the event log:

artifactstore replay

The command drops every row from artifact_packages, artifact_files, storage_locations, and retention_state, then replays the events in sequence order through the canonical view writer. The result is byte-identical to the materialised state before the replay (verified by the WP-0001-T013 integration test).
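The truncate-then-replay idea can be shown with a toy in-memory model. This is a conceptual sketch, not artifact-store's implementation: the Event and Views classes, the event kinds, and the payload fields are all hypothetical. The point it illustrates is that live writes and replay go through the same view writer, so replay reproduces the same state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    sequence: int
    kind: str  # hypothetical kinds: "package_created", "file_ingested"
    payload: dict


@dataclass
class Views:
    packages: dict[str, dict] = field(default_factory=dict)
    files: dict[tuple[str, str], dict] = field(default_factory=dict)

    def apply(self, event: Event) -> None:
        """Single view writer shared by live writes and replay."""
        p = event.payload
        if event.kind == "package_created":
            self.packages[p["package_id"]] = {"name": p["name"]}
        elif event.kind == "file_ingested":
            self.files[(p["package_id"], p["relative_path"])] = {"digest": p["digest"]}


def replay(events: list[Event], views: Views) -> int:
    """Truncate the views, re-apply events in sequence order, return the highest sequence."""
    views.packages.clear()
    views.files.clear()
    highest = 0
    for event in sorted(events, key=lambda e: e.sequence):
        views.apply(event)
        highest = event.sequence
    return highest


if __name__ == "__main__":
    log = [
        Event(2, "file_ingested", {"package_id": "p1", "relative_path": "hello.txt", "digest": "blake3:ab"}),
        Event(1, "package_created", {"package_id": "p1", "name": "smoke-test"}),
    ]
    views = Views()
    print(replay(log, views), views.packages, views.files)
```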

Failure modes operators should expect

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| /health returns status: degraded, db.healthy: false | DB unreachable or migrations not applied | Check ARTIFACTSTORE_DATABASE_URL; run make migrate. |
| /health returns status: degraded, backend.healthy: false | Storage root missing or unreadable | Recreate ARTIFACTSTORE_STORAGE_LOCAL_ROOT or fix permissions. |
| ObjectNotFoundError from get_file | Underlying bytes deleted but the file row remains | Investigate; v1 does not garbage-collect orphaned rows (WP-0006). |
| DuplicateRelativePathError from ingest_file | Same package + path ingested twice | Use a distinct relative_path per file within one package. |
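The two /health rows above lend themselves to a small triage helper for monitoring scripts. The sketch assumes the payload nests the flags as {"db": {"healthy": ...}, "backend": {"healthy": ...}}, which is inferred from the db.healthy and backend.healthy notation rather than confirmed; triage is a hypothetical function.

```python
def triage(health: dict) -> list[str]:
    """Turn a /health payload into the fixes listed in the table above."""
    hints: list[str] = []
    # A missing section is treated as healthy so partial payloads do not raise.
    if not health.get("db", {}).get("healthy", True):
        hints.append("Check ARTIFACTSTORE_DATABASE_URL; run make migrate.")
    if not health.get("backend", {}).get("healthy", True):
        hints.append("Recreate ARTIFACTSTORE_STORAGE_LOCAL_ROOT or fix permissions.")
    return hints


if __name__ == "__main__":
    sample = {"status": "degraded", "db": {"healthy": False}, "backend": {"healthy": True}}
    for hint in triage(sample):
        print(hint)
```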

References

Architecture Decision Records