14 Commits

Author SHA1 Message Date
f8097cb683 WP-0001-T002: registry data model, Alembic, initial migration with retention seed
Schema (src/artifactstore/db/schema.py):
- events table (ADR-0002 source of truth): sequence BIGSERIAL PK, created_at,
  event_type, subject_kind, subject_id, actor, payload (CBOR bytes),
  payload_digest. Indexes on (subject_kind, subject_id) and
  (event_type, sequence).
- artifact_packages, artifact_files, storage_locations, retention_state
  (materialised views over events).
- retention_classes (seed table) and metadata_schemas (config table).
- ADR-0001 columns present: digest_algorithm, digest_primary, digest_sha256,
  content_address. Blueprint tiering columns present: retrieval_tier
  (default 'hot'), restore_status.
- Types portable: SQLAlchemy 2.0 Core with JSON().with_variant(JSONB, 'postgresql'),
  Uuid, LargeBinary, DateTime(timezone=True), Boolean false() default.
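
As an illustration of the events table described above, here is a minimal sketch using only the standard-library sqlite3 module. The column and index layout follows the bullet list; the SQL types are SQLite stand-ins (BIGSERIAL becomes INTEGER PRIMARY KEY AUTOINCREMENT), and the index names, the append_event helper, the event type string, and the use of SHA-256 for payload_digest are assumptions made for the example, not the contents of schema.py.

```python
import hashlib
import sqlite3

# Columns mirror the commit message; types are SQLite stand-ins, and the
# index names are hypothetical.
DDL = """
CREATE TABLE events (
    sequence       INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    event_type     TEXT NOT NULL,
    subject_kind   TEXT NOT NULL,
    subject_id     TEXT NOT NULL,
    actor          TEXT NOT NULL,
    payload        BLOB NOT NULL,
    payload_digest TEXT NOT NULL
);
CREATE INDEX ix_events_subject ON events (subject_kind, subject_id);
CREATE INDEX ix_events_type_seq ON events (event_type, sequence);
"""


def append_event(conn, event_type, subject_kind, subject_id, actor, payload):
    """Append one event and return its sequence number (hypothetical helper)."""
    # Assumption: SHA-256 stands in here for the project's primary digest.
    digest = "sha256:" + hashlib.sha256(payload).hexdigest()
    cur = conn.execute(
        "INSERT INTO events (event_type, subject_kind, subject_id, actor,"
        " payload, payload_digest) VALUES (?, ?, ?, ?, ?, ?)",
        (event_type, subject_kind, subject_id, actor, payload, digest),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
# b"\xa0" is an empty CBOR map; the event type names are made up.
first = append_event(conn, "package.created", "package", "pkg-1", "alice", b"\xa0")
second = append_event(conn, "package.sealed", "package", "pkg-1", "alice", b"\xa0")
rows = conn.execute(
    "SELECT sequence, event_type FROM events"
    " WHERE subject_kind = ? AND subject_id = ? ORDER BY sequence",
    ("package", "pkg-1"),
).fetchall()
```

Rows are only ever inserted, so the other tables can be rebuilt by replaying events in sequence order.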

Seed (src/artifactstore/db/seed.py): five v1 retention classes (transient,
raw-evidence, summary-evidence, release-evidence, permanent-record) with
default durations in seconds; permanent-record has no expiry.

Alembic:
- alembic.ini with sync sqlite URL default; path_separator=os to silence the
  1.13 deprecation warning.
- migrations/env.py: translates async URLs (+aiosqlite, +asyncpg) to sync
  counterparts at migrate-time so a single ARTIFACTSTORE_DATABASE_URL works
  for both runtime (async) and Alembic (sync).
- migrations/script.py.mako template.
- migrations/versions/20260516_0001_initial.py: metadata.create_all + bulk
  insert of retention class seeds.
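
The async-to-sync URL translation in migrations/env.py could look roughly like the sketch below. The function name to_sync_url, the mapping table, and the choice of psycopg as the sync PostgreSQL driver are assumptions for illustration; the commit only states that +aiosqlite and +asyncpg URLs are mapped to sync counterparts.

```python
# Hypothetical mapping from async driver schemes to sync ones. The sync
# targets are assumptions, not taken from the repository.
_SYNC_SCHEMES = {
    "sqlite+aiosqlite": "sqlite",
    "postgresql+asyncpg": "postgresql+psycopg",
}


def to_sync_url(url: str) -> str:
    """Return a sync-driver URL for Alembic; sync URLs pass through unchanged."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError(f"not a database URL: {url!r}")
    return _SYNC_SCHEMES.get(scheme, scheme) + sep + rest
```

With such a helper, env.py can read ARTIFACTSTORE_DATABASE_URL once and hand the translated value to the Alembic engine, while the runtime keeps using the async form.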

Make:
- make migrate: alembic upgrade head (ensures var/ exists).
- make migrate-fresh: drop local SQLite + re-run.

Deps: psycopg[binary] added as optional `postgres` extra (PostgreSQL prod
path; SQLite default for dev needs no extra).

Tests:
- tests/unit/test_db_schema.py: every expected table present; ADR-0001 and
  tiering columns present; seed has the five v1 classes; permanent-record
  has no default_duration; create_all + FK insert + Boolean default
  round-trip on in-memory SQLite.
- tests/integration/test_migrations.py: alembic upgrade head against a
  tempfile SQLite produces all tables (+ alembic_version) and the seed rows.

Gates: ruff clean, mypy --strict clean on 32 files, 38 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 01:50:38 +02:00
d14ee517d9 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-16:
  - update .custodian-brief.md for artifact-store
2026-05-16 01:49:55 +02:00
9cbb9847ed WP-0001-T010: manifest model, canonical CBOR codec, JCS projection
Adds the manifest layer per ADR-0003. The canonical wire format is CBOR with
deterministic encoding (cbor2 canonical=True: definite-length, shortest-form
integers, sorted map keys); JCS (RFC 8785) is the JSON projection.

src/artifactstore/manifest/:
- model.py: frozen dataclasses for Manifest (manifest_version=1, package,
  files, storage_receipts, retention_summary, provenance) with restricted
  types (str/int/bool/None/list/dict) so CBOR and JCS round-trip losslessly.
- codec.py: encode (Manifest -> canonical CBOR bytes) and decode (CBOR bytes
  -> Manifest) via cbor2.
- projection.py: jcs_projection (Manifest -> RFC 8785 canonical JSON) plus
  cbor_from_jcs for cross-format round-trip verification.
- digest.py: manifest_digest returns the BLAKE3 content address of the
  manifest's canonical CBOR bytes (ADR-0001).
- __init__.py: re-exports the public surface.
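
To show why the restricted type set matters, the sketch below approximates a canonical JSON projection with the standard library alone. It is not the jcs package and not a full RFC 8785 implementation: it only agrees with JCS for ASCII keys and integers that fit exactly in an IEEE double, which is an assumption of this example. The function names are hypothetical.

```python
import json


def _check(value):
    """Reject anything outside str/int/bool/None/list/dict (with str keys)."""
    if value is None or isinstance(value, (str, int, bool)):
        return
    if isinstance(value, list):
        for item in value:
            _check(item)
        return
    if isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError(f"non-string key: {key!r}")
            _check(item)
        return
    raise TypeError(f"unsupported type: {type(value).__name__}")


def canonical_json(value) -> bytes:
    """Serialise with sorted keys and no insignificant whitespace."""
    _check(value)
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```

Because floats and other types are rejected up front, two manifests with equal content produce identical bytes regardless of key insertion order, which is the property the digest relies on.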

tests/unit/test_manifest.py:
- decode(encode(m)) == m round-trip (hypothesis-parameterised).
- JCS↔CBOR round-trip: encode(decode(cbor_from_jcs(jcs_projection(m)))) == encode(m).
- Byte stability of the canonical CBOR encoder across calls.
- manifest_digest matches independent BLAKE3 over encode(m).
- Decode rejects non-map CBOR.
- JCS projection sorts keys lexicographically.

Deps: jcs added to project requirements; mypy override for the jcs package
(no stubs published yet).

Gates: ruff clean, mypy --strict clean on 26 files, 26 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 01:39:42 +02:00
c1bfb8b486 WP-0001-T009: digest abstraction and content address (ADR-0001)
src/artifactstore/identity/__init__.py:
- Digest: frozen, hashable dataclass (algorithm + lowercase hex), validated.
- ContentAddress: canonical `<algorithm>:<hex>` string form with validating
  parser (to_digest) and emitter (str / from_digest).
- DigestPair: dual-digest result (primary + sha256) from a single hashing pass.
- Algorithm registry: register_algorithm / get_algorithm / list_algorithms
  with name validation `[a-z][a-z0-9_-]*`.
- digest_bytes (sync) and digest_stream (async) — single-pass dual hashing.
- BLAKE3 registered as PRIMARY_ALGORITHM, SHA-256 as INTEROP_ALGORITHM at
  module import.
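
The validation rules and single-pass dual hashing described above can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the module's code: BLAKE2b stands in for BLAKE3 (which is not in hashlib), and the names parse_content_address, format_content_address, and digest_chunks are hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass

_NAME = re.compile(r"[a-z][a-z0-9_-]*")   # algorithm name rule from the commit
_HEX = re.compile(r"(?:[0-9a-f]{2})+")    # lowercase, non-empty, even length


@dataclass(frozen=True)
class Digest:
    algorithm: str
    hex: str

    def __post_init__(self):
        if not _NAME.fullmatch(self.algorithm):
            raise ValueError(f"invalid algorithm name: {self.algorithm!r}")
        if not _HEX.fullmatch(self.hex):
            raise ValueError(f"invalid hex digest: {self.hex!r}")


def parse_content_address(text: str) -> Digest:
    """Parse '<algorithm>:<hex>' into a validated Digest."""
    algorithm, sep, hex_part = text.partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    return Digest(algorithm, hex_part)


def format_content_address(digest: Digest) -> str:
    return f"{digest.algorithm}:{digest.hex}"


def digest_chunks(chunks, primary="blake2b"):
    """Feed every chunk to both hashers in one pass over the data."""
    # Assumption: blake2b is a stand-in for the project's BLAKE3 primary.
    primary_hasher = hashlib.new(primary)
    sha256_hasher = hashlib.sha256()
    for chunk in chunks:
        primary_hasher.update(chunk)
        sha256_hasher.update(chunk)
    return (
        Digest(primary, primary_hasher.hexdigest()),
        Digest("sha256", sha256_hasher.hexdigest()),
    )
```

Hashing both algorithms in the same loop means the input is read once, which matters when the source is a stream that cannot be rewound.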

tests/unit/test_identity.py:
- Hypothesis property test asserts digest_bytes matches hashlib.sha256 and
  blake3.blake3 for random byte sequences up to 4 KiB.
- digest_stream invariants: equivalence with digest_bytes under chunked input;
  defaults to BLAKE3 primary; always computes SHA-256; handles empty input.
- Digest / ContentAddress invariants: rejects uppercase hex, empty fields,
  odd hex length, missing separator; frozen and hashable.

Gates: ruff clean, mypy --strict clean on 21 source files, 18 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 01:34:24 +02:00
6a136dd814 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-16:
  - update .custodian-brief.md for artifact-store
2026-05-16 01:34:11 +02:00
a6b6746f91 WP-0001-T001: service scaffold (Python, FastAPI, uv, ruff, mypy, pytest)
Lands the smallest credible foundation per ADR-0005:

- pyproject.toml: hatchling build, runtime deps (FastAPI, uvicorn, SQLAlchemy 2.0,
  asyncpg, aiosqlite, alembic, blake3, cbor2, typer, structlog, pydantic,
  pydantic-settings); dev deps (pytest, pytest-asyncio, httpx, hypothesis, ruff,
  mypy); ruff + mypy --strict + pytest configured.
- uv.lock committed.
- Makefile thin shims: install / dev / test / lint / format / type / migrate / clean.
- src/artifactstore/ package skeleton with placeholder __init__.py per concern:
  identity, manifest, events, retention, audit, storage, dataplane, registry,
  api/http (minimal FastAPI app, GET / scaffold banner), cli (typer app with
  version subcommand), config (pydantic-settings).
- tests/{unit,integration}/conftest.py present; unit smoke tests assert package
  imports, HTTP root route, CLI version round-trip, settings defaults.
- .env.example documents ARTIFACTSTORE_DATABASE_URL,
  ARTIFACTSTORE_STORAGE_LOCAL_ROOT, ARTIFACTSTORE_LOG_LEVEL.
- README updated with install / dev / test instructions.
- .gitignore: Claude local state, local runtime data (var/, sqlite db).

make lint && make type && make test pass on a clean checkout (4 tests, 20
source files type-clean under mypy --strict).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 01:30:22 +02:00
b1eba9b41e chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-15:
  - update .custodian-brief.md for artifact-store
2026-05-15 23:17:03 +02:00
8eee5a1c1c chore(consistency): sync new task and workstream IDs from DB [auto]
fix-consistency assigned state_hub_task_id and state_hub_workstream_id UUIDs
to the tasks and workplans added in 747afc2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 21:26:08 +02:00
b4762125d8 chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-15:
  - update .custodian-brief.md for artifact-store
2026-05-15 21:25:37 +02:00
747afc27a6 docs+plans: reconcile blueprint with ambition, add ADRs, sequence workplans
Aligns the v1 architecture with the longer-horizon platform thesis so we can
start implementation without the schema-level inconsistencies the prior
review surfaced.

ADRs (docs/adr/0001..0006): content-addressed dual-digest storage, append-only
event log as source of truth, canonical CBOR manifests, control/data-plane
contract, v1 tech stack (Python 3.12 / uv / FastAPI / SQLAlchemy Core +
asyncpg / Alembic / cbor2 / blake3 / ruff / mypy / pytest / typer), OCI
compatibility kept reachable.

Architecture blueprint rewritten to v2: library-first (ffmpeg-shaped) module
layout, materialised-view data model over the event log, upload-session and
event-stream endpoints pinned, retrieval tiering promoted into the schema.

Roadmap added (docs/ROADMAP.md) with three phases. WP-0001 rewritten as the
Foundation plan (scaffold + kernels + local FS + minimal app). WP-0002..0005
created carrying the existing state_hub_task_ids forward semantically:
ingestion API (T004), retention lifecycle (T005), S3-compatible backend
(T006), guide-board pilot (T007). T001/T002/T003/T008 remain in WP-0001
with refined acceptance.

README and AGENTS.md refreshed to reflect the new repo shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 21:16:17 +02:00
403d903585 docs: add platform ambition, blueprint review, and assembly experiment
Captures the longer-horizon thesis (sovereign-cloud artifact substrate)
alongside the carefully-scoped v1 INTENT. PLATFORM-AMBITION records nine
schema/contract commitments the v1 must preserve to keep that horizon
reachable. ASSEMBLY-EXPERIMENT frames an opt-in research line on
ffmpeg-grade hand-tuned asm with an MIT-0 vs LGPL-aware reuse map.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 20:56:01 +02:00
793c0c7ba5 Bootstrapping the repo
2026-05-15 20:08:32 +02:00
c99ffe429f chore(consistency): sync task status from DB [auto]
Updated by fix-consistency on 2026-05-15:
  - update .custodian-brief.md for artifact-store
2026-05-15 18:31:04 +02:00
Coulomb Social
bc11bcb1f6 Initial commit
2026-05-15 16:14:36 +00:00