docs: add platform ambition, blueprint review, and assembly experiment

Captures the longer-horizon thesis (sovereign-cloud artifact substrate) alongside the carefully-scoped v1 INTENT. PLATFORM-AMBITION records nine schema/contract commitments the v1 must preserve to keep that horizon reachable. ASSEMBLY-EXPERIMENT frames an opt-in research line on ffmpeg-grade hand-tuned asm with an MIT-0 vs LGPL-aware reuse map.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 20:56:01 +02:00
parent 793c0c7ba5
commit 403d903585
4 changed files with 616 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -9,6 +9,9 @@ as a local filesystem, S3-compatible object storage, or Ceph RGW.

 Start here:

-- [INTENT.md](INTENT.md)
-- [docs/ARCHITECTURE-BLUEPRINT.md](docs/ARCHITECTURE-BLUEPRINT.md)
-- [workplans/ARTIFACT-STORE-WP-0001-service-baseline.md](workplans/ARTIFACT-STORE-WP-0001-service-baseline.md)
+- [INTENT.md](INTENT.md) — purpose, product thesis, scope, boundary
+- [docs/ARCHITECTURE-BLUEPRINT.md](docs/ARCHITECTURE-BLUEPRINT.md) — draft architecture
+- [docs/PLATFORM-AMBITION.md](docs/PLATFORM-AMBITION.md) — longer-horizon thesis and the schema commitments v1 preserves
+- [docs/REVIEW-2026-05-15-intent-and-blueprint.md](docs/REVIEW-2026-05-15-intent-and-blueprint.md) — SWOT and optimisation review
+- [docs/ASSEMBLY-EXPERIMENT.md](docs/ASSEMBLY-EXPERIMENT.md) — opt-in research line on hand-tuned assembly for hot kernels
+- [workplans/ARTIFACT-STORE-WP-0001-service-baseline.md](workplans/ARTIFACT-STORE-WP-0001-service-baseline.md) — first implementation workplan
--- a/docs/ASSEMBLY-EXPERIMENT.md
+++ b/docs/ASSEMBLY-EXPERIMENT.md
@@ -0,0 +1,209 @@
+# Assembly Experiment
+
+Status: draft / opt-in research line
+Created: 2026-05-15
+
+This document defines an opt-in research line under `artifact-store`: can
+agentic coding adopt, extend, and eventually originate ffmpeg-grade
+hand-written assembly for the hot paths of an artifact-storage data plane?
+
+This is a research experiment, not roadmap-critical work. The platform
+ambition (`docs/PLATFORM-AMBITION.md`) stands on its own merits whether or
+not we ever write a single line of assembly. The experiment runs alongside.
+
+## Why this experiment exists
+
+ffmpeg is the best-known evidence that hand-written assembly with runtime
+CPU dispatch can still substantially outperform even well-tuned
+Rust-with-SIMD-intrinsics code for tight inner loops, often by 1.5–3× on
+the same hardware and sometimes more. The cost is steep: domain expertise,
+multi-arch maintenance, calling-convention discipline, microarchitecture
+awareness. ffmpeg has decades of contributor depth to amortise that cost.
+
+We do not have that depth. The interesting question is whether large
+language models, used as coding agents, change the cost equation enough to
+make this approach viable for a focused project. If they do, an artifact
+substrate that competes on raw throughput-per-core has a real edge against
+generic object stores. If they do not, we adopt prebuilt asm-tuned
+libraries and lose nothing.
+
+## Strategic context
+
+This experiment ties to the commercial horizon recorded in
+`docs/PLATFORM-AMBITION.md`. A sovereign-cloud artifact product that
+ingests, hashes, dedups, and serves bytes at noticeably higher
+throughput-per-core than commodity object stores has a defensible edge.
+"Cheaper per-GB than AWS" is a losing race; "more throughput per server,
+on hardware you already own" is not.
+
+## Constraints
+
+### Licence
+
+- `artifact-store` is MIT No Attribution.
+- ffmpeg's `libavutil` (where the storage-relevant asm lives) is LGPL 2.1+.
+- We **cannot** copy LGPL-licensed asm into MIT-0 source.
+- We **can**:
+  - dynamically link to `libavutil` at runtime (users get both licences);
+  - re-license a *segregated optional native module* under LGPL 2.1+ while
+    the rest of the repo stays MIT-0, provided the module is its own
+    package and the boundary is explicit;
+  - read LGPL code and implement the same algorithm from scratch
+    (algorithms are not copyrightable; specific source text is). A strict
+    clean-room process separates the person who reads the LGPL source and
+    writes a specification from the person who implements it. Document the
+    process per file;
+  - prefer asm sources under permissive licences (BSD, Apache, CC0,
+    public domain) where they exist.
+
+Preferred upstream licences for the experiment, in order:
+
+1. Public domain / CC0 (Intel reference, BLAKE3 reference)
+2. Apache-2.0 / BSD / MIT (xxhash, zstd, ring)
+3. LGPL via dynamic linking (libavutil)
+4. Clean-room reimplementation inspired by LGPL (last resort)
+
+### Maintenance budget
+
+The experiment is bounded. Any asm we adopt or write must:
+
+- have a portable C / Rust fallback that is correctness-equivalent;
+- be reachable through a runtime CPU-feature dispatch table (the ffmpeg
+  pattern) so the binary still runs on machines without the relevant
+  extension;
+- carry a test that compares its output byte-for-byte against the fallback
+  on randomised inputs;
+- carry a microbenchmark with a recorded baseline so regressions are
+  visible.
+
+If we cannot meet those four bars for a candidate, we ship the library
+implementation and revisit later.
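
The four bars above can be sketched in a few lines. This is a minimal illustration, not the project's dispatcher: it uses CRC-32 only because a portable reference is short, lets zlib's C implementation stand in for a tuned kernel, and takes the CPU feature set as a plain argument. The feature label `pclmulqdq` and all function names are assumptions made for the example.

```python
import random
import zlib
from typing import Callable, FrozenSet, List, Tuple

Crc32 = Callable[[bytes], int]


def crc32_portable(data: bytes) -> int:
    """Bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320): the portable fallback."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc32_accelerated(data: bytes) -> int:
    """Stand-in for a tuned kernel; zlib's C implementation plays that role here."""
    return zlib.crc32(data) & 0xFFFFFFFF


# Candidates in priority order: (required CPU features, implementation).
# The feature names are hypothetical labels, not a real detection API.
CANDIDATES: List[Tuple[FrozenSet[str], Crc32]] = [
    (frozenset({"pclmulqdq"}), crc32_accelerated),
    (frozenset(), crc32_portable),  # requires nothing, so always eligible
]


def select_crc32(cpu_features: FrozenSet[str]) -> Crc32:
    """Return the first candidate whose required features are all present."""
    for required, impl in CANDIDATES:
        if required <= cpu_features:
            return impl
    raise RuntimeError("unreachable: the fallback requires no features")


def differential_test(fast: Crc32, slow: Crc32, rounds: int = 200, seed: int = 0) -> None:
    """Compare two implementations on randomised inputs of varying length."""
    rng = random.Random(seed)
    for _ in range(rounds):
        data = bytes(rng.getrandbits(8) for _ in range(rng.randrange(0, 512)))
        assert fast(data) == slow(data), f"mismatch on {len(data)}-byte input"
```

Selecting once at start-up and calling through the returned function keeps the hot path free of feature checks; the differential test is what lets a new kernel be added to `CANDIDATES` without trusting it.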
+
+## What ffmpeg actually has that is reusable here
+
+Inspection of `libavutil/x86/` (2026-05-15) found the following
+storage-relevant assets:
+
+| File / module                | What it accelerates           | Reuse value for artifact-store |
+|------------------------------|-------------------------------|--------------------------------|
+| `x86/crc.asm`                | CRC-32 (LE + BE) via PCLMULQDQ | **High.** Fast non-crypto integrity check for chunks and network framing. Public function names `ff_crc_le`, `ff_crc`. LGPL — must dynamic-link or reimplement. |
+| `x86/aes.asm` + `aes_init.c` | AES block cipher              | **Low–medium.** ffmpeg's AES is unauthenticated. At-rest encryption needs AES-GCM, better adopted from Ring / BoringSSL / AWS-LC (permissive licences, FIPS-validatable). |
+| `x86/cpuid.asm` + `cpu.c`    | CPU feature detection         | **High (pattern, not code).** Reimplement the `ff_get_cpu_flags_x86()` + `AV_CPU_FLAG_*` pattern under MIT-0. This is the dispatch backbone. |
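+| `x86/x86inc.asm`             | Macro library for asm authoring | **High (technique).** Cross-platform calling conventions, register naming, function prologue/epilogue. ffmpeg's macros are the de-facto standard outside game-dev. NASM-syntax. The file originates in x264 and carries its own permissive (ISC-style) licence header, so check the header in the pinned revision before assuming LGPL applies. |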
+| `x86/x86util.asm`            | SIMD helper macros            | **Medium.** Useful patterns; not directly liftable. |
+| `x86/emms.asm`               | MMX state clearing            | **Zero.** Legacy. |
+| `sha.c`                      | SHA-1 / SHA-224 / SHA-256     | **Zero.** Pure C, no SIMD. We are better off with BLAKE3 (asm-tuned upstream) and SHA-NI via OpenSSL / Ring for SHA-256. |
+| `aes_ctr.c`, `blowfish.c`, `camellia.c`, `cast5.c`, `des.c` | Block ciphers | **Zero.** Not relevant for our threat model. |
+| `adler32.c`, `crc.c`         | Reference integrity (C)        | **Zero.** Use the asm-accelerated variants. |
+
+Everything in `libavcodec` (DCT, motion estimation, deblocking) and the
+video / audio / image-utility `.asm` files in `libavutil` is irrelevant to
+artifact-store and stays out of scope.
+
+## Candidate hot kernels for artifact-store, ranked
+
+Each kernel below is a candidate either for adoption (drop in a vetted
+permissive library), extension (start from a permissive baseline and
+optimise further), or origination (write fresh).
+
+### Tier 1 — adopt now, do not write
+
+| Kernel        | Recommended source                                    | Notes |
+|---------------|-------------------------------------------------------|-------|
+| BLAKE3        | `blake3` (C reference + Rust crate), Apache-2.0 / CC0 | Already ships hand-tuned AVX-512, AVX2, SSE4.1, and ARM NEON (ARM64) kernels. We will never beat upstream. |
+| SHA-256 (compat) | OpenSSL / Ring / AWS-LC, permissive                | Uses SHA-NI on supporting CPUs. |
+| AES-GCM       | Ring / BoringSSL, ISC / BSD                           | AES-NI + PCLMULQDQ for GHASH. Authenticated; what we actually need. |
+| Zstandard     | `zstd` (Facebook), BSD-3                              | Multi-GB/s with SIMD. |
+| LZ4           | `lz4`, BSD-2                                          | Faster than zstd at lower ratio; useful for high-throughput cold paths. |
+
+### Tier 2 — adopt + extend, this is where the experiment starts
+
+| Kernel             | Baseline source                              | Extension question |
+|--------------------|----------------------------------------------|--------------------|
+| FastCDC (rolling hash) | `fastcdc-rs` (MIT) or original C paper code | Can we squeeze out a SIMD'd Gear-hash variant that maintains the same boundary distribution? Existing Rust impl is scalar. |
+| CRC-32C (Castagnoli, for chunk integrity) | Intel reference white paper code (recorded here as public domain; confirm before reuse) | PCLMULQDQ folding for long buffers, with the SSE4.2 `crc32` instruction (which implements the Castagnoli polynomial) as the simpler hardware path. ffmpeg's `crc.asm` shows the folding technique under LGPL; reimplement under MIT-0 from the Intel paper. |
+| XXH3 (xxHash)      | `xxhash` (BSD-2)                             | Already SIMD'd; the extension is whether we can fuse it with our chunk-boundary loop to read each byte once. |
+| Manifest canonicalisation hash | Whatever canonical-CBOR lib we pin | Likely no asm needed; included to monitor whether it ever appears on a profile. |
+
+### Tier 3 — originate, only if profiles justify it
+
+These are deliberately speculative. None of them are committed work.
+
+- A fused "scan + chunk + hash" pass that reads each byte from the
+  upload buffer once and emits chunk boundaries plus per-chunk BLAKE3
+  state in a single pass. Today this requires three passes (CDC, hash
+  per chunk, hash for manifest root).
+- A SIMD'd content-type sniffer for the first N kilobytes of unknown
+  uploads.
+- An AVX-512 implementation of a bloom / cuckoo filter probe for the
+  "have I seen this hash?" hot path.
+- Fast batch verification: given a list of `(content_address, bytes)`
+  pairs, verify all of them in one SIMD-dispatched pass.
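
The first item in the list above, a combined chunk-and-hash pass, can be sketched at the interface level. This is a simplified Gear-based chunker, not FastCDC proper (no normalised chunking, and a plain low-bit mask), and it uses `hashlib.blake2b` as a stand-in because BLAKE3 is not in the Python standard library. The function name, size parameters, and Gear table seed are assumptions for the example.

```python
import hashlib
import random
from typing import Iterator, Tuple

# Gear table: 256 pseudo-random 64-bit values; fixed seed keeps boundaries reproducible.
_rng = random.Random(0x5EED)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk_and_hash(data: bytes, min_size: int = 2048, avg_bits: int = 13,
                   max_size: int = 65536) -> Iterator[Tuple[int, int, str]]:
    """Yield (offset, length, hex digest) for each content-defined chunk.

    A boundary is declared when the low `avg_bits` bits of the rolling Gear
    fingerprint are zero, subject to the min/max chunk sizes. Boundaries and
    digests come out of one call, but unlike a fused native kernel the bytes
    of each chunk are read a second time when the slice is hashed.
    """
    mask = (1 << avg_bits) - 1
    start = 0
    fp = 0
    for i, byte in enumerate(data):
        fp = ((fp << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        if (length >= min_size and (fp & mask) == 0) or length >= max_size:
            digest = hashlib.blake2b(data[start:i + 1], digest_size=32).hexdigest()
            yield start, length, digest
            start, fp = i + 1, 0
    if start < len(data):
        # Trailing bytes form the final chunk, which may be shorter than min_size.
        digest = hashlib.blake2b(data[start:], digest_size=32).hexdigest()
        yield start, len(data) - start, digest
```

The sketch only fixes the shape of the output (boundaries plus per-chunk digests from a single call); whether a native fused loop actually saves memory traffic is exactly what the experiment would have to measure.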
+
+## Experiment protocol
+
+For each Tier 2 or Tier 3 candidate that we take on:
+
+1. **Frame the kernel.** One function, one clear input / output, one
+   measurable metric (bytes per second per core).
+2. **Baseline.** Land a portable C or Rust implementation with full test
+   coverage and a recorded microbenchmark number.
+3. **Dispatch.** Wire the kernel through the runtime CPU-feature
+   dispatcher (ffmpeg pattern, reimplemented MIT-0). Default path = the
+   baseline.
+4. **Agentic asm attempt.** Use the coding agent to author a NASM-syntax
+   asm implementation targeting one ISA extension (start with AVX2 — most
+   broadly available). The agent must:
+   - produce annotated source with cycle-accurate comments where relevant;
+   - include the test that compares its output to baseline on randomised
+     input;
+   - include the microbenchmark.
+5. **Independent review.** A second pass — human or a fresh agent context
+   — reviews for correctness, calling-convention compliance, and obvious
+   microarchitectural issues (false dependencies, port pressure, unaligned
+   loads, misuse of `vzeroupper`).
+6. **Land or shelve.** If the asm beats the baseline by a meaningful
+   margin (≥ 1.5×) and passes review, it lands behind the dispatcher.
+   Otherwise it shelves with the benchmark numbers recorded so we know
+   not to retry without new techniques.
+7. **Extend.** Repeat for AVX-512, then ARM NEON, then SVE2, in that
+   order of impact.
+
+Each completed kernel produces an ADR-style note in `docs/asm/` recording
+the algorithm, the source of inspiration, the licence chain, the
+benchmark numbers, and any microarchitectural notes.
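
Steps 1 and 6 of the protocol reduce to a throughput measurement and a threshold check, sketched below. The helper names are hypothetical, and a real harness would also pin the CPU, warm caches, and record the environment alongside the number.

```python
import time
from typing import Callable


def bytes_per_second(kernel: Callable[[bytes], object], payload: bytes,
                     repeats: int = 5) -> float:
    """Best-of-N single-thread throughput of `kernel` over `payload`."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        kernel(payload)
        best = min(best, time.perf_counter() - t0)
    return len(payload) / best if best > 0 else float("inf")


def land_or_shelve(baseline_bps: float, candidate_bps: float,
                   passed_review: bool, min_speedup: float = 1.5) -> str:
    """Step 6: land only if reviewed and at least `min_speedup` times faster."""
    speedup = candidate_bps / baseline_bps
    return "land" if passed_review and speedup >= min_speedup else "shelve"
```

Taking the best of several runs rather than the mean reduces noise from scheduling; either way the recorded number belongs in the per-kernel note so a shelved attempt is not silently retried.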
+
+## What the experiment proves or disproves
+
+A succeeding experiment delivers:
+
+- a portable asm-accelerated data plane that competes with hand-tuned C
+  storage stacks on throughput;
+- a public record of which kernels the agentic approach handles well and
+  which it does not;
+- a reusable dispatcher and macro foundation that other projects can adopt.
+
+A failing experiment delivers:
+
+- a published record of where agentic coding plateaus on hot-path asm;
+- an artifact-store data plane that is still very good — because the
+  baseline is "use the asm-tuned library", which is already fast.
+
+Either outcome is publishable. The downside is bounded.
+
+## Out of scope for this experiment
+
+- Cryptography written by us. Use vetted libraries. Always.
+- Architectures with small deployment footprints in this domain (RISC-V,
+  POWER, MIPS). Revisit once x86_64 and ARM64 are solid.
+- Kernel-bypass networking (DPDK, eBPF/XDP storage). Different
+  experiment, different document if we ever pursue it.
+- GPU offload. Different cost model; not addressed here.
+
+## Immediate next steps
+
+None are committed. When the v1 baseline (WP-0001) lands and we have a
+real profile of where time is spent, the first candidate to pick up is
+almost certainly **FastCDC + BLAKE3 in a single pass**, because chunking
+plus hashing is the commonly reported ingest bottleneck of CAS-style
+storage systems (restic, borg, kopia). Until then, this document is a
+holding place for the ambition.
--- a/docs/PLATFORM-AMBITION.md
+++ b/docs/PLATFORM-AMBITION.md
@@ -0,0 +1,226 @@
+# Platform Ambition
+
+Status: draft
+Created: 2026-05-15
+
+This document records the longer-horizon thesis behind `artifact-store` and
+captures which decisions are taken now to keep that horizon reachable without
+expanding the v1 workplan. It sits beside, not above, `INTENT.md` and
+`SCOPE.md`. INTENT defines what we build first; this document defines what
+the v1 must not foreclose.
+
+## Thesis
+
+Generated artifacts — evidence packages, build outputs, ML models, logs,
+snapshots, reports, scorecards, exports — are first-class durable objects in
+modern software work. They sit somewhere between source code (well-served by
+Git) and binary releases (well-served by OCI registries). The space between
+is currently filled by a fragmented mix of bespoke directories, ad-hoc S3
+buckets, vendor registries (Artifactory, Nexus), and document-management
+systems that were not built for machine producers.
+
+`artifact-store` aims to occupy that gap with one substrate: a generic,
+content-addressed, signed, deduplicated, retention-aware artifact registry
+and storage gateway that other tools embed or speak to.
+
+The reference points are deliberate. **VLC** and **ffmpeg** lead their domain
+not by being the prettiest applications but by being correct, fast, embeddable,
+portable, and indispensable infrastructure for everyone else. The same
+strategy applies here: build a kernel that is so good at the
+bytes-and-identity layer that every artifact-producing tool would rather speak its
+protocol than reinvent it.
+
+## Commercial horizon
+
+The longer-horizon commercial target is **a sovereign artifact-storage
+product line for European cloud providers**, with STACKIT (Schwarz Group)
+as the concrete example. The thesis is:
+
+- Hyperscalers (AWS S3, GCS, Azure Blob) sell raw object storage. They do
+  not sell *artifact identity, retention, attestation, federation, evidence
+  preservation*. Customers either build it themselves or buy proprietary
+  registries on top.
+- A European hyperscaler that ships a turnkey, sovereign, GDPR-aligned
+  artifact substrate on top of its own object storage has a defensible
+  differentiation against AWS — not in raw price-per-GB, which is a losing
+  race, but in regulated workloads (evidence retention, audit, signed
+  attestations, legal-hold, sovereign jurisdiction guarantees).
+- Open source is the wedge. A widely-adopted upstream that the provider
+  ships, supports, and extends is far stronger than a proprietary stack.
+
+This is a multi-year horizon, not a v1 deliverable. It is recorded here so
+schema and protocol decisions made now keep that path open.
+
+## Reference points
+
+| System          | What we learn from it                                        |
+|-----------------|--------------------------------------------------------------|
+| ffmpeg          | Embeddable core, hand-tuned hot paths, runtime CPU dispatch  |
+| VLC             | Plugin architecture, portability, ubiquity through being a library too |
+| Git             | Content-addressed storage, Merkle DAG, pack files, integrity |
+| restic          | Single static binary, CDC + dedup, encryption by default     |
+| IPFS            | Content-addressing, federation, partial replication          |
+| OCI Registry    | Standardised manifest + blob model with broad ecosystem      |
+| Sigstore / cosign | Signed attestations as a first-class artifact property     |
+| MinIO           | Operator ergonomics, S3 wire compat as adoption vector       |
+| SeaweedFS / Ceph | Separation of metadata plane from data plane                |
+| RocksDB / LMDB  | Embeddable storage engines with predictable performance      |
+| Kafka           | Log-as-source-of-truth, materialised views                   |
+| BLAKE3          | Modern hash primitive: parallel, Merkle-tree-native, asm-tuned |
+
+We are not trying to reproduce any of these. We are trying to occupy a
+specific gap between them with the best ideas from each.
+
+## Non-goals (still)
+
+The platform ambition does not change the v1 boundary in `INTENT.md` or
+`SCOPE.md`. In particular it does not:
+
+- replace StateHub as the work / decision system of record;
+- encode producer-specific assessment semantics in the registry core;
+- require any of the optimisations listed in the "near-horizon" section
+  below to land in v1;
+- commit the project to writing assembly. The assembly-experiment line
+  (`docs/ASSEMBLY-EXPERIMENT.md`) is opt-in research, not roadmap-critical.
+
+## Architectural commitments — preserved by v1
+
+The following decisions are taken now because reversing them later is
+expensive. Each lands in v1 as a schema or contract decision; full
+exploitation is later work.
+
+### A1. Content as the primary address
+
+Internal canonical key for stored bytes is `<algo>:<digest>`, not a logical
+path. Files within a package keep a `relative_path` as logical metadata,
+but the storage backend sees and addresses content hashes.
+
+- Enables: global dedup, Merkle integrity proofs, partial mirrors,
+  federation, OCI compatibility.
+- v1 cost: one schema column (`content_address`) and a deterministic key
+  derivation; no behaviour change.
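
As an illustration of A1, the key derivation can be as small as the sketch below. The two-level sharded layout in `backend_key` is a hypothetical choice for the example, not something the commitment prescribes; only the `<algo>:<digest>` form comes from the text above.

```python
import hashlib
from typing import Tuple


def content_address(data: bytes, algo: str = "sha256") -> str:
    """Return the canonical `<algo>:<digest>` key for a byte string."""
    return f"{algo}:{hashlib.new(algo, data).hexdigest()}"


def parse_address(address: str) -> Tuple[str, str]:
    """Split `<algo>:<digest>` and reject anything that is not in that form."""
    algo, sep, digest = address.partition(":")
    if not sep or not algo or not digest:
        raise ValueError(f"not a content address: {address!r}")
    return algo, digest


def backend_key(address: str) -> str:
    """Hypothetical backend layout: shard by the first four hex characters."""
    algo, digest = parse_address(address)
    return f"{algo}/{digest[:2]}/{digest[2:4]}/{digest}"
```

Two files with identical bytes but different `relative_path` values map to the same address, which is what makes deduplication a property of the key rather than a separate feature.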
+
+### A2. BLAKE3 as native digest, SHA-256 retained for interop
+
+`digest_algorithm` is a column on `artifact_files`. v1 default may remain
+`sha256` to ship the pilot quickly; the column exists so `blake3` can ship
+without migration.
+
+- Enables: faster hashing, free Merkle root over a package, alignment with
+  modern signing tooling.
+- v1 cost: column + adapter table mapping algo → hashing impl.
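
The adapter table in A2 might look like the following sketch. It assumes the third-party `blake3` Python package, which offers a hashlib-style object, and registers it only if it is installed; the table name and helper names are illustrative.

```python
import hashlib
import io
from typing import Any, BinaryIO, Callable, Dict

# Maps the value of the `digest_algorithm` column to a hasher factory.
HASHERS: Dict[str, Callable[[], Any]] = {"sha256": hashlib.sha256}

try:
    from blake3 import blake3  # optional third-party package (assumption)
    HASHERS["blake3"] = blake3
except ImportError:
    pass  # blake3 stays unavailable; sha256 still works


def digest_stream(stream: BinaryIO, digest_algorithm: str,
                  block_size: int = 1 << 20) -> str:
    """Hash a stream in fixed-size blocks with the configured algorithm."""
    try:
        hasher = HASHERS[digest_algorithm]()
    except KeyError:
        raise ValueError(f"unsupported digest_algorithm: {digest_algorithm!r}") from None
    for block in iter(lambda: stream.read(block_size), b""):
        hasher.update(block)
    return hasher.hexdigest()


def digest_bytes(data: bytes, digest_algorithm: str) -> str:
    """Convenience wrapper for in-memory payloads."""
    return digest_stream(io.BytesIO(data), digest_algorithm)
```

Because callers pass the column value rather than a hash function, adding `blake3` later is a new table entry and not a schema or call-site change.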
+
+### A3. Append-only event log as source of truth
+
+An `events` table with a monotonic sequence number is the authoritative
+record of registry mutations. The current metadata tables are a
+materialised view rebuildable from the log.
+
+- Enables: CDC feeds, audit, replication, point-in-time recovery, signed
+  event streams.
+- v1 cost: one extra table written in the same transaction as today's
+  mutations.
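
A minimal sketch of A3 using SQLite from the Python standard library: every mutation appends an event and updates the materialised table in one transaction, and the table can be rebuilt from the log. The table layout, event kinds, and function names are assumptions made for illustration, not the WP-0001 schema.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE events (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,  -- monotonic sequence number
    kind    TEXT NOT NULL,
    payload TEXT NOT NULL                       -- JSON document
);
CREATE TABLE artifacts (                        -- materialised view of the log
    artifact_id TEXT PRIMARY KEY,
    state       TEXT NOT NULL
);
"""


def apply_event(db: sqlite3.Connection, kind: str, payload: dict) -> None:
    """Project one event onto the materialised `artifacts` table."""
    if kind == "artifact.created":
        db.execute("INSERT INTO artifacts VALUES (?, ?)", (payload["id"], "active"))
    elif kind == "artifact.deleted":
        db.execute("UPDATE artifacts SET state = 'deleted' WHERE artifact_id = ?",
                   (payload["id"],))


def record(db: sqlite3.Connection, kind: str, payload: dict) -> int:
    """Append the event and update the view in one transaction; return its seq."""
    with db:  # commits on success, rolls back if either statement fails
        cur = db.execute("INSERT INTO events (kind, payload) VALUES (?, ?)",
                         (kind, json.dumps(payload)))
        apply_event(db, kind, payload)
    return cur.lastrowid


def rebuild(db: sqlite3.Connection) -> None:
    """Discard the view and replay the log in sequence order."""
    with db:
        db.execute("DELETE FROM artifacts")
        rows = db.execute("SELECT kind, payload FROM events ORDER BY seq").fetchall()
        for kind, payload in rows:
            apply_event(db, kind, json.loads(payload))
```

If the projection fails, the event insert is rolled back with it, so the log never records a mutation the view did not accept.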
+
+### A4. Signed manifests, canonicalisation pinned
+
+Manifest serialisation uses a canonical form (recommendation: canonical
+CBOR; JCS as alternative) so byte-identical signing is possible across
+languages and time. v1 may not actually sign — the pin guarantees that
+when signing lands, every prior manifest is re-signable byte-for-byte.
+
+- Enables: cosign / Sigstore, in-toto, SLSA attestations, OCI-style
+  manifest digests.
+- v1 cost: pick one canonicalisation library and use it for manifest
+  writes. No new runtime component; the only added work is one canonical
+  encode per manifest write.
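
To show why the pin matters, the sketch below produces identical bytes, and therefore an identical digest, for two manifests that differ only in key order. It uses deterministic JSON from the standard library as a stand-in: this approximates JCS (RFC 8785) but does not implement its number formatting or key-ordering rules, and it is not canonical CBOR. The function names are illustrative.

```python
import hashlib
import json


def canonical_bytes(manifest: dict) -> bytes:
    """Deterministic JSON: sorted keys, no insignificant whitespace, UTF-8.

    An approximation of JCS only; a real deployment would pin a library
    that implements the chosen canonical form completely.
    """
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False, allow_nan=False)
    return text.encode("utf-8")


def manifest_digest(manifest: dict) -> str:
    """Digest over the canonical bytes; this is what a signature would cover."""
    return "sha256:" + hashlib.sha256(canonical_bytes(manifest)).hexdigest()
```

A signature made later over `canonical_bytes(manifest)` verifies against any manifest written with the same pin, regardless of which language or library version serialised it.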
+
+### A5. Control plane / data plane separation at the contract
+
+Even if v1 implements both in one Python process, the boundary between
+"registry / API / retention" (control plane) and "hash / chunk / store /
+serve" (data plane) is a named contract. When the data plane is later
+extracted into a Rust binary, the API does not change.
+
+- Enables: native-speed ingestion, language flexibility on the hot path,
+  independent scaling.
+- v1 cost: discipline (separate Python module with no API leakage), not
+  code.
+
+### A6. Resumable upload wire shape
+
+API exposes upload sessions: `POST /uploads`, `PATCH /uploads/{id}` with
+range, `POST /uploads/{id}/complete`. v1 implementation may still be
+single-shot multipart under the hood, but the resource shape exists so
+chunked / resumable upload is additive.
+
+- Enables: streaming, retry-safe ingestion, very-large-package support.
+- v1 cost: route definitions only; underlying logic can remain simple.
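
The resource shape in A6 can be modelled without any HTTP framework. The sketch below keeps sessions in memory and treats the range as a single byte offset; the class and method names are hypothetical, and the single-shot path is implemented on top of the same three operations to show that chunked upload is additive.

```python
import hashlib
import uuid
from typing import Dict


class UploadSessions:
    """In-memory model of the three upload-session routes."""

    def __init__(self) -> None:
        self._buffers: Dict[str, bytearray] = {}

    def create(self) -> str:
        """POST /uploads: open a session and return its id."""
        upload_id = uuid.uuid4().hex
        self._buffers[upload_id] = bytearray()
        return upload_id

    def patch(self, upload_id: str, offset: int, data: bytes) -> int:
        """PATCH /uploads/{id}: append `data` at `offset`; return the new offset."""
        buf = self._buffers[upload_id]
        if offset != len(buf):
            # A retrying client asks for the current offset and resumes from it.
            raise ValueError(f"offset {offset} does not match expected {len(buf)}")
        buf.extend(data)
        return len(buf)

    def current_offset(self, upload_id: str) -> int:
        """Number of bytes accepted so far for this session."""
        return len(self._buffers[upload_id])

    def complete(self, upload_id: str) -> str:
        """POST /uploads/{id}/complete: close the session, return the content address."""
        data = bytes(self._buffers.pop(upload_id))
        return "sha256:" + hashlib.sha256(data).hexdigest()

    def upload_single_shot(self, data: bytes) -> str:
        """The v1 path: one create, one patch, one complete."""
        upload_id = self.create()
        self.patch(upload_id, 0, data)
        return self.complete(upload_id)
```

Rejecting a mismatched offset is what makes retries safe: a client that lost a response re-reads the offset instead of appending the same range twice.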
+
+### A7. Tiering as a property of storage locations
+
+`storage_location` carries two nullable columns: `retrieval_tier`
+(`hot|warm|cold|archive`, default `hot`) and `restore_status`. The API can
+already return "not immediately available" without changing artifact identity.
+
+- Enables: future cold storage, Glacier-style restore flows.
+- v1 cost: two nullable columns.
+
+### A8. Schema-typed metadata with open escape hatch
+
+Producers register a metadata schema (JSON Schema) per variant
+(e.g. `guide-board.run.v1`). Stored as open JSON, validated against the
+registered schema at ingest time. Queries can use typed views.
+
+- Enables: tooling, search, GraphQL views, typed clients without losing
+  flexibility.
+- v1 cost: a `metadata_schemas` table; v1 validation can be a no-op.
+
+### A9. OCI compatibility kept reachable
+
+We do not promise OCI compatibility in v1, but we do not adopt any
+data model that prevents it. Concretely: keep content addresses as
+`<algo>:<hex>`, keep manifest structure compatible with an OCI image
+manifest (config + layers + annotations), and avoid invariants that the
+OCI spec forbids.
+
+- Enables: future `oras push` / `cosign sign` / Helm ecosystem entry.
+- v1 cost: one design review per schema change against the OCI spec.
+
+## Near-horizon technical roadmap (post-baseline)
+
+Roughly ordered. Not commitments; planning hooks.
+
+1. **Rust data-plane binary.** Receives chunked uploads, runs BLAKE3 + CDC +
+   optional Zstd + optional AES-GCM, writes to storage adapter. Speaks a
+   minimal gRPC or framed-bincode protocol to the Python control plane over
+   a Unix socket.
+2. **Content-defined chunking (FastCDC).** Stored chunks become the dedup
+   unit. Package manifest references chunk digests; package digest is the
+   Merkle root.
+3. **Cosign-compatible signing pipeline.** Every finalised manifest can be
+   signed; signatures stored alongside the manifest.
+4. **Event stream out.** NATS or Kafka topic of registry events for
+   downstream consumers.
+5. **OCI artifact endpoint.** A `/v2/` namespace that speaks the OCI
+   distribution spec on top of the same storage.
+6. **WASM plugin host.** Producers and operators can ship signed `.wasm`
+   modules for content extraction, redaction, scorecard generation,
+   custom hashing, indexing. This is the "ffmpeg moment" — open extension
+   surface that does not require forking the core.
+7. **Federation.** Signed manifest exchange between artifact-store
+   instances. Gossip or explicit peering.
+8. **Cold tier adapters.** S3 Glacier, tape, infrequent-access (IA) storage classes.
+
+## How this document is used
+
+- Every schema change in WP-0001 (or successors) is checked against
+  commitments A1–A9. A change that violates one is either rejected or
+  documented as a deliberate revision of this document.
+- Every "we could do this faster in native code" idea is filed against
+  `docs/ASSEMBLY-EXPERIMENT.md`, not bolted onto a workplan.
+- Every new producer integration is checked against the commercial
+  horizon: does it generalise, or does it bake in producer-specific
+  assumptions?
+
+This document is allowed to be wrong. It is not allowed to be silent.
+Update it when the thesis changes; do not let v1 quietly close doors that
+the v3 needs open.
--- a/docs/REVIEW-2026-05-15-intent-and-blueprint.md
+++ b/docs/REVIEW-2026-05-15-intent-and-blueprint.md
@@ -0,0 +1,175 @@
+# Review — INTENT and Architecture Blueprint
+
+Date: 2026-05-15
+Reviewer: claude (opus-4-7)
+Inputs: `INTENT.md`, `docs/ARCHITECTURE-BLUEPRINT.md`,
+`workplans/ARTIFACT-STORE-WP-0001-service-baseline.md`, `SCOPE.md`, `AGENTS.md`
+
+This review reframes the current scoped-internal-service design against a
+longer-horizon ambition: make `artifact-store` the leading open source
+substrate for generic artifact storage in the same sense that VLC and ffmpeg
+lead their domain. See `docs/PLATFORM-AMBITION.md` for the ambition framing
+this review is in service of.
+
+## SWOT
+
+### Strengths
+
+- Clean separation between artifact *identity / lifecycle* and *bytes*.
+  Registry owns metadata; storage adapter owns persistence. This is the single
+  most consequential architectural decision and the docs get it right.
+- Retention is a first-class concept from day one, not bolted on later.
+- Audit log designed in from the start, with explicit room for signed events.
+- Storage adapter contract is minimal and well-shaped
+  (`put / get / head / delete / health`).
+- Pilot-first discipline (`guide-board` / OpenCMIS TCK) anchors the work in a
+  real producer rather than a hypothetical one.
+- Manifest portability is an explicit goal — a package should be understandable
+  without calling its producer.
+- Boundary statements are explicit (will not replace StateHub, will not encode
+  producer semantics).
+
+### Weaknesses
+
+- Storage is keyed by logical path, not by content hash. Blocks global
+  deduplication, Merkle integrity proofs, partial replication, federation.
+- No streaming, chunked, or resumable upload story. Multipart REST will cap
+  throughput at the slowest Python/WSGI hop for multi-GB packages.
+- No content-defined chunking (CDC). Evidence packages with logs are highly
+  dedup-able; current design captures none of that.
+- SHA-256 is the right *compatibility* digest but the wrong *throughput*
+  digest at platform scale.
+- Single-writer SQLite is a real concurrency ceiling; PostgreSQL helps but no
+  partitioning / sharding story exists.
+- No event / CDC stream for downstream consumers. StateHub, search, and UIs
+  would have to poll.
+- No signing / attestation story (Sigstore, in-toto, SLSA). Evidence storage
+  without signed attestations leaves half the value on the table.
+- Metadata is open-ended JSON without a schema-registration path. Hard to
+  build typed tooling on top.
+- No multi-tenancy, quota, or rate-limiting primitives. Painful to retrofit.
+- No observability targets (latency / throughput SLOs, metrics, traces).
+  Platform-grade claims will eventually require numbers.
+- No OCI / `oras` artifact compatibility — leaves the largest existing
+  artifact ecosystem off the table.
+
+### Opportunities
+
+- **OCI Artifact + ORAS compatibility.** Inherit Helm, ML model, SBOM, cosign
+  tooling for free. Probably the single highest-leverage external move.
+- **Sigstore + in-toto + SLSA.** Evidence packages should be signed by
+  default; this is exactly the gap most generic registries leave unfilled.
+- **Content-addressed CAS + Merkle DAG** (Git / IPFS / restic pattern):
+  enables global dedup, integrity proofs, federation, partial mirroring.
+- **BLAKE3** as native digest with SHA-256 retained for interop:
+  several-fold faster hashing on SIMD-capable CPUs, and BLAKE3's
+  construction *is* a Merkle tree, so package-level integrity comes for free.
+- **WASM plugin surface for transforms, extractors, indexers, redactors.**
+  The "ffmpeg moment" for this domain: a stable host API that ecosystem
+  contributors can extend without forking the core.
+- **Federation / mirroring** between artifact-store instances via signed
+  manifests. Nothing comparable exists in the evidence space today.
+- **FUSE / NFS / S3-gateway frontends.** Legacy producers ingest without code
+  changes.
+- **Embeddable mode.** A single static binary like `restic`, plus a server
+  mode. Embedding is what makes ffmpeg ubiquitous.
+
+### Threats
+
+- Crowded adjacency: MinIO, Pulp, Harbor / Zot, Artifactory / Nexus, restic,
+  IPFS, Sigstore, plain S3. None are exactly this, but each chips at the
+  value proposition.
+- Scope creep vs the carefully-scoped INTENT. The platform ambition pulls
+  toward "do everything"; the INTENT pulls toward "ship the pilot." Resolve
+  this tension explicitly or you get neither.
+- Python performance ceiling on the data plane (ingestion of multi-GB
+  packages, hashing, chunking).
+- Governance / maintenance debt. VLC and ffmpeg have decades of contributor
+  depth; underestimating that is a project-killer.
+
+## Architecture optimizations worth taking now
+
+Each of these is cheap to lock in before code lands, and expensive (or
+breaking) to add later.
+
+1. **Split control plane from data plane.** Registry / API / retention stays
+   in Python with PostgreSQL. Ingestion + hashing + storage I/O becomes a
+   separate process (Rust sidecar, eventually with hot kernels in C / asm)
+   that can scale and be rewritten independently. Pin the contract now (Unix
+   socket, gRPC or framed bincode). See `docs/PLATFORM-AMBITION.md`.
+2. **Make content the primary address.** Internal object key
+   `blake3:<digest>` (or `sha256:<digest>` for compat). `relative_path`
+   becomes logical metadata in the manifest. Unlocks dedup, integrity,
+   federation, OCI compatibility.
+3. **Append-only event log as the source of truth.** Metadata DB is a materialized
+   view rebuildable from the log. Same pattern as Kafka / EventStore /
+   Datomic. Cheap audit, replication, point-in-time recovery.
+4. **OCI artifact spec as a wire format**, even if the native API is richer.
+   Buys instant interop with `oras`, `cosign`, `crane`, Helm.
+5. **Signed manifests from day one.** Pin a signing format (cosign / Sigstore)
+   and a canonicalization (JCS or canonical CBOR). Post-hoc signing means
+   every legacy manifest is unsigned forever.
+6. **Resumable, chunked uploads on the wire.** Upload session resource
+   (`POST /uploads` → `PATCH /uploads/{id}` ranges → `POST /uploads/{id}/complete`).
+   `tus.io` is a reasonable reference. v1 implementation can still be
+   single-shot multipart.
+7. **Event stream out.** A monotonic-sequence `events` table; consumers
+   tail via long-poll, NATS, or Kafka. Trivial to add now, expensive later.
+8. **Schema-typed metadata with escape hatch.** Producers register a JSON
+   Schema for their metadata variant (`guide-board.run.v1`). Stored as open
+   JSON, validated at ingest, queryable by typed views.
+9. **Tiering as a first-class column of `storage_location`.** Promote
+   `retrieval_tier` and `restore_status` into the schema now (nullable,
+   default `hot`).
+10. **Ship a great CLI before any UI.** ffmpeg ships a binary, not a GUI.
+
+## Performance hotspots — where native code actually matters
+
+Ranked by realistic impact for this workload. Adopting libraries that already
+contain hand-tuned assembly is the cheap path; writing fresh assembly is an
+explicit research line — see `docs/ASSEMBLY-EXPERIMENT.md`.
+
+1. **Hashing (dominant ingest cost).** SHA-256 with SHA-NI: ~1.5–2 GB/s/core.
+   BLAKE3 with AVX-512: 6–10+ GB/s/core, parallelizable, free Merkle tree.
+   Adopt BLAKE3 as native; retain SHA-256 for SLSA / OCI interop.
+2. **Content-defined chunking (FastCDC / Gear).** Rolling hash over every
+   byte; pure-Python is unusable, optimized C / Rust hits 5–10 GB/s.
+   Mandatory if dedup is on the roadmap.
+3. **Compression.** Zstd with bundled SIMD reaches multi-GB/s. Evidence logs
+   typically compress 5–20×. Apply at chunk level so dedup still works.
+4. **I/O path.** Linux: `io_uring` for ingest writes; `sendfile(2)` /
+   `splice(2)` for download zero-copy; `O_DIRECT` for very large objects.
+5. **Encryption.** AES-GCM with AES-NI: ~5 GB/s/core. ChaCha20-Poly1305
+   vector implementations for non-AES-NI hardware. Use Ring, BoringSSL, or
+   AWS-LC. Never write crypto by hand.
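+6. **Metadata hot paths.** Bloom or Cuckoo filter in front of the
+   "have I seen this hash?" lookup. Roughly 50 lines of Rust. The gain is
+   largest for never-seen hashes, which skip the metadata lookup entirely;
+   the ~100× figure is an estimate, not a measurement.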
+7. **Manifest canonicalization.** Signed manifests canonicalize on every
+   ingest and every verify. Pick a fast canonical CBOR / JCS impl.
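
Item 6 in the list above can be illustrated with a small Bloom filter. The sizing defaults and the double-hashing scheme are assumptions for the example, and `hashlib.blake2b` is used only because it is in the standard library; a production filter would be sized from the expected key count and target false-positive rate.

```python
import hashlib
from typing import Callable, List


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes) -> List[int]:
        # Double hashing: derive all bit indexes from two halves of one digest.
        d = hashlib.blake2b(key, digest_size=16).digest()
        h1 = int.from_bytes(d[:8], "little")
        h2 = int.from_bytes(d[8:], "little") | 1  # odd step avoids a zero stride
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


def seen_before(address: str, bloom: BloomFilter,
                lookup: Callable[[str], bool]) -> bool:
    """Consult the authoritative lookup only when the filter cannot rule it out."""
    key = address.encode("utf-8")
    return bloom.might_contain(key) and lookup(address)
```

A negative answer from the filter is definitive, so new content never touches the metadata store on this path; a positive answer is only a hint and is always confirmed by `lookup`.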
+
+Not worth native code: HTTP layer, retention engine, audit log, DB access,
+orchestration, workflow logic. Keep Python.
+
+## Concrete suggestions before WP-0001 lands
+
+- Add `digest_algorithm` to `artifact_files` (default `sha256`, allow
+  `blake3`).
+- Add `content_address` (e.g., `blake3:…`) as canonical storage key, with
+  `relative_path` retained as logical metadata.
+- Add `retrieval_tier` and `restore_status` to `storage_locations` now,
+  nullable.
+- Define the upload session resource shape even if v1 implements only
+  single-shot multipart.
+- Pin a manifest canonicalization (recommend JCS or canonical CBOR) and a
+  signing format target (cosign / Sigstore). Decide, do not implement.
+- Add an `events` table with a monotonic sequence number so a CDC feed is
+  trivial later.
+- Decide explicitly whether OCI artifact compatibility is a v2 goal or out of
+  scope. Either is fine; ambiguity will distort schema decisions.
+
+## What this review does not change
+
+INTENT and SCOPE remain correctly scoped for v1. The pilot path through
+WP-0001 should ship as planned. The schema annotations above are additive,
+not redirective. The platform ambition lives in `docs/PLATFORM-AMBITION.md`
+so it can guide later decisions without expanding the current workplan.