# INTENT

## Project Name

`artifact-store`

## Purpose

`artifact-store` is a generic artifact registry and storage gateway. It gives projects a stable place to register files, evidence packages, logs, reports, snapshots, exports, and other generated outputs without forcing every producer to invent its own retention, indexing, and storage rules.

The service owns artifact identity, metadata, provenance, retention decisions, lookup, and audit trails. Actual bytes are delegated to one or more configured storage backends such as a local filesystem, S3-compatible object storage, Ceph RGW, AWS S3, Azure Blob Storage, Google Cloud Storage, or future archival tiers.

## Product Thesis

Generated artifacts become valuable when they are findable, attributable, retained for the right amount of time, and safely discardable when they are no longer needed.

Teams should be able to preserve a run result, point Statehub or another system at its durable registry record, and later prove which files were stored, which hashes they had, where they lived, and when retention was extended or released.

`artifact-store` exists to make artifact preservation a shared platform concern instead of an ad hoc directory convention.

## Primary Use Case

Given a producer such as `guide-board`, a completed assessment run, and an artifact package directory, `artifact-store` should:

1. register the package and its files,
2. compute and store content hashes and sizes,
3. capture producer, subject, run, repository, commit, and environment metadata,
4. select the applicable retention rule,
5. write files through a configured storage backend,
6. record all storage locations and backend object keys,
7. provide stable retrieval metadata and download links,
8. allow retention extension or hold decisions,
9. expose enough index data for Statehub, release records, and future UIs,
10. make expired artifacts eligible for deletion through an auditable process.

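
Steps 1 and 2 of the list above (registering files and computing content hashes and sizes) can be sketched as a small manifest builder. This is a minimal illustration, not the service's implementation: the names `FileEntry`, `hash_file`, and `build_manifest` are hypothetical, and SHA-256 is an assumed digest algorithm, since this document does not name one.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FileEntry:
    """One manifest row. Field names are illustrative, not a fixed schema."""
    path: str    # path relative to the package root, POSIX-style
    size: int    # size in bytes
    sha256: str  # hex-encoded digest (ASSUMPTION: SHA-256 is the chosen hash)


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large artifacts are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path) -> list[FileEntry]:
    """Walk the package directory and record path, size, and digest per file."""
    files = (p for p in package_dir.rglob("*") if p.is_file())
    entries = []
    # Sort by POSIX path string so the manifest order is deterministic.
    for path in sorted(files, key=lambda p: p.as_posix()):
        entries.append(
            FileEntry(
                path=path.relative_to(package_dir).as_posix(),
                size=path.stat().st_size,
                sha256=hash_file(path),
            )
        )
    return entries


# Usage: build a manifest for a throwaway package directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "logs").mkdir()
    (root / "report.json").write_bytes(b"{}")
    (root / "logs" / "run.log").write_bytes(b"ok\n")
    manifest = build_manifest(root)

for entry in manifest:
    print(entry.path, entry.size, entry.sha256[:12])
```

Storing paths relative to the package root keeps the manifest independent of where the producer happened to write the directory, which matters once the bytes move to a different backend.
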

The first concrete pilot is preserving `guide-board` / `open-cmis-tck` assessment output for `kontextual-engine`.

## Intended Users

- Assessment and compliance tools that produce evidence packages.
- Build, release, and quality systems that need durable generated outputs.
- Statehub and repository automation that need to link work records to preserved evidence.
- Operators who need retention visibility and controlled deletion.
- Future UI and agent workflows that need artifact search, download, or restore status.

## Core Concepts

- Artifact package: a logical collection of files registered together, such as a guide-board assessment run directory.
- Artifact file: one stored file with a path, media type, size, digest, and storage location.
- Registry record: metadata and lifecycle state for an artifact package or file.
- Storage backend: a configured adapter that stores and retrieves bytes.
- Storage location: a backend-specific pointer such as a bucket/key, filesystem path, or future archive locator.
- Retention class: a named policy category such as transient, raw-evidence, release-evidence, audit-prep, or permanent-record.
- Retention rule: the default storage duration and deletion behavior for a class.
- Retention extension: a time-bounded extension of an artifact's expiry date.
- Hold: a stronger instruction that prevents deletion until explicitly released.
- Retrieval tier: a future storage or access class such as hot, warm, cold, or archived.

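
The interaction between retention class, retention extension, and hold described above can be sketched as a small expiry calculation. Everything numeric here is an assumption: the class names come from this document, but the day counts are placeholders, and `RetentionState`, `expires_at`, and `deletable` are hypothetical names. The sketch also assumes that an extension can only push expiry later, never earlier.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# ASSUMPTION: these durations are placeholders for illustration only.
# The class names come from the document; the day counts do not.
DEFAULT_RETENTION_DAYS = {
    "transient": 7,
    "raw-evidence": 90,
    "release-evidence": 365,
    "audit-prep": 730,
    "permanent-record": None,  # None means the class never expires
}


@dataclass
class RetentionState:
    retention_class: str
    registered_at: datetime
    extended_until: Optional[datetime] = None  # latest extension, if any
    hold_active: bool = False                  # a hold blocks deletion until released

    def expires_at(self) -> Optional[datetime]:
        """Return the effective expiry, or None if the artifact never expires."""
        days = DEFAULT_RETENTION_DAYS[self.retention_class]
        if days is None:
            return None
        base = self.registered_at + timedelta(days=days)
        if self.extended_until is not None:
            # ASSUMPTION: an extension never shortens the default retention.
            return max(base, self.extended_until)
        return base

    def deletable(self, now: datetime) -> bool:
        """Eligible for deletion only when expired and not under a hold."""
        if self.hold_active:
            return False
        expiry = self.expires_at()
        return expiry is not None and now >= expiry


# Usage: one transient artifact checked eight days after registration.
registered = datetime(2025, 1, 1, tzinfo=timezone.utc)
check_time = registered + timedelta(days=8)

plain = RetentionState("transient", registered)
extended = RetentionState(
    "transient", registered, extended_until=registered + timedelta(days=30)
)
held = RetentionState("transient", registered, hold_active=True)
permanent = RetentionState("permanent-record", registered)

for label, state in [("plain", plain), ("extended", extended),
                     ("held", held), ("permanent", permanent)]:
    print(label, state.expires_at(), state.deletable(check_time))
```

Keeping the hold as a separate flag rather than a far-future expiry date preserves the distinction drawn above: an extension lapses on its own, while a hold stays in force until someone releases it.
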

## Scope

In scope:

- metadata registry for artifact packages and files,
- content hashing and manifest generation,
- pluggable storage backend interface,
- local filesystem backend for development,
- S3-compatible backend suitable for Ceph RGW,
- default retention classes and expiry calculation,
- retention extension and hold records,
- retrieval metadata and download path generation,
- audit events for ingestion, retrieval, retention changes, and deletion,
- API-first service suitable for automation,
- pilot integration with guide-board assessment runs.

Out of scope for the initial service:

- replacing Statehub as the work, repository, or decision system of record,
- embedding guide-board-specific assessment semantics in the registry core,
- full compliance certification or legal-record guarantees,
- cloud-provider-specific lifecycle automation beyond backend adapter hooks,
- asynchronous cold-archive restore flows,
- user-facing UI beyond API contracts and minimal operator docs.

## Relationship To Other Services

`artifact-store` should remain a shared infrastructure service.

- `guide-board` produces assessment packages and asks `artifact-store` to preserve them.
- `open-cmis-tck` can add CMIS-specific scorecards and log reviews before a guide-board run is ingested.
- `Statehub` records work, decisions, repository state, and links to artifact registry identifiers.
- Ceph is a strong self-hosted storage backend candidate because its RGW layer is S3-compatible, but the registry must not be Ceph-only.

## Boundary

The registry can prove what it stored, where it stored it, which hashes it computed, and which retention decisions were applied. It does not prove the truth of the artifact contents, certify a system, or replace formal records management without additional governance.
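
As an illustration of the "where it stored it" part of this boundary, the pluggable storage backend interface and local filesystem backend listed under Scope might look roughly like the sketch below. The names `StorageBackend`, `LocalFilesystemBackend`, `put`, and `get`, the key layout, and the use of a `file://` URI as the storage location are all assumptions made for this example, not a defined contract.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical adapter contract: store bytes under a key, read them back."""

    def put(self, key: str, data: bytes) -> str:
        """Store data and return a backend-specific storage location."""
        ...

    def get(self, key: str) -> bytes:
        """Return the bytes previously stored under key."""
        ...


class LocalFilesystemBackend:
    """Development backend that maps keys to files under one root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()

    def _resolve(self, key: str) -> Path:
        target = (self.root / key).resolve()
        # Reject keys such as "../x" that would land outside the root.
        if not target.is_relative_to(self.root):  # requires Python 3.9+
            raise ValueError(f"key escapes backend root: {key!r}")
        return target

    def put(self, key: str, data: bytes) -> str:
        target = self._resolve(key)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return target.as_uri()  # ASSUMPTION: location recorded as a file:// URI

    def get(self, key: str) -> bytes:
        return self._resolve(key).read_bytes()


# Usage: round-trip one file and confirm that an escaping key is rejected.
with tempfile.TemporaryDirectory() as tmp:
    backend: StorageBackend = LocalFilesystemBackend(Path(tmp))
    location = backend.put("runs/run-001/report.json", b"{}")
    round_trip = backend.get("runs/run-001/report.json")
    try:
        backend.put("../outside.txt", b"nope")
        escape_rejected = False
    except ValueError:
        escape_rejected = True

print(location, round_trip, escape_rejected)
```

Because the registry records only the returned location and the key, an S3-compatible adapter could satisfy the same two methods and return a bucket/key pointer instead, which is what keeps the registry from becoming Ceph-only.
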