# INTENT

## Project Name

`artifact-store`

## Purpose

`artifact-store` is a generic artifact registry and storage gateway. It gives
projects a stable place to register files, evidence packages, logs, reports,
snapshots, exports, and other generated outputs without forcing every producer
to invent its own retention, indexing, and storage rules.

The service owns artifact identity, metadata, provenance, retention decisions,
lookup, and audit trails. Storage of the actual bytes is delegated to one or
more configured storage backends such as a local filesystem, S3-compatible
object storage, Ceph RGW, AWS S3, Azure Blob Storage, Google Cloud Storage, or
future archival tiers.
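
The split between the registry and its storage backends could look roughly like the sketch below. It is only an illustration: the names `StorageBackend`, `LocalFilesystemBackend`, `put`, and `get` are hypothetical and not taken from the project, and the real adapter contract may differ.

```python
from pathlib import Path
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical adapter contract: store and retrieve bytes by object key."""

    name: str

    def put(self, key: str, data: bytes) -> str:
        """Store data under key and return a backend-specific location."""
        ...

    def get(self, key: str) -> bytes:
        """Return the bytes previously stored under key."""
        ...


class LocalFilesystemBackend:
    """Development backend that maps object keys to paths under a root directory."""

    name = "local-fs"  # assumed identifier, for illustration only

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()

    def _path_for(self, key: str) -> Path:
        path = (self.root / key).resolve()
        # Reject keys that would escape the root, such as "../outside".
        if not path.is_relative_to(self.root):
            raise ValueError(f"key escapes backend root: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> str:
        path = self._path_for(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        # The returned URI is what a registry could record as the storage location.
        return path.as_uri()

    def get(self, key: str) -> bytes:
        return self._path_for(key).read_bytes()
```

An S3-compatible adapter would satisfy the same two methods, which is what would keep the registry core independent of any single backend.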

## Product Thesis

Generated artifacts become valuable when they are findable, attributable,
retained for the right amount of time, and safely discardable when they are no
longer needed. Teams should be able to preserve a run result, point Statehub or
another system at its durable registry record, and later prove which files were
stored, which hashes they had, where they lived, and when retention was extended
or released.

`artifact-store` exists to make artifact preservation a shared platform concern
instead of an ad hoc directory convention.

## Primary Use Case

Given a producer such as `guide-board`, a completed assessment run, and an
artifact package directory, `artifact-store` should:

1. register the package and its files,
2. compute and store content hashes and sizes,
3. capture producer, subject, run, repository, commit, and environment metadata,
4. select the applicable retention rule,
5. write files through a configured storage backend,
6. record all storage locations and backend object keys,
7. provide stable retrieval metadata and download links,
8. allow retention extension or hold decisions,
9. expose enough index data for Statehub, release records, and future UIs,
10. make expired artifacts eligible for deletion through an auditable process.

The first concrete pilot is preserving `guide-board` / `open-cmis-tck`
assessment output for `kontextual-engine`.
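
Steps 1 and 2 of the list above (registering files and computing hashes and sizes) can be sketched as a manifest builder. This is a minimal illustration under assumptions: the function name `build_manifest`, the dictionary keys, and the choice of SHA-256 are not specified by the project, which only calls for "content hashes".

```python
import hashlib
from pathlib import Path


def build_manifest(package_dir: Path) -> list[dict]:
    """Return one entry per file with its relative path, size, and digest."""
    entries = []
    for path in sorted(p for p in package_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256()  # assumed algorithm, see the note above
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large logs are not loaded into memory.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        entries.append(
            {
                "path": path.relative_to(package_dir).as_posix(),
                "size": path.stat().st_size,
                "sha256": digest.hexdigest(),
            }
        )
    return entries
```

Recording paths relative to the package directory keeps the manifest valid after the files move to a different backend or bucket.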

## Intended Users

- Assessment and compliance tools that produce evidence packages.
- Build, release, and quality systems that need durable generated outputs.
- Statehub and repository automation that need to link work records to
  preserved evidence.
- Operators who need retention visibility and controlled deletion.
- Future UI and agent workflows that need artifact search, download, or restore
  status.

## Core Concepts

- Artifact package: a logical collection of files registered together, such as a
  guide-board assessment run directory.
- Artifact file: one stored file with a path, media type, size, digest, and
  storage location.
- Registry record: metadata and lifecycle state for an artifact package or file.
- Storage backend: a configured adapter that stores and retrieves bytes.
- Storage location: a backend-specific pointer such as a bucket/key, filesystem
  path, or future archive locator.
- Retention class: a named policy category such as transient, raw-evidence,
  release-evidence, audit-prep, or permanent-record.
- Retention rule: the default storage duration and deletion behavior for a class.
- Retention extension: a time-bounded extension of an artifact's expiry date.
- Hold: a stronger instruction that prevents deletion until explicitly released.
- Retrieval tier: a future storage or access class such as hot, warm, cold, or
  archived.
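
How retention class, rule, extension, and hold interact can be sketched as below. The class names come from the list above, but everything else is an assumption for illustration: the durations are placeholders rather than project defaults, and `RegistryRecord`, `expires_at`, and `deletable` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Placeholder durations only; None means the class never expires.
RETENTION_RULES: dict[str, Optional[timedelta]] = {
    "transient": timedelta(days=7),
    "raw-evidence": timedelta(days=90),
    "release-evidence": timedelta(days=365),
    "audit-prep": timedelta(days=730),
    "permanent-record": None,
}


@dataclass
class RegistryRecord:
    package_id: str
    retention_class: str
    registered_at: datetime
    extended_until: Optional[datetime] = None  # retention extension, if any
    on_hold: bool = False  # a hold blocks deletion until released

    def expires_at(self) -> Optional[datetime]:
        duration = RETENTION_RULES[self.retention_class]
        if duration is None:
            return None
        default_expiry = self.registered_at + duration
        if self.extended_until is not None:
            # An extension can only push the expiry later, never earlier.
            return max(default_expiry, self.extended_until)
        return default_expiry

    def deletable(self, now: datetime) -> bool:
        if self.on_hold:
            return False
        expiry = self.expires_at()
        return expiry is not None and now >= expiry
```

In this sketch a hold overrides any expiry date, while an extension only moves the date, which mirrors the distinction between the two concepts in the list.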

## Scope

In scope:

- metadata registry for artifact packages and files,
- content hashing and manifest generation,
- pluggable storage backend interface,
- local filesystem backend for development,
- S3-compatible backend suitable for Ceph RGW,
- default retention classes and expiry calculation,
- retention extension and hold records,
- retrieval metadata and download path generation,
- audit events for ingestion, retrieval, retention changes, and deletion,
- API-first service suitable for automation,
- pilot integration with guide-board assessment runs.

Out of scope for the initial service:

- replacing Statehub as the work, repository, or decision system of record,
- embedding guide-board-specific assessment semantics in the registry core,
- full compliance certification or legal-record guarantees,
- cloud-provider-specific lifecycle automation beyond backend adapter hooks,
- asynchronous cold-archive restore flows,
- user-facing UI beyond API contracts and minimal operator docs.

## Relationship To Other Services

`artifact-store` should remain a shared infrastructure service.

- `guide-board` produces assessment packages and asks `artifact-store` to
  preserve them.
- `open-cmis-tck` can add CMIS-specific scorecards and log reviews before a
  guide-board run is ingested.
- `Statehub` records work, decisions, repository state, and links to artifact
  registry identifiers.
- Ceph is a strong self-hosted storage backend candidate because its RGW layer is
  S3-compatible, but the registry must not be Ceph-only.

## Boundary

The registry can prove what it stored, where it stored it, which hashes it
computed, and which retention decisions were applied. It does not prove the
truth of the artifact contents, certify a system, or replace formal records
management without additional governance.