# Archive Integration With artifact-store

`infospace-bench` is an application workspace for *live* infospaces. The working state lives in a local folder and is read-write-read-write across many sessions. Durable, content-addressed preservation of finalized snapshots is delegated to [`artifact-store`](file:///home/worsch/artifact-store), which owns identity, manifests, retention policy, audit, and pluggable storage backends (local FS today, S3-compatible / Ceph RGW in artifact-store WP-0004).

This document is the operator-facing companion to workplan [`IB-WP-0014`](../workplans/IB-WP-0014-infospace-backend-abstraction.md).

## When to archive

Archive an infospace when:

- A milestone has been reached (pilot complete, evaluations stable).
- The infospace will be referenced from another system (StateHub linkage, release notes, audit evidence).
- You want a recoverable point-in-time snapshot before a destructive change.
- You need to share an exact, hash-verifiable copy of the state with someone else.

Do **not** archive as a substitute for normal save / commit. Each archive creates a new immutable package; long sequences of archives without intent will inflate the local store. Use git for in-flight working state.

## What gets archived

By default, the archive includes:

- `infospace.yaml`
- `artifacts/`
- `workflows/`
- `output/` (metrics, evaluations, run records, memory traces, ...)
- `reports/`
- `exports/`

Always excluded:

- `output/archives/.store/` (the artifact-store data dir; including it would cause recursive capture)
- `output/archives/index.yaml` (the archive record index itself is a local pointer, not part of the preserved snapshot)

Override the include / exclude sets with `--include` and `--exclude` (repeatable). Both accept relative paths or globs.

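
The selection rules above can be sketched in Python. This is a minimal illustration, not the actual `infospace-bench` implementation: `select_files`, `DEFAULT_INCLUDE`, and `ALWAYS_EXCLUDED` are hypothetical names, and the glob semantics (Python's `fnmatch`, where `*` also matches `/`) are an assumption.

```python
from fnmatch import fnmatch
from pathlib import Path

# Default include set and fixed exclusions, copied from the lists above.
DEFAULT_INCLUDE = ["infospace.yaml", "artifacts", "workflows",
                   "output", "reports", "exports"]
ALWAYS_EXCLUDED = ["output/archives/.store", "output/archives/index.yaml"]


def _matches(rel: str, pattern: str) -> bool:
    """True if rel equals pattern, lies beneath it, or matches it as a glob."""
    pattern = pattern.rstrip("/")
    return (rel == pattern
            or rel.startswith(pattern + "/")
            or fnmatch(rel, pattern))


def select_files(root, include=None, exclude=None):
    """Return sorted POSIX-style relative paths of files that would be archived.

    Hypothetical helper: exclusions (user-supplied plus the fixed ones)
    always win over inclusions.
    """
    root = Path(root)
    include = list(include or DEFAULT_INCLUDE)
    exclude = list(exclude or []) + ALWAYS_EXCLUDED
    selected = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if any(_matches(rel, p) for p in exclude):
            continue
        if any(_matches(rel, p) for p in include):
            selected.append(rel)
    return sorted(selected)
```

Calling `select_files(root, include=["reports"])` in this sketch would mirror passing `--include reports` on the command line; the two fixed exclusions apply regardless of what the caller passes.
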
## Retention class

`artifact-store` ships these retention classes:

| Class              | Typical use                              |
|--------------------|------------------------------------------|
| `transient`        | Scratch outputs you only need briefly    |
| `raw-evidence`     | Untriaged raw run output                 |
| `summary-evidence` | Aggregated metrics / reports             |
| `release-evidence` | Snapshots tied to a release or milestone |
| `permanent-record` | Never expires                            |

The infospace-bench default is `release-evidence`. Override with `--retention-class`. Run `artifactstore retention sweep` from the `artifact-store` repo to mark expired packages eligible for deletion.

## CLI usage

```bash
# Archive the current infospace (default include set)
infospace-bench archive infospaces/agentic-memory-profile-pilot \
  --note "Memory profile pilot v1 frozen"

# Custom include set
infospace-bench archive infospaces/lefevre \
  --include reports --include exports --include infospace.yaml \
  --retention-class summary-evidence

# List recorded archives
infospace-bench archive-list infospaces/agentic-memory-profile-pilot

# List with current retention state (eligibility, holds, expiry)
infospace-bench archive-list infospaces/agentic-memory-profile-pilot \
  --with-retention

# Restore an archive into a new directory
infospace-bench restore \
  --target /tmp/restored-infospace \
  --from infospaces/agentic-memory-profile-pilot
```

## Storage location

By default, each infospace gets its own self-contained artifact-store under `/output/archives/.store/`:

```
output/archives/
  index.yaml            # human-readable archive record list
  .store/
    registry.sqlite     # artifact-store event log + materialised views
    storage/
      blake3/
        ab/
          cd/
            abcdef...   # content-addressed bytes
```

To point at a different artifact-store deployment (shared host, separate volume), pass `--store-root`, or run a shared artifact-store service and pass its CLI / library handle in code.

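
The content-addressed layout in the tree above can be illustrated with a short Python sketch. It assumes, based only on that tree, that a digest has the form `blake3:` followed by a hex string and that the store shards on the first two pairs of hex characters. `object_path` is a hypothetical helper, not part of the `artifact-store` API.

```python
from pathlib import Path

HEX_CHARS = set("0123456789abcdef")


def object_path(store_root, digest: str) -> Path:
    """Map '<algorithm>:<hex>' to storage/<algorithm>/<hex[0:2]>/<hex[2:4]>/<hex>.

    Assumed layout, inferred from the directory tree in this document.
    """
    algorithm, sep, hex_digest = digest.partition(":")
    if not sep or not algorithm:
        raise ValueError(f"expected '<algorithm>:<hex>', got {digest!r}")
    hex_digest = hex_digest.lower()
    if len(hex_digest) < 4 or not set(hex_digest) <= HEX_CHARS:
        raise ValueError(f"not a usable hex digest: {hex_digest!r}")
    return (Path(store_root) / "storage" / algorithm
            / hex_digest[:2] / hex_digest[2:4] / hex_digest)
```

Sharding on leading hex characters keeps any single directory from accumulating every object in the store, which is the usual reason for a nested layout like this one.
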

Future improvement: respect the standard `ARTIFACTSTORE_*` environment variables so an operator can point any infospace at a shared deployment without code changes. Today the in-process helper builds a self-contained store; an `artifactstore.app.build_registry()` adapter for that env-driven path is a small follow-up.

## Credentials policy

- Never write secrets (API keys, S3 access keys) into `infospace.yaml` or archive metadata. Archive metadata is part of the immutable manifest.
- Backend secrets live with the artifact-store deployment (`ARTIFACTSTORE_S3_ACCESS_KEY_REF=env:NAME` or `file:/run/secrets/...`), never inside the infospace.

## Round-trip guarantees

- `restore_archive` re-materializes every file recorded in the package's manifest into the target directory, byte-equivalent to the originals.
- The manifest digest (`blake3:`) returned by `archive` is the stable external identifier; it survives store relocations.
- Restoration refuses to overwrite a non-empty target unless `--force` is passed. Pre-existing files not in the manifest are left in place.

## What this is not

- Not a replacement for the local working folder during active work.
- Not a sync / replication channel between hosts. Use git or artifact-store's S3 backend (artifact-store WP-0004) for that.
- Not a backup strategy. Backups are an operations concern at the artifact-store deployment level.
- Not an S3 or git client inside `infospace-bench`. Those backends live in `artifact-store`.

## Related workplans

- [`IB-WP-0014`](../workplans/IB-WP-0014-infospace-backend-abstraction.md): this integration.
- [`IB-WP-0013`](../workplans/IB-WP-0013-wealth-vsm-generation-pipeline-parity.md): generation parity on the local working folder (archives capture its outputs).
- `artifact-store` WP-0004: S3-compatible / Ceph RGW backend; pointing infospace-bench archives at S3 will be an artifact-store configuration change only.