Archive Integration With artifact-store
infospace-bench is an application workspace for live infospaces. The
working state lives in a local folder and is read and rewritten across many
sessions. Durable, content-addressed preservation of finalized snapshots is
delegated to artifact-store, which
owns identity, manifests, retention policy, audit, and pluggable storage
backends (local FS today, S3-compatible / Ceph RGW in artifact-store WP-0004).
This document is the operator-facing companion to workplan
IB-WP-0014.
When to archive
Archive an infospace when:
- A milestone has been reached (pilot complete, evaluations stable).
- The infospace will be referenced from another system (StateHub linkage, release notes, audit evidence).
- You want a recoverable point-in-time snapshot before a destructive change.
- You need to share an exact, hash-verifiable copy of the state with someone else.
Do not archive as a substitute for normal save / commit. Each archive creates a new immutable package; long sequences of archives without intent will inflate the local store. Use git for in-flight working state.
What gets archived
By default, the archive includes:
- infospace.yaml
- artifacts/
- workflows/
- output/ (metrics, evaluations, run records, memory traces, ...)
- reports/
- exports/
Always excluded:
- output/archives/.store/ (the artifact-store data dir; including it would cause recursive capture)
- output/archives/index.yaml (the archive record index itself is a local pointer, not part of the preserved snapshot)
Override the include / exclude sets with --include and --exclude
(repeatable). Both accept relative paths or globs.
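
The selection rules above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: `select_files`, `DEFAULT_INCLUDE`, and `ALWAYS_EXCLUDE` are hypothetical names, and the matching semantics (exact path, directory prefix, or `fnmatch` glob) are an assumption about how --include and --exclude behave.

```python
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical constants mirroring the lists above; not the tool's real names.
DEFAULT_INCLUDE = ["infospace.yaml", "artifacts", "workflows", "output", "reports", "exports"]
ALWAYS_EXCLUDE = ["output/archives/.store", "output/archives/index.yaml"]


def _matches(rel: str, pattern: str) -> bool:
    """True if rel equals pattern, lies under it as a directory, or matches it as a glob."""
    pattern = pattern.rstrip("/")
    return rel == pattern or rel.startswith(pattern + "/") or fnmatch(rel, pattern)


def select_files(root: Path, include=None, exclude=None) -> list[str]:
    """Return the sorted relative paths of files that would be archived."""
    include = list(include or DEFAULT_INCLUDE)
    exclude = ALWAYS_EXCLUDE + list(exclude or [])
    selected = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # Exclusions win over inclusions, so the store never captures itself.
        if any(_matches(rel, p) for p in exclude):
            continue
        if any(_matches(rel, p) for p in include):
            selected.append(rel)
    return sorted(selected)
```

Checking exclusions first is what keeps output/archives/.store/ out of the snapshot even though output/ is in the default include set.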
Retention class
artifact-store ships these retention classes:
| Class | Typical use |
|---|---|
| transient | Scratch outputs you only need briefly |
| raw-evidence | Untriaged raw run output |
| summary-evidence | Aggregated metrics / reports |
| release-evidence | Snapshots tied to a release or milestone |
| permanent-record | Never expires |
The infospace-bench default is release-evidence. Override with
--retention-class. Run artifactstore retention sweep from the
artifact-store repo to mark expired packages eligible for deletion.
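
As a rough mental model of what the sweep decides, the sketch below assumes a package becomes eligible once its expiry has passed and no hold is active. `RetentionState` and `is_eligible_for_deletion` are hypothetical names for illustration; the actual rules and types belong to artifact-store and may include further checks.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RetentionState:
    """Hypothetical view of one package's retention state (not artifact-store's real type)."""
    retention_class: str
    expires_at: Optional[datetime]  # None models a class that never expires
    holds: list = field(default_factory=list)


def is_eligible_for_deletion(state: RetentionState, now: Optional[datetime] = None) -> bool:
    """Assumed rule: the package has expired and carries no active hold."""
    now = now or datetime.now(timezone.utc)
    if state.holds:
        return False  # a hold blocks deletion regardless of expiry
    if state.expires_at is None:
        return False  # e.g. permanent-record
    return now >= state.expires_at
```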
CLI usage
# Archive the current infospace (default include set)
infospace-bench archive infospaces/agentic-memory-profile-pilot \
    --note "Memory profile pilot v1 frozen"

# Custom include set
infospace-bench archive infospaces/lefevre \
    --include reports --include exports --include infospace.yaml \
    --retention-class summary-evidence

# List recorded archives
infospace-bench archive-list infospaces/agentic-memory-profile-pilot

# List with current retention state (eligibility, holds, expiry)
infospace-bench archive-list infospaces/agentic-memory-profile-pilot \
    --with-retention

# Restore an archive into a new directory
infospace-bench restore <package-id> \
    --target /tmp/restored-infospace \
    --from infospaces/agentic-memory-profile-pilot
Storage location
By default, each infospace gets its own self-contained artifact-store under
<infospace>/output/archives/.store/:
output/archives/
  index.yaml              # human-readable archive record list
  .store/
    registry.sqlite       # artifact-store event log + materialised views
    storage/
      blake3/
        ab/
          cd/
            abcdef...     # content-addressed bytes
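
The tree above suggests a two-level sharding scheme: the first two hex characters of the digest name the first directory, the next two name the second, and the file itself carries the full digest. The following sketch encodes that reading of the layout. `object_path` is a hypothetical helper, and the exact sharding rule is an inference from the tree rather than a documented artifact-store contract.

```python
from pathlib import Path


def object_path(store_root: Path, digest: str) -> Path:
    """Map '<algo>:<hex>' to storage/<algo>/<hex[0:2]>/<hex[2:4]>/<hex> under store_root."""
    algo, _, hex_digest = digest.partition(":")
    if not algo or len(hex_digest) < 4:
        raise ValueError(f"unexpected digest format: {digest!r}")
    return store_root / "storage" / algo / hex_digest[:2] / hex_digest[2:4] / hex_digest
```

Sharding by digest prefix keeps any single directory from accumulating an unmanageable number of entries as the store grows.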
To point at a different artifact-store deployment (shared host, separate
volume), pass --store-root or run a shared artifact-store service and pass
its CLI / library handle in code. Future improvement: respect the standard
ARTIFACTSTORE_* environment variables so an operator can point any
infospace at a shared deployment without code changes. Today the in-process
helper builds a self-contained store; an artifactstore.app.build_registry()
adapter for that env-driven path is a small follow-up.
Credentials policy
- Never write secrets (API keys, S3 access keys) into infospace.yaml or archive metadata. Archive metadata is part of the immutable manifest.
- Backend secrets live with the artifact-store deployment (ARTIFACTSTORE_S3_ACCESS_KEY_REF=env:NAME or file:/run/secrets/...), never inside the infospace.
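
The point of the `env:NAME` / `file:/path` reference forms is that configuration stores only a pointer, and the secret value is looked up at run time. A minimal sketch of such a resolver follows. `resolve_secret_ref` is a hypothetical helper written for illustration; artifact-store's own resolution logic is not shown in this document and may differ.

```python
import os
from pathlib import Path


def resolve_secret_ref(ref: str) -> str:
    """Resolve 'env:NAME' or 'file:/path' to the secret value (hypothetical helper)."""
    scheme, sep, target = ref.partition(":")
    if not sep or not target:
        raise ValueError("secret reference must look like env:NAME or file:/path")
    if scheme == "env":
        try:
            return os.environ[target]
        except KeyError:
            raise KeyError(f"environment variable {target} is not set") from None
    if scheme == "file":
        # Secret files commonly end with a trailing newline; strip it.
        return Path(target).read_text(encoding="utf-8").strip()
    raise ValueError(f"unsupported secret reference scheme: {scheme!r}")
```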
Round-trip guarantees
- restore_archive re-materializes every file recorded in the package's manifest into the target directory, byte-equivalent to the originals.
- The manifest digest (blake3:<hex>) returned by archive is the stable external identifier; it survives store relocations.
- Restoration refuses to overwrite a non-empty target unless --force is passed. Pre-existing files not in the manifest are left in place.
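
The overwrite guard and the "left in place" behaviour can be illustrated with a small stand-in. This sketch does not call the real restore_archive: `ensure_restorable` and `restore_files` are hypothetical helpers, and the manifest is modelled as a plain mapping from relative path to bytes purely for demonstration.

```python
from pathlib import Path


def ensure_restorable(target: Path, force: bool = False) -> None:
    """Create target if needed; refuse a non-empty directory unless force is set."""
    if target.exists() and not target.is_dir():
        raise NotADirectoryError(f"{target} exists and is not a directory")
    if target.is_dir() and any(target.iterdir()) and not force:
        raise FileExistsError(f"{target} is not empty; pass force=True to restore anyway")
    target.mkdir(parents=True, exist_ok=True)


def restore_files(manifest: dict[str, bytes], target: Path, force: bool = False) -> None:
    """Write each manifest entry under target; files outside the manifest are untouched."""
    ensure_restorable(target, force)
    for rel, data in manifest.items():
        dest = target / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(data)
```

Because only manifest paths are written, a forced restore into a populated directory overwrites matching files and leaves everything else as it was.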
What this is not
- Not a replacement for the local working folder during active work.
- Not a sync / replication channel between hosts. Use git or artifact-store's S3 backend (artifact-store WP-0004) for that.
- Not a backup strategy. Backups are an operations concern at the artifact-store deployment level.
- Not an S3 or git client inside infospace-bench. Those backends live in artifact-store.
Related workplans
- IB-WP-0014: this integration.
- IB-WP-0013: generation parity on the local working folder (archives capture its outputs).
- artifact-store WP-0004: S3-compatible / Ceph RGW backend; pointing infospace-bench archives at S3 will be an artifact-store configuration change only.