generated from coulomb/repo-seed
IB-WP-0014: archive-list, restore, retention annotation, docs (T03-T05)
Round out IB-WP-0014 with the remaining archive operations and docs.

- restore_archive() and `infospace-bench restore <pkg> --target <dir>`
  round-trip a finalized package's bytes back to disk. Refuses to overwrite
  a non-empty target unless --force. --from <infospace-root> resolves the
  store location.
- archive-list CLI with --with-retention flag; annotate_retention() opens
  the per-infospace registry and joins each record with its current
  retention state (effective class, expires, holds, eligibility).
- docs/archive-integration.md covers when to archive, the include set,
  retention classes, storage layout, credentials policy, and the explicit
  non-goal that S3/git backends live in artifact-store.
- SCOPE.md cross-links the new doc.
- Workplan flipped to status: done.

Full pytest suite: 72 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1
SCOPE.md
@@ -84,5 +84,6 @@ The repo is intentionally above the lower layers:
 - Start with: `INTENT.md`, `wiki/ProductRequirementsDocument.md`,
   `wiki/FunctionalRequirementsSpecification.md`
 - Migration assessment: `docs/markitect-main-scope-assessment.md`
+- Archive integration with `artifact-store`: `docs/archive-integration.md`
 - Workplans: `workplans/`
 - State Hub rules: `CLAUDE.md` and `.claude/rules/`
151
docs/archive-integration.md
Normal file
@@ -0,0 +1,151 @@
# Archive Integration With artifact-store

`infospace-bench` is an application workspace for *live* infospaces. The
working state lives in a local folder and is read and rewritten across many
sessions. Durable, content-addressed preservation of finalized snapshots is
delegated to [`artifact-store`](file:///home/worsch/artifact-store), which
owns identity, manifests, retention policy, audit, and pluggable storage
backends (local FS today, S3-compatible / Ceph RGW in artifact-store WP-0004).

This document is the operator-facing companion to workplan
[`IB-WP-0014`](../workplans/IB-WP-0014-infospace-backend-abstraction.md).

## When to archive

Archive an infospace when:

- A milestone has been reached (pilot complete, evaluations stable).
- The infospace will be referenced from another system (StateHub linkage,
  release notes, audit evidence).
- You want a recoverable point-in-time snapshot before a destructive change.
- You need to share an exact, hash-verifiable copy of the state with someone
  else.

Do **not** archive as a substitute for normal save / commit. Each archive
creates a new immutable package; long sequences of archives without intent
will inflate the local store. Use git for in-flight working state.

## What gets archived

By default, the archive includes:

- `infospace.yaml`
- `artifacts/`
- `workflows/`
- `output/` (metrics, evaluations, run records, memory traces, ...)
- `reports/`
- `exports/`

Always excluded:

- `output/archives/.store/` (the artifact-store data dir; including it would
  cause recursive capture)
- `output/archives/index.yaml` (the archive record index itself is a local
  pointer, not part of the preserved snapshot)

Override the include / exclude sets with `--include` and `--exclude`
(repeatable). Both accept relative paths or globs.
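
The include / exclude behaviour described above can be sketched in a few
lines. This is an illustration only, not the `infospace-bench` implementation:
`select_paths` and `_matches` are hypothetical helpers, and the matching rule
(a pattern matches a path if it equals the path, names one of its parent
directories, or matches as an `fnmatch` glob) is an assumption.

```python
from fnmatch import fnmatch

# Defaults taken from the lists above.
DEFAULT_INCLUDE = ["infospace.yaml", "artifacts", "workflows", "output", "reports", "exports"]
ALWAYS_EXCLUDED = ["output/archives/.store", "output/archives/index.yaml"]


def _matches(path: str, pattern: str) -> bool:
    """Assumed rule: exact match, parent-directory match, or fnmatch glob."""
    pattern = pattern.rstrip("/")
    return path == pattern or path.startswith(pattern + "/") or fnmatch(path, pattern)


def select_paths(paths, include=None, exclude=None):
    """Return the relative paths that would be captured (hypothetical helper)."""
    include = include or DEFAULT_INCLUDE
    # The always-excluded entries apply even when the caller passes --exclude.
    exclude = list(exclude or []) + ALWAYS_EXCLUDED
    return [
        p
        for p in paths
        if any(_matches(p, pat) for pat in include)
        and not any(_matches(p, pat) for pat in exclude)
    ]


files = [
    "infospace.yaml",
    "artifacts/a.md",
    "output/metrics.json",
    "output/archives/.store/registry.sqlite",
    "output/archives/index.yaml",
    "notes/scratch.txt",
    "reports/r.md",
]
print(select_paths(files))
print(select_paths(files, include=["reports", "*.yaml"]))
```

Note that `fnmatch` lets `*` match across `/`, so `*.yaml` also matches nested
files; the always-excluded entries still win in this sketch.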

## Retention class

`artifact-store` ships these retention classes:

| Class                 | Typical use                                |
|-----------------------|--------------------------------------------|
| `transient`           | Scratch outputs you only need briefly      |
| `raw-evidence`        | Untriaged raw run output                   |
| `summary-evidence`    | Aggregated metrics / reports               |
| `release-evidence`    | Snapshots tied to a release or milestone   |
| `permanent-record`    | Never expires                              |

The infospace-bench default is `release-evidence`. Override with
`--retention-class`. Run `artifactstore retention sweep` from the
`artifact-store` repo to mark expired packages eligible for deletion.

## CLI usage

```bash
# Archive the current infospace (default include set)
infospace-bench archive infospaces/agentic-memory-profile-pilot \
    --note "Memory profile pilot v1 frozen"

# Custom include set
infospace-bench archive infospaces/lefevre \
    --include reports --include exports --include infospace.yaml \
    --retention-class summary-evidence

# List recorded archives
infospace-bench archive-list infospaces/agentic-memory-profile-pilot

# List with current retention state (eligibility, holds, expiry)
infospace-bench archive-list infospaces/agentic-memory-profile-pilot \
    --with-retention

# Restore an archive into a new directory
infospace-bench restore <package-id> \
    --target /tmp/restored-infospace \
    --from infospaces/agentic-memory-profile-pilot
```
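
The JSON printed by `archive-list --with-retention` can be post-processed with
a short script. The sketch below assumes the payload shape produced by
`annotate_retention()` in this commit: a top-level `archives` list whose
entries carry an `archive` record and a `retention` value that is `None`, an
`{"error": ...}` object, or a state object. `summarize_retention`, the bucket
names, and the rule that an active hold takes precedence over eligibility are
assumptions, and the sample values are invented for illustration.

```python
def summarize_retention(payload: dict) -> dict[str, list[str]]:
    """Group package ids by retention status (hypothetical helper)."""
    summary: dict[str, list[str]] = {"eligible": [], "held": [], "retained": [], "unknown": []}
    for item in payload.get("archives", []):
        package_id = item["archive"]["package_id"]
        retention = item.get("retention")
        if retention is None or "error" in retention:
            # Store not reachable, or the per-record lookup failed.
            summary["unknown"].append(package_id)
        elif retention.get("active_hold_id"):
            summary["held"].append(package_id)
        elif retention.get("eligible_for_deletion"):
            summary["eligible"].append(package_id)
        else:
            summary["retained"].append(package_id)
    return summary


# Invented sample data; real package ids are UUIDs.
sample = {
    "archives": [
        {"archive": {"package_id": "pkg-1"},
         "retention": {"effective_class": "release-evidence", "current_expires_at": None,
                       "active_hold_id": None, "eligible_for_deletion": False}},
        {"archive": {"package_id": "pkg-2"},
         "retention": {"effective_class": "transient", "current_expires_at": None,
                       "active_hold_id": "hold-1", "eligible_for_deletion": False}},
        {"archive": {"package_id": "pkg-3"},
         "retention": {"effective_class": "transient", "current_expires_at": None,
                       "active_hold_id": None, "eligible_for_deletion": True}},
        {"archive": {"package_id": "pkg-4"}, "retention": None},
        {"archive": {"package_id": "pkg-5"}, "retention": {"error": "KeyError: 'pkg-5'"}},
    ]
}
print(summarize_retention(sample))
```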

## Storage location

By default, each infospace gets its own self-contained artifact-store under
`<infospace>/output/archives/.store/`:

```
output/archives/
  index.yaml                  # human-readable archive record list
  .store/
    registry.sqlite           # artifact-store event log + materialised views
    storage/
      blake3/
        ab/
          cd/
            abcdef...         # content-addressed bytes
```
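
The sharded layout above can be read as "first two hex characters, next two
hex characters, then the digest". A minimal sketch of that mapping follows.
`shard_path` is a hypothetical helper, and treating the leaf name as the full
hex digest is an assumption based on the `abcdef...` placeholder; the actual
path scheme is owned by artifact-store.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def shard_path(digest_hex: str, algorithm: str = "blake3") -> PurePosixPath:
    """Map a hex digest to storage/<algorithm>/<aa>/<bb>/<digest> (assumed scheme)."""
    digest_hex = digest_hex.lower()
    if len(digest_hex) < 4 or not set(digest_hex) <= HEX_DIGITS:
        raise ValueError(f"Not a usable hex digest: {digest_hex!r}")
    # Two levels of two-character prefixes keep any single directory small.
    return PurePosixPath("storage", algorithm, digest_hex[:2], digest_hex[2:4], digest_hex)


print(shard_path("abcdef0123456789"))
```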

To point at a different artifact-store deployment (shared host, separate
volume), pass `--store-root` or run a shared artifact-store service and pass
its CLI / library handle in code. Future improvement: respect the standard
`ARTIFACTSTORE_*` environment variables so an operator can point any
infospace at a shared deployment without code changes. Today the in-process
helper builds a self-contained store; an `artifactstore.app.build_registry()`
adapter for that env-driven path is a small follow-up.

## Credentials policy

- Never write secrets (API keys, S3 access keys) into `infospace.yaml` or
  archive metadata. Archive metadata is part of the immutable manifest.
- Backend secrets live with the artifact-store deployment
  (`ARTIFACTSTORE_S3_ACCESS_KEY_REF=env:NAME` or
  `file:/run/secrets/...`), never inside the infospace.
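
The `env:NAME` and `file:/path` forms above are indirect references: the
configuration names where a secret lives instead of containing it. How
artifact-store resolves them is not shown here; the sketch below is one
plausible reading, with `resolve_secret_ref` as a hypothetical helper and the
trailing-whitespace stripping as an assumption.

```python
import os
from pathlib import Path


def resolve_secret_ref(ref: str) -> str:
    """Resolve an `env:NAME` or `file:/path` reference (hypothetical helper)."""
    scheme, sep, value = ref.partition(":")
    if not sep or not value:
        raise ValueError(f"Not a secret reference: {ref!r}")
    if scheme == "env":
        try:
            return os.environ[value]
        except KeyError:
            raise LookupError(f"Environment variable {value} is not set") from None
    if scheme == "file":
        # Assumption: secret files may end with a newline that is not part of the secret.
        return Path(value).read_text(encoding="utf-8").strip()
    raise ValueError(f"Unsupported secret reference scheme: {scheme!r}")
```

Because only the reference is stored, an archive manifest or `infospace.yaml`
that accidentally captured the setting would still not contain the secret.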

## Round-trip guarantees

- `restore_archive` re-materializes every file recorded in the package's
  manifest into the target directory, byte-equivalent to the originals.
- The manifest digest (`blake3:<hex>`) returned by `archive` is the stable
  external identifier; it survives store relocations.
- Restoration refuses to overwrite a non-empty target unless `--force` is
  passed. With `--force`, pre-existing files not in the manifest are left in
  place.
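
The two safety checks behind these guarantees (refusing a non-empty target,
and keeping every restored file inside the target) can be sketched as below.
This is a simplified illustration, not the code from this commit:
`check_restore_target` and `safe_destination` are hypothetical names, the
sketch raises built-in exceptions instead of `InfospaceError`, and it also
rejects a path that resolves to the target directory itself, which is an
assumption.

```python
from pathlib import Path


def check_restore_target(target: Path, force: bool = False) -> None:
    """Create the target if needed and refuse a non-empty one unless forced."""
    target.mkdir(parents=True, exist_ok=True)
    if not force and any(target.iterdir()):
        raise FileExistsError(f"Refusing to restore into non-empty directory: {target}")


def safe_destination(target: Path, relative_path: str) -> Path:
    """Resolve a manifest path and reject anything that escapes the target."""
    root = target.resolve()
    dest = (root / relative_path).resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if dest == root or not dest.is_relative_to(root):
        raise ValueError(f"Manifest path escapes target: {relative_path}")
    return dest
```

Resolving both paths before comparing them means that `..` segments and
absolute paths in a manifest are caught before any bytes are written.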

## What this is not

- Not a replacement for the local working folder during active work.
- Not a sync / replication channel between hosts. Use git or
  artifact-store's S3 backend (artifact-store WP-0004) for that.
- Not a backup strategy. Backups are an operations concern at the
  artifact-store deployment level.
- Not an S3 or git client inside `infospace-bench`. Those backends live in
  `artifact-store`.

## Related workplans

- [`IB-WP-0014`](../workplans/IB-WP-0014-infospace-backend-abstraction.md):
  this integration.
- [`IB-WP-0013`](../workplans/IB-WP-0013-wealth-vsm-generation-pipeline-parity.md):
  generation parity on the local working folder (archives capture its
  outputs).
- `artifact-store` WP-0004: S3-compatible / Ceph RGW backend; pointing
  infospace-bench archives at S3 will be an artifact-store configuration
  change only.
@@ -1,4 +1,11 @@
-from .archive import ArchiveRecord, archive_infospace, list_archives
+from .archive import (
+    ArchiveRecord,
+    RestoredArchive,
+    annotate_retention,
+    archive_infospace,
+    list_archives,
+    restore_archive,
+)
 from .errors import InfospaceError
 from .evaluation import EntityEvaluation, EvaluationSnapshot, MetricValue, ScoreEntry
 from .engine import (
@@ -43,6 +50,7 @@ from .workflow import load_workflows, plan_workflow, run_workflow
 __all__ = [
     "ArchiveRecord",
     "DisciplineBinding",
+    "RestoredArchive",
     "EntityEvaluation",
     "EvaluationSnapshot",
     "Infospace",
@@ -61,6 +69,7 @@ __all__ = [
     "TopicConfig",
     "ViabilityThreshold",
     "add_artifact",
+    "annotate_retention",
     "append_to_history",
     "archive_infospace",
     "create_infospace",
@@ -79,6 +88,7 @@ __all__ = [
     "read_snapshot",
     "record_check_results",
     "register_artifact",
+    "restore_archive",
     "load_workflows",
     "plan_workflow",
     "run_workflow",
@@ -13,6 +13,7 @@ from __future__ import annotations
 
 import asyncio
 import fnmatch
+import json
 import mimetypes
 import os
 from collections.abc import AsyncIterator, Iterable
@@ -20,6 +21,7 @@ from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from pathlib import Path
 from typing import Any
+from uuid import UUID
 
 import yaml
 from sqlalchemy import insert, select
@@ -142,6 +144,77 @@ def archive_infospace(
     )
 
 
+@dataclass(frozen=True)
+class RestoredArchive:
+    """Result of :func:`restore_archive`."""
+
+    package_id: str
+    manifest_digest: str
+    target: str
+    file_count: int
+    restored_paths: list[str]
+
+    def to_dict(self) -> dict[str, Any]:
+        return {
+            "package_id": self.package_id,
+            "manifest_digest": self.manifest_digest,
+            "target": self.target,
+            "file_count": self.file_count,
+            "restored_paths": list(self.restored_paths),
+        }
+
+
+def restore_archive(
+    package_id: str,
+    *,
+    target: str | Path,
+    store_root: str | Path | None = None,
+    source_infospace: str | Path | None = None,
+    registry: Registry | None = None,
+    force: bool = False,
+) -> RestoredArchive:
+    """Re-materialize an archived infospace package into ``target``.
+
+    Exactly one of ``store_root``, ``source_infospace``, or ``registry`` must
+    locate the artifact-store. ``source_infospace`` is a convenience that
+    resolves to ``<source_infospace>/output/archives/.store/``.
+    """
+
+    return asyncio.run(
+        _restore_archive_async(
+            package_id=package_id,
+            target=Path(target),
+            store_root=Path(store_root) if store_root else None,
+            source_infospace=Path(source_infospace) if source_infospace else None,
+            registry=registry,
+            force=force,
+        )
+    )
+
+
+def annotate_retention(
+    archives: Iterable[ArchiveRecord],
+    *,
+    store_root: str | Path | None = None,
+    source_infospace: str | Path | None = None,
+    registry: Registry | None = None,
+) -> list[dict[str, Any]]:
+    """Pair each record with its current retention state if reachable.
+
+    Returns a list of ``{"archive": ArchiveRecord.to_dict(), "retention": {...}|None}``
+    entries. Records whose registry cannot be opened get ``retention: None``.
+    """
+
+    return asyncio.run(
+        _annotate_retention_async(
+            tuple(archives),
+            store_root=Path(store_root) if store_root else None,
+            source_infospace=Path(source_infospace) if source_infospace else None,
+            registry=registry,
+        )
+    )
+
+
 def list_archives(root: str | Path) -> list[ArchiveRecord]:
     """Return the recorded archive entries for an infospace."""
     path = Path(root) / ARCHIVE_INDEX_PATH
@@ -323,3 +396,151 @@ async def _build_local_registry(store_root: Path) -> Registry:
     backend = LocalBackend(backend_root, backend_id="local")
     dataplane = InProcessDataPlane(backend)
     return Registry(engine, dataplane, RegistryViewWriter())
+
+
+def _resolve_store_root(
+    *,
+    store_root: Path | None,
+    source_infospace: Path | None,
+) -> Path | None:
+    if store_root is not None and source_infospace is not None:
+        raise InfospaceError(
+            "ambiguous_archive_store",
+            "Pass at most one of store_root or source_infospace",
+            {},
+        )
+    if store_root is not None:
+        return store_root
+    if source_infospace is not None:
+        return source_infospace / ARCHIVE_STORE_DIR
+    return None
+
+
+async def _restore_archive_async(
+    *,
+    package_id: str,
+    target: Path,
+    store_root: Path | None,
+    source_infospace: Path | None,
+    registry: Registry | None,
+    force: bool,
+) -> RestoredArchive:
+    owned_registry = registry is None
+    if owned_registry:
+        resolved_store = _resolve_store_root(
+            store_root=store_root,
+            source_infospace=source_infospace,
+        )
+        if resolved_store is None:
+            raise InfospaceError(
+                "missing_archive_store",
+                "restore_archive needs registry, store_root, or source_infospace",
+                {},
+            )
+        if not resolved_store.exists():
+            raise InfospaceError(
+                "missing_archive_store",
+                f"Archive store does not exist: {resolved_store}",
+                {"store_root": str(resolved_store)},
+            )
+        registry = await _build_local_registry(resolved_store)
+
+    try:
+        assert registry is not None
+        pkg_uuid = UUID(package_id)
+        pkg = await registry.get_package(pkg_uuid)
+        if pkg.manifest_digest_hex is None:
+            raise InfospaceError(
+                "unfinalized_package",
+                f"Package is not finalized: {package_id}",
+                {"package_id": package_id, "status": pkg.status},
+            )
+        manifest_bytes = await registry.get_manifest_bytes(pkg_uuid, format="json")
+        manifest = json.loads(manifest_bytes.decode("utf-8"))
+
+        target.mkdir(parents=True, exist_ok=True)
+        if not force and any(target.iterdir()):
+            raise InfospaceError(
+                "restore_target_not_empty",
+                f"Refusing to restore into non-empty directory: {target}",
+                {"target": str(target)},
+            )
+
+        restored: list[str] = []
+        for entry in manifest.get("files", []):
+            rel = str(entry["relative_path"])
+            file_id = UUID(str(entry["id"]))
+            dest = (target / rel).resolve()
+            target_resolved = target.resolve()
+            if target_resolved not in dest.parents and dest != target_resolved:
+                raise InfospaceError(
+                    "unsafe_restore_path",
+                    f"Manifest path escapes target: {rel}",
+                    {"target": str(target), "relative_path": rel},
+                )
+            dest.parent.mkdir(parents=True, exist_ok=True)
+            stream = await registry.get_file(file_id)
+            with dest.open("wb") as fh:
+                async for chunk in stream:
+                    fh.write(chunk)
+            restored.append(rel)
+    finally:
+        if owned_registry and registry is not None:
+            await registry.dispose()
+
+    return RestoredArchive(
+        package_id=package_id,
+        manifest_digest=f"blake3:{pkg.manifest_digest_hex}",
+        target=str(target),
+        file_count=len(restored),
+        restored_paths=restored,
+    )
+
+
+async def _annotate_retention_async(
+    archives: tuple[ArchiveRecord, ...],
+    *,
+    store_root: Path | None,
+    source_infospace: Path | None,
+    registry: Registry | None,
+) -> list[dict[str, Any]]:
+    if not archives:
+        return []
+
+    owned_registry = registry is None
+    used_store_root: Path | None = None
+    if owned_registry:
+        used_store_root = _resolve_store_root(
+            store_root=store_root,
+            source_infospace=source_infospace,
+        )
+        if used_store_root is None or not used_store_root.exists():
+            return [{"archive": rec.to_dict(), "retention": None} for rec in archives]
+        registry = await _build_local_registry(used_store_root)
+
+    try:
+        assert registry is not None
+        results: list[dict[str, Any]] = []
+        for rec in archives:
+            retention: dict[str, Any] | None
+            try:
+                state = await registry.get_retention_state(UUID(rec.package_id))
+                retention = {
+                    "current_expires_at": (
+                        state.current_expires_at.isoformat()
+                        if state.current_expires_at
+                        else None
+                    ),
+                    "effective_class": state.effective_class,
+                    "active_hold_id": (
+                        str(state.active_hold_id) if state.active_hold_id else None
+                    ),
+                    "eligible_for_deletion": state.eligible_for_deletion,
+                }
+            except Exception as exc:
+                retention = {"error": f"{type(exc).__name__}: {exc}"}
+            results.append({"archive": rec.to_dict(), "retention": retention})
+        return results
+    finally:
+        if owned_registry and registry is not None:
+            await registry.dispose()
@@ -6,6 +6,12 @@ import json
 import sys
 from pathlib import Path
 
+from .archive import (
+    annotate_retention,
+    archive_infospace,
+    list_archives,
+    restore_archive,
+)
 from .checks import run_collection_checks
 from .engine import engine_capability_contract, plan_asset_sync, sync_assets
 from .errors import InfospaceError
@@ -195,6 +201,74 @@ def build_parser() -> argparse.ArgumentParser:
     generate_from_source.add_argument("--max-chunks", type=int, default=0)
     generate_from_source.add_argument("--apply", action="store_true")
 
+    archive = sub.add_parser(
+        "archive",
+        help="Archive an infospace into artifact-store (durable, content-addressed)",
+    )
+    archive.add_argument("root")
+    archive.add_argument(
+        "--retention-class",
+        default="release-evidence",
+        help="artifact-store retention class id (default: release-evidence)",
+    )
+    archive.add_argument(
+        "--include",
+        action="append",
+        default=[],
+        help="Relative path to include (repeatable). Default: infospace.yaml, artifacts/, workflows/, output/, reports/, exports/",
+    )
+    archive.add_argument(
+        "--exclude",
+        action="append",
+        default=[],
+        help="Relative path or glob to exclude (repeatable)",
+    )
+    archive.add_argument("--note", default="", help="Free-text note for the archive record")
+    archive.add_argument(
+        "--store-root",
+        default="",
+        help="Override the artifact-store location (default: <root>/output/archives/.store)",
+    )
+
+    archive_list = sub.add_parser(
+        "archive-list",
+        help="List recorded archives for an infospace",
+    )
+    archive_list.add_argument("root")
+    archive_list.add_argument(
+        "--with-retention",
+        action="store_true",
+        help="Annotate each archive with its current retention state",
+    )
+    archive_list.add_argument(
+        "--store-root",
+        default="",
+        help="Override the artifact-store location for retention lookups",
+    )
+
+    restore = sub.add_parser(
+        "restore",
+        help="Restore an archived infospace package into a target directory",
+    )
+    restore.add_argument("package_id")
+    restore.add_argument("--target", required=True, help="Directory to restore into")
+    restore.add_argument(
+        "--from",
+        dest="from_root",
+        default="",
+        help="Source infospace whose archive store holds the package",
+    )
+    restore.add_argument(
+        "--store-root",
+        default="",
+        help="Direct path to the artifact-store location",
+    )
+    restore.add_argument(
+        "--force",
+        action="store_true",
+        help="Overwrite into a non-empty target directory",
+    )
+
     engine = sub.add_parser("engine", help="Inspect and sync engine boundary state")
     engine_sub = engine.add_subparsers(dest="engine_command", required=True)
 
@@ -423,6 +497,36 @@ def main(argv: list[str] | None = None) -> int:
                 _write_json(plan_generation(infospace.root, stage=args.stage))
             else:
                 parser.error(f"Unhandled generate command: {args.generate_command}")
+        elif args.command == "archive":
+            record = archive_infospace(
+                Path(args.root),
+                retention_class=args.retention_class,
+                include=args.include or None,
+                exclude=args.exclude or None,
+                note=args.note,
+                store_root=args.store_root or None,
+            )
+            _write_json(record.to_dict())
+        elif args.command == "archive-list":
+            archives = list_archives(Path(args.root))
+            if args.with_retention:
+                payload = annotate_retention(
+                    archives,
+                    store_root=args.store_root or None,
+                    source_infospace=Path(args.root) if not args.store_root else None,
+                )
+                _write_json({"archives": payload})
+            else:
+                _write_json({"archives": [rec.to_dict() for rec in archives]})
+        elif args.command == "restore":
+            result = restore_archive(
+                args.package_id,
+                target=Path(args.target),
+                store_root=args.store_root or None,
+                source_infospace=Path(args.from_root) if args.from_root else None,
+                force=args.force,
+            )
+            _write_json(result.to_dict())
         elif args.command == "engine":
             if args.engine_command == "inspect":
                 _write_json(
@@ -1,5 +1,6 @@
 from __future__ import annotations
 
+import filecmp
 from pathlib import Path
 
 import pytest
@@ -8,10 +9,13 @@ import yaml
 from infospace_bench import (
     ArchiveRecord,
     InfospaceError,
+    RestoredArchive,
     add_artifact,
+    annotate_retention,
     archive_infospace,
     create_infospace,
     list_archives,
+    restore_archive,
 )
 from infospace_bench.archive import (
     ARCHIVE_INDEX_PATH,
@@ -99,3 +103,96 @@ def test_archive_rejects_empty_include(tmp_path: Path) -> None:
     with pytest.raises(InfospaceError) as excinfo:
         archive_infospace(root, include=["does-not-exist"])
     assert excinfo.value.code == "empty_archive"
+
+
+def test_restore_archive_round_trips_bytes(tmp_path: Path) -> None:
+    root = _seed_infospace(tmp_path)
+    record = archive_infospace(root, note="round trip")
+
+    target = tmp_path / "restored"
+    result = restore_archive(
+        record.package_id,
+        target=target,
+        source_infospace=root,
+    )
+
+    assert isinstance(result, RestoredArchive)
+    assert result.manifest_digest == record.manifest_digest
+    assert result.file_count == record.file_count
+
+    for rel in result.restored_paths:
+        original = root / rel
+        restored = target / rel
+        assert restored.is_file()
+        assert filecmp.cmp(original, restored, shallow=False), rel
+
+
+def test_restore_archive_refuses_non_empty_target(tmp_path: Path) -> None:
+    root = _seed_infospace(tmp_path)
+    record = archive_infospace(root)
+
+    target = tmp_path / "filled"
+    target.mkdir()
+    (target / "existing.txt").write_text("hi", encoding="utf-8")
+
+    with pytest.raises(InfospaceError) as excinfo:
+        restore_archive(
+            record.package_id,
+            target=target,
+            source_infospace=root,
+        )
+    assert excinfo.value.code == "restore_target_not_empty"
+
+
+def test_restore_archive_force_overwrites_non_empty_target(tmp_path: Path) -> None:
+    root = _seed_infospace(tmp_path)
+    record = archive_infospace(root)
+
+    target = tmp_path / "filled-force"
+    target.mkdir()
+    (target / "leftover.txt").write_text("old", encoding="utf-8")
+
+    result = restore_archive(
+        record.package_id,
+        target=target,
+        source_infospace=root,
+        force=True,
+    )
+    assert result.file_count == record.file_count
+    # Pre-existing files that are not in the manifest are left in place.
+    assert (target / "leftover.txt").read_text(encoding="utf-8") == "old"
+
+
+def test_restore_archive_requires_store_location(tmp_path: Path) -> None:
+    with pytest.raises(InfospaceError) as excinfo:
+        restore_archive("00000000-0000-0000-0000-000000000000", target=tmp_path)
+    assert excinfo.value.code == "missing_archive_store"
+
+
+def test_annotate_retention_returns_state_for_each_archive(tmp_path: Path) -> None:
+    root = _seed_infospace(tmp_path)
+    first = archive_infospace(root)
+    second = archive_infospace(root)
+
+    archives = list_archives(root)
+    annotated = annotate_retention(archives, source_infospace=root)
+
+    assert [item["archive"]["package_id"] for item in annotated] == [
+        first.package_id,
+        second.package_id,
+    ]
+    for item in annotated:
+        retention = item["retention"]
+        assert retention is not None
+        assert retention["effective_class"] == DEFAULT_RETENTION_CLASS
+        assert retention["eligible_for_deletion"] is False
+
+
+def test_annotate_retention_returns_none_when_store_missing(tmp_path: Path) -> None:
+    root = _seed_infospace(tmp_path)
+    archive_infospace(root, store_root=tmp_path / "external-store")
+
+    archives = list_archives(root)
+    # Source infospace's store doesn't exist (we overrode store_root)
+    annotated = annotate_retention(archives, source_infospace=root)
+    assert annotated[0]["retention"] is None
@@ -57,3 +57,53 @@ def test_cli_returns_structured_error(tmp_path: Path) -> None:
     assert result.returncode == 2
     payload = json.loads(result.stderr)
     assert payload["error"]["code"] == "missing_infospace"
+
+
+def test_cli_archive_list_and_restore(tmp_path: Path) -> None:
+    create = run_cli(
+        "create",
+        str(tmp_path),
+        "cli-archive",
+        "--name",
+        "CLI Archive",
+    )
+    assert create.returncode == 0, create.stderr
+    root = tmp_path / "infospaces" / "cli-archive"
+
+    source = tmp_path / "src.md"
+    source.write_text("# src\n", encoding="utf-8")
+    add = run_cli(
+        "add-artifact", str(root), str(source), "--kind", "source", "--title", "Src",
+    )
+    assert add.returncode == 0, add.stderr
+
+    archive = run_cli("archive", str(root), "--note", "via cli")
+    assert archive.returncode == 0, archive.stderr
+    record = json.loads(archive.stdout)
+    assert record["note"] == "via cli"
+    assert record["manifest_digest"].startswith("blake3:")
+
+    listing = run_cli("archive-list", str(root))
+    assert listing.returncode == 0, listing.stderr
+    assert json.loads(listing.stdout)["archives"][0]["package_id"] == record["package_id"]
+
+    listing_with_retention = run_cli(
+        "archive-list", str(root), "--with-retention",
+    )
+    assert listing_with_retention.returncode == 0, listing_with_retention.stderr
+    annotated = json.loads(listing_with_retention.stdout)["archives"]
+    assert annotated[0]["retention"]["effective_class"] == "release-evidence"
+
+    target = tmp_path / "restored"
+    restore = run_cli(
+        "restore",
+        record["package_id"],
+        "--target",
+        str(target),
+        "--from",
+        str(root),
+    )
+    assert restore.returncode == 0, restore.stderr
+    result = json.loads(restore.stdout)
+    assert result["file_count"] == record["file_count"]
+    assert (target / "infospace.yaml").is_file()
@@ -4,7 +4,7 @@ type: workplan
 title: "Infospace Archive Integration With artifact-store"
 domain: markitect
 repo: infospace-bench
-status: in_progress
+status: done
 owner: markitect
 topic_slug: markitect
 created: "2026-05-14"