
tegwick 952cebf2e9 feat: snapshot/restore checkpoints (SAND-WP-0007)

Add workspace checkpoint API with SnapshotStore, extension hooks on
compose-ssh and saas-stub, manager orchestration, CLI/HTTP surface,
profile.compose-checkpoint, and docs/tests.

2026-06-24 07:57:40 +02:00

6.4 KiB


sand-boxer meta-framework specification

Version 0.1 — derived from research/03-meta-framework-synthesis.md and INTENT.md.

sand-boxer is the sandbox establishment service: one API for consumers, many extension backends. It provisions where and how code runs; sibling projects own agent harnessing, validation, and code generation.

Resource model

Resource	Description
Profile	Named, versioned sandbox recipe: extension binding, isolation, network, TTL, placement
Extension	Backend adapter implementing provision / wait_ready / teardown
Host	Registered placement target for self-hosted extensions; read-only telemetry via `profile.sandbox-canary` (see `docs/host-telemetry.md`)
Sandbox	Running instance of a profile
Snapshot	Point-in-time workspace checkpoint (`sandboxer snapshot` / `restore`)
Route	Extension selection policy when multiple backends qualify
Meter	Usage record for payments layer (SaaS extensions — SAND-WP-0006)

Lifecycle states

requested → provisioning → ready → active → { expired | failed } → destroying → destroyed

State	Meaning
`requested`	Create accepted; not yet handed to extension
`provisioning`	Extension running provision + wait_ready
`ready`	Reachability confirmed; consumer may attach
`active`	Consumer has marked sandbox in use (optional transition)
`expired`	TTL elapsed before explicit destroy
`failed`	Provision or readiness failed
`destroying`	Teardown in progress
`destroyed`	Resources released; record retained for audit
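The state chain and table above can be read as a small state machine. The sketch below is one possible encoding in Python. The spec only gives the linear chain, so the exact edge set is an assumption: `ready → destroying` is allowed because `active` is described as optional, `failed` is reachable only from `provisioning`, and `expired` is reachable from `ready` or `active`. The names `State`, `ALLOWED`, `can_transition`, and `advance` are hypothetical and not part of sand-boxer.

```python
from enum import Enum


class State(str, Enum):
    REQUESTED = "requested"
    PROVISIONING = "provisioning"
    READY = "ready"
    ACTIVE = "active"
    EXPIRED = "expired"
    FAILED = "failed"
    DESTROYING = "destroying"
    DESTROYED = "destroyed"


# Assumed transition table: the spec shows only a linear chain, so the
# branches below are an interpretation, not the project's actual rules.
ALLOWED = {
    State.REQUESTED: {State.PROVISIONING},
    State.PROVISIONING: {State.READY, State.FAILED},
    State.READY: {State.ACTIVE, State.EXPIRED, State.DESTROYING},
    State.ACTIVE: {State.EXPIRED, State.DESTROYING},
    State.EXPIRED: {State.DESTROYING},
    State.FAILED: {State.DESTROYING},
    State.DESTROYING: {State.DESTROYED},
    State.DESTROYED: set(),  # terminal; record is retained for audit
}


def can_transition(current: State, target: State) -> bool:
    """Return True if the assumed table permits current -> target."""
    return target in ALLOWED[current]


def advance(current: State, target: State) -> State:
    """Return the new state, or raise ValueError on an illegal move."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

For example, `advance(State.READY, State.DESTROYING)` succeeds under these assumptions, while `advance(State.DESTROYED, State.READY)` raises `ValueError`.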

State Hub event mapping

Each transition emits a State Hub progress event (or calls a dedicated registration API, when one is available):

Transition	`event_type`	Required fields
→ `requested`	`note`	`sandbox_id`, `profile_id`, `consumer`
→ `provisioning`	`note`	`extension_id`, `host`
→ `ready`	`milestone`	`reachability` descriptor
→ `active`	`note`	`actor_type`, timestamps
→ `failed`	`note`	`error` summary
→ `destroying`	`note`	—
→ `destroyed`	`milestone`	`duration_s`, cleanup report

Event detail payload (JSON):

```json
{
  "sandbox_id": "abc12345",
  "profile_id": "profile.compose-e2e",
  "extension_id": "ext.compose-ssh",
  "host": "coulombcore",
  "consumer": {"actor": "atm", "project": "wise-validator", "run_id": "..."},
  "actor_type": "atm",
  "state": "ready",
  "reachability": {"ssh": "root@coulombcore", "remote_dir": "/tmp/sandboxer/abc12345"},
  "timestamps": {"created_at": "...", "ready_at": "..."}
}
```

This extends the build-agent self-register pattern: generic sandbox identities carry profile_id + extension_id instead of build-machine metadata.
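As a rough illustration of the mapping table, the sketch below picks the `event_type` for a transition and checks that the required fields are present in the detail payload. The envelope shape (`event_type` plus `detail`), the function name `build_event`, and the key `cleanup_report` (standing in for the table's "cleanup report") are assumptions. `expired` is left out because the table does not list it.

```python
# Transition -> event_type, copied from the mapping table.
EVENT_TYPES = {
    "requested": "note",
    "provisioning": "note",
    "ready": "milestone",
    "active": "note",
    "failed": "note",
    "destroying": "note",
    "destroyed": "milestone",
}

# Transition -> required detail fields. "cleanup_report" is an assumed key
# name; the table only says "cleanup report".
REQUIRED_FIELDS = {
    "requested": ("sandbox_id", "profile_id", "consumer"),
    "provisioning": ("extension_id", "host"),
    "ready": ("reachability",),
    "active": ("actor_type", "timestamps"),
    "failed": ("error",),
    "destroying": (),
    "destroyed": ("duration_s", "cleanup_report"),
}


def build_event(state: str, detail: dict) -> dict:
    """Build a hypothetical progress-event envelope for one transition."""
    if state not in EVENT_TYPES:
        raise ValueError(f"no event mapping for state {state!r}")
    missing = [name for name in REQUIRED_FIELDS[state] if name not in detail]
    if missing:
        raise ValueError(f"{state}: missing required fields {missing}")
    # Copy the caller's detail so the input dict is not mutated.
    return {"event_type": EVENT_TYPES[state], "detail": {**detail, "state": state}}
```

Calling `build_event("ready", {"reachability": {"ssh": "root@coulombcore"}})` yields a `milestone` event, while a `provisioning` event without `host` is rejected.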

Core API operations (v0)

Operation	Description	v0 scope
`create`	Provision from profile + inputs	Yes
`get`	Inspect sandbox status	Yes
`list`	List sandboxes (filter by consumer optional)	Yes
`extend_ttl`	Extend time-to-live	Stub
`recreate`	Destroy and reprovision from stored seed	Yes
`destroy`	Idempotent teardown	Yes
`snapshot` / `restore`	Checkpoint workspace	Yes (compose-ssh, saas-stub)
`exec`	Run command in sandbox	Harness-owned via SSH (glas-harness)

HTTP surface (optional in v0; the CLI calls the core library directly):

POST /v1/sandboxes — create
GET /v1/sandboxes/{id} — get
GET /v1/sandboxes — list
DELETE /v1/sandboxes/{id} — destroy
POST /v1/sandboxes/{id}/snapshot — checkpoint
POST /v1/snapshots/{id}/restore — restore
GET /v1/snapshots — list checkpoints
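The routes above can be captured in a small lookup table. The sketch below builds (but does not send) a standard-library `urllib.request.Request` for each operation. The base URL, the operation key `list_snapshots`, and the helper name `build_request` are assumptions for illustration; the spec does not define request bodies, so any `body` passed here is hypothetical as well.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

# Operation -> (method, path template), taken from the route list.
ROUTES = {
    "create": ("POST", "/v1/sandboxes"),
    "get": ("GET", "/v1/sandboxes/{id}"),
    "list": ("GET", "/v1/sandboxes"),
    "destroy": ("DELETE", "/v1/sandboxes/{id}"),
    "snapshot": ("POST", "/v1/sandboxes/{id}/snapshot"),
    "restore": ("POST", "/v1/snapshots/{id}/restore"),
    "list_snapshots": ("GET", "/v1/snapshots"),  # assumed operation name
}


def build_request(base_url: str, operation: str,
                  resource_id: Optional[str] = None,
                  body: Optional[dict] = None) -> Request:
    """Return an unsent Request for the given operation."""
    method, template = ROUTES[operation]
    if "{id}" in template:
        if not resource_id:
            raise ValueError(f"{operation} requires a resource id")
        # Percent-encode the id so it cannot alter the path structure.
        path = template.replace("{id}", quote(resource_id, safe=""))
    else:
        path = template
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(base_url.rstrip("/") + path, data=data,
                   headers=headers, method=method)
```

A caller would pass the result to `urllib.request.urlopen` against a running service; that step is omitted because the server address is deployment-specific.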

Consumer attribution

Every create request carries a consumer block:

```yaml
consumer:
  actor: adm | agt | atm
  project: <calling-repo-or-service>   # e.g. wise-validator, glas-harness
  session_id: <optional>
  run_id: <optional>
```

Actor	Typical caller
`adm`	Human operator via CLI
`agt`	LLM agent session
`atm`	Deterministic automation (CI, activity-core, wise-validator)

sand-boxer records attribution on every lifecycle event. It does not interpret agent intent or authorize the caller — flex-auth owns authorization when enforced.
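One way to model the consumer block in Python is a frozen dataclass that rejects unknown actors at construction time. This is a minimal sketch: the class name `Consumer`, the `to_dict` helper, and the rule that `project` must be non-empty are assumptions. Only the field names and the three actor values come from the spec.

```python
from dataclasses import asdict, dataclass
from typing import Optional

ACTORS = ("adm", "agt", "atm")  # human operator, LLM agent, automation


@dataclass(frozen=True)
class Consumer:
    actor: str
    project: str
    session_id: Optional[str] = None
    run_id: Optional[str] = None

    def __post_init__(self) -> None:
        if self.actor not in ACTORS:
            raise ValueError(f"actor must be one of {ACTORS}, got {self.actor!r}")
        if not self.project:
            # Assumed rule: the spec shows project without an <optional> marker.
            raise ValueError("project must be a non-empty string")

    def to_dict(self) -> dict:
        """Serialize for a create request, dropping unset optional fields."""
        return {key: value for key, value in asdict(self).items()
                if value is not None}
```

For instance, `Consumer("atm", "wise-validator", run_id="r1").to_dict()` contains `actor`, `project`, and `run_id` but no `session_id` key.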

Extension interface

Each extension implements:

```text
provision(profile, inputs, placement) → SandboxHandle
wait_ready(handle) → Reachability
teardown(handle) → CleanupReport
estimate_cost?(profile, duration) → MeterQuote   # optional; SaaS only
```

Registration requirements (validated at load time):

id — unique extension identifier (ext.<name>)
capabilities — isolation levels, regions, persistence, pricing model
handler — Python entry point or built-in registry binding

Extensions are discovered from extensions/*.yaml at repo root and loaded via sandboxer.extensions.registry.
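The interface and load-time checks could look roughly like the sketch below, which uses a `typing.Protocol` for the three required methods and a toy in-memory backend. The field layouts of `SandboxHandle`, `Reachability`, and `CleanupReport`, the `validate_extension` helper, and `InMemoryExtension` are all hypothetical. The real registry in `sandboxer.extensions.registry` may validate differently, and capabilities and handler binding are not modeled here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol


@dataclass
class SandboxHandle:
    sandbox_id: str
    extension_id: str


@dataclass
class Reachability:
    details: Dict[str, Any] = field(default_factory=dict)


@dataclass
class CleanupReport:
    released: List[str] = field(default_factory=list)


class Extension(Protocol):
    id: str

    def provision(self, profile: str, inputs: Dict[str, Any],
                  placement: str) -> SandboxHandle: ...
    def wait_ready(self, handle: SandboxHandle) -> Reachability: ...
    def teardown(self, handle: SandboxHandle) -> CleanupReport: ...


def validate_extension(ext: object) -> None:
    """Assumed load-time check: ext.<name> id plus the three required methods."""
    ext_id = getattr(ext, "id", "")
    if not isinstance(ext_id, str) or not ext_id.startswith("ext.") or len(ext_id) <= 4:
        raise TypeError(f"extension id must look like ext.<name>, got {ext_id!r}")
    for name in ("provision", "wait_ready", "teardown"):
        if not callable(getattr(ext, name, None)):
            raise TypeError(f"{ext_id} is missing required method {name}()")


class InMemoryExtension:
    """Toy backend that tracks sandboxes in a set; for illustration only."""
    id = "ext.in-memory"

    def __init__(self) -> None:
        self._live = set()

    def provision(self, profile, inputs, placement):
        sandbox_id = f"mem-{len(self._live) + 1}"
        self._live.add(sandbox_id)
        return SandboxHandle(sandbox_id, self.id)

    def wait_ready(self, handle):
        if handle.sandbox_id not in self._live:
            raise RuntimeError(f"{handle.sandbox_id} is not provisioned")
        return Reachability({"sandbox_id": handle.sandbox_id})

    def teardown(self, handle):
        # Idempotent: a second teardown releases nothing and does not fail.
        released = [handle.sandbox_id] if handle.sandbox_id in self._live else []
        self._live.discard(handle.sandbox_id)
        return CleanupReport(released)
```

Making `teardown` idempotent in the toy backend mirrors the `destroy` operation, which the API table describes as an idempotent teardown.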

Routing policy vocabulary

When multiple extensions satisfy a profile's capability requirements, the routing strategy decides which one is used:

Strategy	Behavior
`prefer-self-hosted`	Self-hosted extensions first; SaaS fallback (default Coulomb posture)
`lowest-cost`	Cheapest `estimate_cost` quote wins
`lowest-latency`	Closest region / host wins
`explicit`	Profile names a single `extension`; no auto-routing

Routing engine v0: sandboxer.routing.resolver — see docs/routing.md.
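The four strategies can be sketched as a single selection function. This is not the `sandboxer.routing.resolver` implementation. The `Candidate` fields, the use of `latency_ms` as a stand-in for "closest region / host", and the choice to skip candidates that have no quote or latency figure are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    extension_id: str
    self_hosted: bool
    cost_quote: Optional[float] = None   # from estimate_cost, if offered
    latency_ms: Optional[float] = None   # assumed proxy for proximity


def resolve(strategy: str, candidates: List[Candidate],
            explicit_extension: Optional[str] = None) -> Candidate:
    """Pick one candidate according to the named routing strategy."""
    if not candidates:
        raise LookupError("no extension satisfies the profile")
    if strategy == "explicit":
        for cand in candidates:
            if cand.extension_id == explicit_extension:
                return cand
        raise LookupError(f"extension {explicit_extension!r} is not eligible")
    if strategy == "prefer-self-hosted":
        # sorted() is stable, so input order breaks ties; False sorts first.
        return sorted(candidates, key=lambda c: not c.self_hosted)[0]
    if strategy == "lowest-cost":
        quoted = [c for c in candidates if c.cost_quote is not None]
        if not quoted:
            raise LookupError("no candidate returned a cost quote")
        return min(quoted, key=lambda c: c.cost_quote)
    if strategy == "lowest-latency":
        measured = [c for c in candidates if c.latency_ms is not None]
        if not measured:
            raise LookupError("no candidate has a latency measurement")
        return min(measured, key=lambda c: c.latency_ms)
    raise ValueError(f"unknown routing strategy {strategy!r}")
```

With one self-hosted candidate and one SaaS candidate, `prefer-self-hosted` returns the self-hosted one regardless of list order, and falls back to the SaaS candidate only when no self-hosted extension qualifies.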

Security limits

sand-boxer commits to:

Default-deny network unless profile explicitly allows egress
Secrets at provision boundary — secret_refs resolved via ops-warden / OpenBao; never returned to agent context
Blast-radius isolation — dedicated hosts (sandboxer01, CoulombCore) away from Railiance01 production
Observable lifecycle — every transition attributed to adm / agt / atm
Honest limits — allowed tool paths can be abused by compromised agents

sand-boxer does not provide intent-aware egress filtering in v1.

Out of scope (sibling ownership)

Concern	Owner
Agent gateway, tools, memory	glas-harness
e2e.yml semantics, health checks, test pass/fail	wise-validator
Code generation, setup instructions content	snuggle-inventor
SSH tunnels	ops-bridge
SSH certificates	ops-warden
Workstream / task state	state-hub

See docs/integrations/ for per-sibling contracts.
