sand-boxer meta-framework specification
Version 0.1 — derived from research/03-meta-framework-synthesis.md and INTENT.md.
sand-boxer is the sandbox establishment service: one API for consumers, many extension backends. It provisions where and how code runs; sibling projects own agent harnessing, validation, and code generation.
Resource model
| Resource | Description |
|---|---|
| Profile | Named, versioned sandbox recipe: extension binding, isolation, network, TTL, placement |
| Extension | Backend adapter implementing provision / wait_ready / teardown |
| Host | Registered placement target for self-hosted extensions |
| Sandbox | Running instance of a profile |
| Snapshot | Point-in-time workspace checkpoint (deferred — SAND-WP-0003) |
| Route | Extension selection policy when multiple backends qualify |
| Meter | Usage record for payments layer (SaaS extensions — SAND-WP-0006) |
Lifecycle states
requested → provisioning → ready → active → { expired | failed } → destroying → destroyed
| State | Meaning |
|---|---|
| requested | Create accepted; not yet handed to extension |
| provisioning | Extension running provision + wait_ready |
| ready | Reachability confirmed; consumer may attach |
| active | Consumer has marked sandbox in use (optional transition) |
| expired | TTL elapsed before explicit destroy |
| failed | Provision or readiness failed |
| destroying | Teardown in progress |
| destroyed | Resources released; record retained for audit |
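The lifecycle above can be sketched as a small transition table. This is an illustration only, not the project's implementation: the spec gives just the linear diagram, so the exact set of allowed edges (for example, letting ready skip active and go straight to expired or destroying) is an assumption, as are the names SandboxState, ALLOWED_TRANSITIONS, can_transition, and transition.

```python
from enum import Enum


class SandboxState(str, Enum):
    REQUESTED = "requested"
    PROVISIONING = "provisioning"
    READY = "ready"
    ACTIVE = "active"
    EXPIRED = "expired"
    FAILED = "failed"
    DESTROYING = "destroying"
    DESTROYED = "destroyed"


S = SandboxState

# Assumed edge set. The spec states only the linear path and that
# "active" is optional; everything else here is a guess for illustration.
ALLOWED_TRANSITIONS = {
    S.REQUESTED: {S.PROVISIONING},
    S.PROVISIONING: {S.READY, S.FAILED},
    S.READY: {S.ACTIVE, S.EXPIRED, S.DESTROYING},
    S.ACTIVE: {S.EXPIRED, S.FAILED, S.DESTROYING},
    S.EXPIRED: {S.DESTROYING},
    S.FAILED: {S.DESTROYING},
    S.DESTROYING: {S.DESTROYED},
    S.DESTROYED: set(),  # terminal; record is retained for audit
}


def can_transition(current, target):
    """Return True if the assumed table allows current -> target."""
    return target in ALLOWED_TRANSITIONS[current]


def transition(current, target):
    """Return the new state, or raise ValueError on an illegal move."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data rather than scattered conditionals makes it easy to adjust once the real edge set is settled.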
State Hub event mapping
Each transition emits a State Hub progress event (or uses a dedicated registration API when one is available):
| Transition | event_type | Required fields |
|---|---|---|
| → requested | note | sandbox_id, profile_id, consumer |
| → provisioning | note | extension_id, host |
| → ready | milestone | reachability descriptor |
| → active | note | actor_type, timestamps |
| → failed | note | error summary |
| → destroying | note | (none) |
| → destroyed | milestone | duration_s, cleanup report |
Event detail payload (JSON):
{
"sandbox_id": "abc12345",
"profile_id": "profile.compose-e2e",
"extension_id": "ext.compose-ssh",
"host": "coulombcore",
"consumer": {"actor": "atm", "project": "wise-validator", "run_id": "..."},
"actor_type": "atm",
"state": "ready",
"reachability": {"ssh": "root@coulombcore", "remote_dir": "/tmp/sandboxer/abc12345"},
"timestamps": {"created_at": "...", "ready_at": "..."}
}
This extends the build-agent self-register pattern: generic sandbox identities carry
profile_id + extension_id instead of build-machine metadata.
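A minimal sketch of how the mapping table and payload could be combined when emitting an event. The event_type values come from the table; the envelope shape (event_type plus detail), the function name build_event, and the key names chosen for the free-text table entries (reachability, error, cleanup_report) are assumptions. The table has no row for expired, so this sketch rejects it rather than guess.

```python
# Target state -> State Hub event_type, taken from the mapping table.
EVENT_TYPE_BY_STATE = {
    "requested": "note",
    "provisioning": "note",
    "ready": "milestone",
    "active": "note",
    "failed": "note",
    "destroying": "note",
    "destroyed": "milestone",
}

# Required detail keys per target state. Keys for "reachability descriptor",
# "error summary" and "cleanup report" are assumed names, not from the spec.
REQUIRED_FIELDS = {
    "requested": ("sandbox_id", "profile_id", "consumer"),
    "provisioning": ("extension_id", "host"),
    "ready": ("reachability",),
    "active": ("actor_type", "timestamps"),
    "failed": ("error",),
    "destroying": (),
    "destroyed": ("duration_s", "cleanup_report"),
}


def build_event(state, detail):
    """Validate required fields and wrap detail in an assumed envelope."""
    if state not in EVENT_TYPE_BY_STATE:
        raise ValueError(f"no event mapping for state {state!r}")
    missing = [key for key in REQUIRED_FIELDS[state] if key not in detail]
    if missing:
        raise ValueError(f"missing required fields for {state!r}: {missing}")
    return {
        "event_type": EVENT_TYPE_BY_STATE[state],
        "detail": {**detail, "state": state},
    }
```

For example, `build_event("ready", {"reachability": {"ssh": "root@coulombcore", "remote_dir": "/tmp/sandboxer/abc12345"}})` yields a milestone event whose detail carries `"state": "ready"`.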
Core API operations (v0)
| Operation | Description | v0 scope |
|---|---|---|
| create | Provision from profile + inputs | Yes |
| get | Inspect sandbox status | Yes |
| list | List sandboxes (filter by consumer optional) | Yes |
| extend_ttl | Extend time-to-live | Stub |
| recreate | Destroy and reprovision from stored seed | Yes |
| destroy | Idempotent teardown | Yes |
| snapshot / restore | Checkpoint workspace | Deferred (SAND-WP-0003) |
| exec | Run command in sandbox | Harness-owned via SSH (glas-harness) |
HTTP surface (optional v0; CLI calls core library directly):
- POST /v1/sandboxes: create
- GET /v1/sandboxes/{id}: get
- GET /v1/sandboxes: list
- DELETE /v1/sandboxes/{id}: destroy
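To make the create / get / list / destroy semantics concrete, here is a toy in-memory sketch. It is not the project's manager: InMemorySandboxManager, SandboxRecord, the 8-character id scheme, and filtering list by consumer project are all assumptions, and no extension is actually invoked. It shows only two properties named in this spec: destroy is idempotent, and destroyed records are retained.

```python
import uuid
from dataclasses import dataclass


@dataclass
class SandboxRecord:
    sandbox_id: str
    profile_id: str
    consumer: dict
    state: str = "requested"


class InMemorySandboxManager:
    """Hypothetical stand-in for the core library; no real provisioning."""

    def __init__(self):
        self._records = {}

    def create(self, profile_id, consumer):
        # Assumed id scheme: first 8 hex characters of a random UUID.
        sandbox_id = uuid.uuid4().hex[:8]
        record = SandboxRecord(sandbox_id, profile_id, dict(consumer))
        self._records[sandbox_id] = record
        return record

    def get(self, sandbox_id):
        # Raises KeyError for an unknown id.
        return self._records[sandbox_id]

    def list(self, project=None):
        # Optional consumer filter; filtering by project is an assumption.
        return [
            record
            for record in self._records.values()
            if project is None or record.consumer.get("project") == project
        ]

    def destroy(self, sandbox_id):
        # Idempotent: a repeated destroy returns the same retained record.
        record = self._records[sandbox_id]
        if record.state != "destroyed":
            record.state = "destroyed"
        return record
```

Because the record stays in the store after destroy, a later get still returns it with state destroyed, which matches the "record retained for audit" wording in the lifecycle table.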
Consumer attribution
Every create request carries a consumer block:
consumer:
actor: adm | agt | atm
project: <calling-repo-or-service> # e.g. wise-validator, glas-harness
session_id: <optional>
run_id: <optional>
| Actor | Typical caller |
|---|---|
| adm | Human operator via CLI |
| agt | LLM agent session |
| atm | Deterministic automation (CI, activity-core, wise-validator) |
sand-boxer records attribution on every lifecycle event. It does not interpret agent intent or authorize the caller — flex-auth owns authorization when enforced.
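The consumer block could be modelled roughly as follows. This sketch uses a stdlib dataclass so that it runs without dependencies; the project's real schema may look different. The class name Consumer, the to_detail helper, and the rule that project must be non-empty are assumptions. In line with the paragraph above, it validates shape only and performs no authorization.

```python
from dataclasses import asdict, dataclass
from typing import Optional

VALID_ACTORS = ("adm", "agt", "atm")


@dataclass(frozen=True)
class Consumer:
    actor: str
    project: str
    session_id: Optional[str] = None
    run_id: Optional[str] = None

    def __post_init__(self):
        # Shape validation only; authorization is out of scope here.
        if self.actor not in VALID_ACTORS:
            raise ValueError(f"actor must be one of {VALID_ACTORS}, got {self.actor!r}")
        if not self.project:
            raise ValueError("project must be non-empty")

    def to_detail(self):
        """Return a dict for an event payload, omitting unset optionals."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```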
Extension interface
Each extension implements:
provision(profile, inputs, placement) → SandboxHandle
wait_ready(handle) → Reachability
teardown(handle) → CleanupReport
estimate_cost?(profile, duration) → MeterQuote # optional; SaaS only
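One way to express the three required methods in Python is a typing.Protocol, sketched below together with a fake extension useful for unit tests. The field layouts of SandboxHandle, Reachability, and CleanupReport are assumptions (the spec names the types but not their contents), as are the inputs["sandbox_id"] and placement["host"] keys. The optional estimate_cost is left out.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class SandboxHandle:
    sandbox_id: str
    host: str


@dataclass
class Reachability:
    ssh: str
    remote_dir: str


@dataclass
class CleanupReport:
    removed: list


@runtime_checkable
class Extension(Protocol):
    """Structural interface: any object with these methods qualifies."""

    def provision(self, profile, inputs, placement): ...
    def wait_ready(self, handle): ...
    def teardown(self, handle): ...


class FakeExtension:
    """Test double: touches no hosts, returns canned values."""

    def provision(self, profile, inputs, placement):
        return SandboxHandle(sandbox_id=inputs["sandbox_id"], host=placement["host"])

    def wait_ready(self, handle):
        return Reachability(
            ssh=f"root@{handle.host}",
            remote_dir=f"/tmp/sandboxer/{handle.sandbox_id}",
        )

    def teardown(self, handle):
        return CleanupReport(removed=[f"/tmp/sandboxer/{handle.sandbox_id}"])
```

A Protocol is used here so that extensions need not inherit from a common base class; note that the runtime isinstance check only verifies that the methods exist, not their signatures.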
Registration requirements (validated at load time):
- id: unique extension identifier (ext.<name>)
- capabilities: isolation levels, regions, persistence, pricing model
- handler: Python entry point or built-in registry binding
Extensions are discovered from extensions/*.yaml at repo root and loaded via
sandboxer.extensions.registry.
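The load-time checks might look something like the sketch below, which validates an already-parsed extension entry (a plain dict, so no YAML library is needed). This is not the code in sandboxer.extensions.registry: validate_registration, the id regex, the requirement that capabilities be a non-empty mapping, and the example handler string are all assumptions.

```python
import re

# Assumed id pattern: "ext." followed by lowercase letters, digits, hyphens.
EXTENSION_ID_RE = re.compile(r"^ext\.[a-z0-9][a-z0-9-]*$")


def validate_registration(entry, seen_ids):
    """Return a list of error strings; an empty list means the entry is valid."""
    errors = []

    ext_id = entry.get("id")
    if not isinstance(ext_id, str) or not EXTENSION_ID_RE.match(ext_id):
        errors.append(f"id must look like ext.<name>, got {ext_id!r}")
    elif ext_id in seen_ids:
        errors.append(f"duplicate extension id {ext_id!r}")

    capabilities = entry.get("capabilities")
    if not isinstance(capabilities, dict) or not capabilities:
        errors.append("capabilities must be a non-empty mapping")

    handler = entry.get("handler")
    if not isinstance(handler, str) or not handler:
        errors.append("handler must be a non-empty string")

    return errors


# Hypothetical entry; the capability keys and handler value are illustrative.
example = {
    "id": "ext.compose-ssh",
    "capabilities": {"isolation": ["container"]},
    "handler": "builtin:compose_ssh",
}
```

Returning a list of errors instead of raising on the first one lets a loader report every problem in a file at once.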
Routing policy vocabulary
When multiple extensions satisfy a profile capability:
| Strategy | Behavior |
|---|---|
| prefer-self-hosted | Self-hosted extensions first; SaaS fallback (default Coulomb posture) |
| lowest-cost | Cheapest estimate_cost quote wins |
| lowest-latency | Closest region / host wins |
| explicit | Profile names a single extension; no auto-routing |
v0 resolves profile.extension directly — routing engine deferred to SAND-WP-0006.
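Since the routing engine is deferred, the following is purely a sketch of how the four strategies could be evaluated over a list of qualifying extensions. The Candidate fields, the use of a latency figure as a proxy for "closest", the tie-breaking by list order, and the extension id ext.saas-x are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    extension_id: str
    self_hosted: bool
    cost_quote: float   # hypothetical result of estimate_cost
    latency_ms: float   # hypothetical proxy for region / host distance


def select_extension(strategy, candidates, explicit_id=None):
    """Pick one candidate according to the named strategy."""
    if not candidates:
        raise LookupError("no extension satisfies the profile")

    if strategy == "explicit":
        for candidate in candidates:
            if candidate.extension_id == explicit_id:
                return candidate
        raise LookupError(f"extension {explicit_id!r} is not a candidate")

    if strategy == "prefer-self-hosted":
        # False sorts before True, so self-hosted candidates win;
        # min() keeps the first candidate among ties.
        return min(candidates, key=lambda c: not c.self_hosted)

    if strategy == "lowest-cost":
        return min(candidates, key=lambda c: c.cost_quote)

    if strategy == "lowest-latency":
        return min(candidates, key=lambda c: c.latency_ms)

    raise ValueError(f"unknown routing strategy {strategy!r}")
```

With v0's direct resolution of profile.extension, only the explicit branch corresponds to current behavior; the other branches illustrate the vocabulary in the table.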
Security limits
sand-boxer commits to:
- Default-deny network unless profile explicitly allows egress
- Secrets at provision boundary: secret_refs resolved via ops-warden / OpenBao; never returned to agent context
- Blast-radius isolation: dedicated hosts (sandboxer01, CoulombCore) away from Railiance01 production
- Observable lifecycle: every transition attributed to adm/agt/atm
- Honest limits: allowed tool paths can be abused by compromised agents
sand-boxer does not provide intent-aware egress filtering in v1.
Out of scope (sibling ownership)
| Concern | Owner |
|---|---|
| Agent gateway, tools, memory | glas-harness |
| e2e.yml semantics, health checks, test pass/fail | wise-validator |
| Code generation, setup instructions content | snuggle-inventor |
| SSH tunnels | ops-bridge |
| SSH certificates | ops-warden |
| Workstream / task state | state-hub |
See docs/integrations/ for per-sibling contracts.