---
domain: infotech
repo: sand-boxer
updated: "2026-06-22"
---

# INTENT

> sand-boxer is the Coulomb **meta-framework for establishing sandboxes** — a
> unified API and extension platform for provisioning every variation of isolated
> execution environment, from self-hosted compose stacks to metered SaaS
> runtimes. This file is the charter: why it exists, what it owns, and where
> sibling projects begin. Research backing this charter lives in `research/`.

---

## Why it exists

Custodian automation is moving from **workstation-anchored** execution to
**Railiance01-scheduled** orchestration. That shift improves reliability but
does not, by itself, answer the harder question: **where can agentic and
deterministic work run safely** without the laptop filesystem, sleep cycles,
and single-user blast radius?

The industry has exploded with sandbox answers — E2B, Modal, Daytona,
OpenShell, OpenClaw-style Docker/SSH backends, hyperscaler interpreters — each
with different APIs, billing models, and isolation postures. Coulomb needs
**one place to establish sandboxes** regardless of backend, not a new
integration per agent harness, validator, or codegen pipeline.

sand-boxer exists to be that place: **OpenRouter for sandboxes, not for
models.**

Consumers call one API. Extensions delegate to the sandbox system that fits —
self-hosted on sandboxer01, inherited compose-ssh from `the-custodian`, or a
metered cloud provider. An integrated **payments layer** handles SaaS
consumption when Coulomb uses external capacity. Over time, operational
learning may justify a Coulomb-native **best-of-brands runtime** — but that is
a later phase built on evidence, not day-one ambition.

The workstation becomes optional for **runtime**. Railiance01 decides *when*
work runs (via activity-core). sand-boxer decides *where* isolated execution
happens. State Hub records *what* changed.


---

## The governing principle

sand-boxer is the **sandbox establishment service** — profiles, provisioning,
extension routing, placement, lifecycle, and metering. Nothing more.

It answers:

1. **Which sandbox recipe applies?** Profile selection and version resolution.
2. **Which backend fulfills it?** Extension routing (self-hosted vs SaaS).
3. **Where does it run?** Host placement and blast-radius policy.
4. **How is isolation enforced?** Network default-deny, TTL, resource limits,
   teardown guarantees — as declared by profile + extension.
5. **How does it become reachable?** Consumer integration with ops-bridge and
   ops-warden — without owning tunnels or certificates.
6. **What happened?** Lifecycle events, usage meters, State Hub registration.
7. **What did it cost?** Payments and credits for metered extensions.

It must **not** become the agent harness, the e2e validator, the code
generator, the scheduler, the work-state database, the connectivity authority,
or production hosting on Railiance01.

---

## The OpenRouter analogy

| OpenRouter | sand-boxer |
|------------|------------|
| Unified LLM access API | Unified sandbox establishment API |
| Routes across model providers | Routes across sandbox extensions |
| Provider metadata (price, context) | Profile metadata (isolation, cost, latency) |
| API keys, credits, usage billing | Payments layer for SaaS sandbox consumption |
| BYOK supported | BYOK for extension provider keys |
| Does not train models | Does not replace extension runtimes (until phase 5) |

sand-boxer is **infrastructure routing**, not product UX. Harnesses,
validators, and inventors are customers.

---

## Coulomb sibling boundaries

sand-boxer stays inside the **sandboxing boundary**. Three sibling Coulomb
projects own adjacent concerns. Integration is contractual — they **request**
sandboxes; sand-boxer **establishes** them.

### glas-harness — agent harness

**Owns:** Gateway, tool orchestration, skills, memory, channels, subagent
delegation, session semantics, sandbox *consumption* from the agent's
perspective.

**Does not own:** Sandbox runtimes, profile catalog authority, host placement,
extension adapters, isolation enforcement.

glas-harness configures *when* tools run in a sandbox (OpenClaw-style `mode` /
`scope` / `workspaceAccess`). sand-boxer provides the sandbox handle and
reachability descriptor.

### wise-validator — e2e test and health

**Owns:** Validation workflows, health check semantics, test orchestration,
pass/fail interpretation, structured result reporting to State Hub and CI.

**Does not own:** Remote host provisioning, compose lifecycle, port isolation,
sandbox teardown.

wise-validator replaces the validation half of `the-custodian/e2e-framework/`.
It requests `profile.compose-e2e` (or successors), runs tests inside the
established environment, and owns the `e2e.yml` contract.

### snuggle-inventor — code generation

**Owns:** Code generation, modernization pipelines, tech-spec and planning
artifacts, PR-oriented output, human-in-the-loop review gates.

**Does not own:** Sandbox infrastructure, environment bootstrapping authority,
secret stores, runtime metering.

snuggle-inventor may attach Blitzy-style **setup instructions** and secret
references as profile inputs. sand-boxer resolves secrets at the provision
boundary; generated code never transits sand-boxer APIs.

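
The secret boundary described for snuggle-inventor can be sketched in a few lines of Python. This is an illustration only: the `secret://` reference scheme, the `resolve_secret_refs` helper, the `ProvisionPlan` type, and the in-memory store are hypothetical names that do not come from this charter. The sketch assumes that resolved values go only to the extension, while the requesting project gets back the references it sent.

```python
from dataclasses import dataclass

# Hypothetical reference scheme; the charter does not define one.
SECRET_PREFIX = "secret://"


@dataclass(frozen=True)
class ProvisionPlan:
    env_for_extension: dict  # resolved values, handed only to the extension
    consumer_view: dict      # what the requesting project gets back


def resolve_secret_refs(inputs: dict, store: dict) -> ProvisionPlan:
    """Resolve secret references at the provision boundary (assumed behavior)."""
    resolved, visible = {}, {}
    for key, value in inputs.items():
        if isinstance(value, str) and value.startswith(SECRET_PREFIX):
            name = value[len(SECRET_PREFIX):]
            if name not in store:
                raise KeyError(f"unknown secret reference: {value}")
            resolved[key] = store[name]
            visible[key] = value  # keep the reference, never the value
        else:
            resolved[key] = value
            visible[key] = value
    return ProvisionPlan(resolved, visible)
```

A caller would pass profile inputs such as `{"NPM_TOKEN": "secret://npm-token"}` and read only `consumer_view`, so the secret value never appears in anything returned to the harness or the inventor.
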
### Boundary diagram

```
 glas-harness         wise-validator        snuggle-inventor
(agent harness)       (e2e + health)        (code generation)
       │                     │                      │
       └─────────────────────┼──────────────────────┘
                             │  POST /v1/sandboxes
                             ▼
                        sand-boxer
                   (establish sandboxes)
                             │
             ┌───────────────┼───────────────┐
             ▼               ▼               ▼
      ext.compose-ssh    ext.modal        ext.e2b …
       (self-hosted)   (SaaS+meter)    (SaaS+meter)
```

### Existing Custodian repos (unchanged)

| Concern | Owner |
|---------|--------|
| Workstream, task, progress state | `state-hub` |
| Cron and orchestration | `activity-core` |
| SSH reverse tunnels | `ops-bridge` |
| SSH certificate issuance | `ops-warden` |
| Canon and agent instruction canon | `the-custodian` |
| Capability federation hub | `reuse-surface` |
| Production on Railiance01 | `railiance-apps` / domain repos |
| ADR-001 reconciliation | `state-hub` |

sand-boxer **consumes** ops-bridge and ops-warden; it does not subsume them.

---

## What it is

sand-boxer is a **meta-framework** with four pillars:

### 1. Unified establishment API

One consistent surface for all sandbox variations:

- Create, inspect, extend, snapshot, recreate, destroy
- Profile-driven inputs (repo ref, compose bundle, setup metadata, secret refs)
- Consumer attribution (`adm` / `agt` / `atm` + calling project id)
- Lifecycle states: `requested → provisioning → ready → active → expired → destroyed`

Early versions may expose a subset; the API shape is designed for completeness.

### 2. Profile catalog

Named, versioned recipes — not one-off containers:

- Extension binding (`ext.compose-ssh`, `ext.vm-packer`, `ext.e2b`, …)
- Isolation level, network policy, workspace mode (`mirror` | `remote-canonical`)
- Scope default (`agent` | `session` | `shared`)
- TTL, resource limits, placement preference
- Setup metadata (natural-language bootstrap instructions for extensions)
- Registered in `registry/` and federated via reuse-surface

Profiles collect good ideas from OpenClaw (backend/scope/workspace), Hermes
(labeled reuse, resource limits), Blitzy (setup instructions, secret boundary),
and hosted platforms (checkpoint, persistence classes) into **one schema**.

### 3. Extension platform

Extensions **delegate** to sandbox systems and services:

| Class | Examples | Billing |
|-------|----------|---------|
| **Self-hosted** | compose-ssh, vm-packer, Daytona OSS, OpenShell | Infra allocation |
| **SaaS consumption** | E2B, Modal, Daytona cloud, future providers | Payments layer |

Each extension implements a provision / ready / teardown contract (optional
snapshot / cost estimate). Extensions ship as plugins; third-party and
Coulomb-native backends use the same interface.

### 4. Payments and metering

For metered SaaS extensions:

- Org/workspace credits and usage accounting
- Pre-create cost estimates; post-destroy actuals
- BYOK for provider API keys where supported
- Export to domain billing systems — sand-boxer meters sandbox consumption,
  not general payments

Self-hosted extensions record **allocation** (host, duration), not external
spend.

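
To make the pillars above concrete, here is a minimal Python sketch of the lifecycle states and the provision / ready / teardown contract. Only the state names come from this charter. The `Extension` protocol, `SandboxHandle`, `FakeComposeSsh`, `establish`, and the rule that any state before `destroyed` may jump straight to `destroyed` are illustrative assumptions, not the sand-boxer API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Protocol


class State(str, Enum):
    # State names are taken from the charter's lifecycle list.
    REQUESTED = "requested"
    PROVISIONING = "provisioning"
    READY = "ready"
    ACTIVE = "active"
    EXPIRED = "expired"
    DESTROYED = "destroyed"


# Forward path from the charter. Assumption: teardown may be forced from any
# state before "destroyed", so a failed provision can still be cleaned up.
_ORDER = list(State)
ALLOWED = {s: {_ORDER[i + 1], State.DESTROYED} for i, s in enumerate(_ORDER[:-1])}
ALLOWED[State.DESTROYED] = set()


@dataclass
class SandboxHandle:
    sandbox_id: str
    profile: str
    state: State = State.REQUESTED
    history: list = field(default_factory=list)

    def transition(self, new: State) -> None:
        """Move to a new state, recording the step so it stays queryable."""
        if new not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new.value}")
        self.history.append((self.state, new))
        self.state = new


class Extension(Protocol):
    """Hypothetical adapter contract: provision / ready / teardown."""

    def provision(self, handle: SandboxHandle) -> None: ...
    def ready(self, handle: SandboxHandle) -> bool: ...
    def teardown(self, handle: SandboxHandle) -> None: ...


class FakeComposeSsh:
    """In-memory stand-in for a self-hosted extension; performs no real work."""

    def __init__(self) -> None:
        self.live = set()

    def provision(self, handle: SandboxHandle) -> None:
        self.live.add(handle.sandbox_id)

    def ready(self, handle: SandboxHandle) -> bool:
        return handle.sandbox_id in self.live

    def teardown(self, handle: SandboxHandle) -> None:
        self.live.discard(handle.sandbox_id)


def establish(ext: Extension, sandbox_id: str, profile: str) -> SandboxHandle:
    """Drive one sandbox from requested to ready, tearing down on failure."""
    handle = SandboxHandle(sandbox_id, profile)
    handle.transition(State.PROVISIONING)
    ext.provision(handle)
    if not ext.ready(handle):
        ext.teardown(handle)
        handle.transition(State.DESTROYED)
        return handle
    handle.transition(State.READY)
    return handle
```

Calling `establish(FakeComposeSsh(), "sbx-1", "profile.compose-e2e")` returns a handle in the `ready` state. Because the core only talks to the protocol, a SaaS adapter would slot in without changes to `establish`, which is the point of routing through extensions.
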

---

## What it is not

| Concern | Owner | sand-boxer role |
|---------|--------|-----------------|
| Agent gateway, tools, memory, channels | **glas-harness** | Customer API |
| E2e tests, health checks, validation | **wise-validator** | Customer API |
| Code generation, tech specs, AAP | **snuggle-inventor** | Customer API |
| When work runs | `activity-core` | None |
| What tasks exist | `state-hub` | Registers lifecycle only |
| Tunnels | `ops-bridge` | Consumer |
| Certs | `ops-warden` | Consumer |
| Intent-aware egress / prompt security | Research frontier | Document limits only |

sand-boxer provides **blast-radius isolation and governed reachability**. It
does not protect against a compromised agent abusing **allowed** egress paths
(git, npm, curl to allowlisted hosts). Security runbooks must state this
explicitly.

---

## Strategic context

### Workstation automation is interim

Local timers and laptop scripts bootstrapped ADR-001 sync. Railiance01
activity-core schedules are the direction. Workstation paths remain only where
no sandbox alternative exists yet.

### Host topology

| Layer | Role |
|-------|------|
| **Railiance01** | Production k3s, activity-core, Temporal — **not** agent dev runtime |
| **sandboxer01** | Dedicated sandbox host — preferred blast-radius isolation |
| **CoulombCore** | Interim sandbox host during migration |
| **Workstation (WSL)** | Control-plane anchor today — **not** target execution surface |
| **SaaS extensions** | Burst / capability gap (GPU, desktop) via payments layer |

### Lineage

sand-boxer generalizes patterns split across `the-custodian`:

| Legacy | sand-boxer | Sibling |
|--------|------------|---------|
| `e2e-framework/` provision/teardown | `ext.compose-ssh` | wise-validator owns test run |
| `e2e-framework/` health + test + report | — | wise-validator |
| `infra/build-machines/` | `ext.vm-packer` | — |
| Agent sandbox config (future) | API consumer | glas-harness |

`the-custodian` stays governance-focused; sand-boxer becomes the execution
venue catalog.

### Phase 5: Coulomb-native runtime (later)

After operating extensions in production — observing latency, cost, failure
modes, isolation gaps — sand-boxer may ship an owned **best-of-brands**
sandboxing solution combining:

- Persistent labeled workspaces (Hermes pattern)
- Default-deny policy layer (OpenShell lessons)
- Fast resume / checkpoint (industry baseline)
- Self-hosted economics (Daytona/OpenSandbox lessons)

This is **not** v1 scope. Extensions and payments come first; native runtime
follows evidence.


---

## Intended users

- **Human operators (`adm`)** — profiles, hosts, extensions, credits, lifecycle
- **LLM agents (`agt`)** — via glas-harness, snuggle-inventor, or direct API
- **Deterministic automations (`atm`)** — via wise-validator, activity-core, CI
- **Extension authors** — implement backend adapters against the extension contract
- **Platform integrators** — register capabilities, federate via reuse-surface

---

## Design principles

- **Meta-framework, not monolith** — one API; many extensions; optional native runtime later
- **Profiles over one-offs** — every sandbox type is named, versioned, registered
- **Prefer self-hosted** — SaaS via explicit routing policy, not silent default
- **Blast-radius isolation** — dedicated hosts; never jeopardize Railiance01 production
- **Reachability, not ownership** — consume ops-bridge + ops-warden; never own tunnels or certificates
- **Secrets at the boundary** — resolve at provision; never in agent-visible workspace
- **Observable lifecycle** — every state transition attributable and queryable
- **Disposable by default** — TTL-bound; persistence and checkpoint are explicit
- **Honest security** — sandboxing limits blast radius; it is not intent enforcement
- **Registry-first reuse** — capabilities in `registry/` before ad hoc duplication
- **Payments transparency** — estimate before create; meter on destroy for SaaS

---

## Near-term outcomes

1. **Charter and research** — `INTENT.md`, `research/`, profile schema draft
2. **First self-hosted extension** — `ext.compose-ssh` from e2e-framework lineage
3. **Unified API v0** — create / get / destroy / recreate + State Hub registration
4. **First profile** — `profile.compose-e2e` for wise-validator migration
5. **Registry entry** — `capability.execution.sandbox-provision` via reuse-surface
6. **Extension SDK sketch** — contract for P1 backends (vm-packer, Daytona OSS)
7. **Sibling integration notes** — glas-harness, wise-validator, snuggle-inventor
   API expectations documented

---

## Maturity target

A mature sand-boxer is Coulomb's **default way to establish any sandbox**:

- glas-harness requests agent dev sandboxes without choosing Docker vs Modal vs SSH
- wise-validator requests validation environments without owning provisioners
- snuggle-inventor requests build sandboxes with setup metadata and secret refs
- activity-core and CI request bounded venues with consistent lifecycle visibility
- Operators route spend across self-hosted and SaaS with one credits model
- A Coulomb-native runtime — if warranted — wins on ops data, not speculation

The workstation is optional. The harness is not sand-boxer. The validator is
not sand-boxer. The inventor is not sand-boxer. **Establishing the box is.**