---
domain: infotech
repo: sand-boxer
updated: "2026-06-22"
---

# INTENT

> sand-boxer is the Coulomb **meta-framework for establishing sandboxes** — a
> unified API and extension platform for provisioning every variation of isolated
> execution environment, from self-hosted compose stacks to metered SaaS
> runtimes. This file is the charter: why it exists, what it owns, and where
> sibling projects begin.

Research backing this charter lives in `research/`.

---

## Why it exists

Custodian automation is moving from **workstation-anchored** execution to
**Railiance01-scheduled** orchestration. That shift improves reliability but does
not, by itself, answer the harder question: **where can agentic and deterministic
work run safely** without the laptop filesystem, sleep cycles, and single-user
blast radius?

The industry has exploded with sandbox answers — E2B, Modal, Daytona, OpenShell,
OpenClaw-style Docker/SSH backends, hyperscaler interpreters — each with
different APIs, billing models, and isolation postures. Coulomb needs **one place
to establish sandboxes** regardless of backend, not a new integration per agent
harness, validator, or codegen pipeline.

sand-boxer exists to be that place: **OpenRouter for sandboxes, not for models.**

Consumers call one API. Extensions delegate to the sandbox system that fits —
self-hosted on sandboxer01, inherited compose-ssh from `the-custodian`, or a
metered cloud provider. An integrated **payments layer** handles SaaS consumption
when Coulomb uses external capacity. Over time, operational learning may justify
a Coulomb-native **best-of-brands runtime** — but that is a later phase built on
evidence, not day-one ambition.

The workstation becomes optional for **runtime**. Railiance01 decides *when*
work runs (via activity-core). sand-boxer decides *where* isolated execution
happens. State Hub records *what* changed.

---

## The governing principle

sand-boxer is the **sandbox establishment service** — profiles, provisioning,
extension routing, placement, lifecycle, and metering. Nothing more.

It answers:

1. **Which sandbox recipe applies?** Profile selection and version resolution.
2. **Which backend fulfills it?** Extension routing (self-hosted vs SaaS).
3. **Where does it run?** Host placement and blast-radius policy.
4. **How is isolation enforced?** Network default-deny, TTL, resource limits,
   teardown guarantees — as declared by profile + extension.
5. **How does it become reachable?** Consumer integration with ops-bridge and
   ops-warden — without owning tunnels or certificates.
6. **What happened?** Lifecycle events, usage meters, State Hub registration.
7. **What did it cost?** Payments and credits for metered extensions.

It must **not** become the agent harness, the e2e validator, the code generator,
the scheduler, the work-state database, the connectivity authority, or production
hosting on Railiance01.

---

## The OpenRouter analogy

| OpenRouter | sand-boxer |
|------------|------------|
| Unified LLM access API | Unified sandbox establishment API |
| Routes across model providers | Routes across sandbox extensions |
| Provider metadata (price, context) | Profile metadata (isolation, cost, latency) |
| API keys, credits, usage billing | Payments layer for SaaS sandbox consumption |
| BYOK supported | BYOK for extension provider keys |
| Does not train models | Does not replace extension runtimes (until phase 5) |

sand-boxer is **infrastructure routing**, not product UX. Harnesses, validators,
and inventors are customers.

---

## Coulomb sibling boundaries

sand-boxer stays inside the **sandboxing boundary**. Three sibling Coulomb
projects own adjacent concerns. Integration is contractual — they **request**
sandboxes; sand-boxer **establishes** them.

### glas-harness — agent harness

**Owns:** Gateway, tool orchestration, skills, memory, channels, subagent
delegation, session semantics, sandbox *consumption* from the agent's perspective.

**Does not own:** Sandbox runtimes, profile catalog authority, host placement,
extension adapters, isolation enforcement.

glas-harness configures *when* tools run in a sandbox (OpenClaw-style
`mode` / `scope` / `workspaceAccess`). sand-boxer provides the sandbox handle
and reachability descriptor.
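
As a rough illustration of what a sandbox handle and reachability descriptor might carry, the sketch below models both as plain records. Every field name (`sandbox_id`, `host`, `port`, `tunnel_ref`, `cert_ref`) and the `endpoint` helper are assumptions made for illustration; the charter does not define the shape of either object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReachabilityDescriptor:
    """Hypothetical description of how a consumer reaches a sandbox."""
    host: str        # address the consumer connects to
    port: int        # port exposed for the session
    tunnel_ref: str  # assumed reference to an ops-bridge tunnel
    cert_ref: str    # assumed reference to an ops-warden certificate


@dataclass(frozen=True)
class SandboxHandle:
    """Hypothetical handle returned to glas-harness after establishment."""
    sandbox_id: str
    profile: str
    reachability: ReachabilityDescriptor


def endpoint(handle: SandboxHandle) -> str:
    """Format the host:port pair a harness would connect to."""
    r = handle.reachability
    return f"{r.host}:{r.port}"


handle = SandboxHandle(
    sandbox_id="sbx-example",
    profile="profile.agent-dev",  # illustrative profile name, not from the catalog
    reachability=ReachabilityDescriptor("127.0.0.1", 2222, "tunnel-example", "cert-example"),
)
print(endpoint(handle))
```

The point of the split is that the harness only reads references to tunnels and certificates; issuing them stays with ops-bridge and ops-warden.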

### wise-validator — e2e test and health

**Owns:** Validation workflows, health check semantics, test orchestration,
pass/fail interpretation, structured result reporting to State Hub and CI.

**Does not own:** Remote host provisioning, compose lifecycle, port isolation,
sandbox teardown.

wise-validator replaces the validation half of `the-custodian/e2e-framework/`.
It requests `profile.compose-e2e` (or successors), runs tests inside the
established environment, and owns the `e2e.yml` contract.

### snuggle-inventor — code generation

**Owns:** Code generation, modernization pipelines, tech-spec and planning
artifacts, PR-oriented output, human-in-the-loop review gates.

**Does not own:** Sandbox infrastructure, environment bootstrapping authority,
secret stores, runtime metering.

snuggle-inventor may attach Blitzy-style **setup instructions** and secret
references as profile inputs. sand-boxer resolves secrets at the provision
boundary; generated code never transits sand-boxer APIs.

### Boundary diagram

```
 glas-harness         wise-validator        snuggle-inventor
(agent harness)       (e2e + health)        (code generation)
       │                     │                      │
       └─────────────────────┼──────────────────────┘
                             │ POST /v1/sandboxes
                             ▼
                        sand-boxer
                   (establish sandboxes)
                             │
             ┌───────────────┼───────────────┐
             ▼               ▼               ▼
      ext.compose-ssh    ext.modal        ext.e2b …
       (self-hosted)   (SaaS+meter)    (SaaS+meter)
```
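
To make the `POST /v1/sandboxes` arrow more concrete, here is a minimal sketch of how a sibling project might assemble a create request. Only the path and the `adm` / `agt` / `atm` attribution kinds come from this charter; the body field names (`profile`, `consumer`, `inputs`) and the `build_create_request` helper are assumptions, not a defined wire format.

```python
import json
from typing import Optional

# Attribution kinds named in the charter: operator, LLM agent, automation.
ACTOR_KINDS = {"adm", "agt", "atm"}


def build_create_request(profile: str, actor_kind: str, project_id: str,
                         inputs: Optional[dict] = None) -> dict:
    """Assemble a hypothetical create-sandbox request without sending it."""
    if actor_kind not in ACTOR_KINDS:
        raise ValueError(f"unknown actor kind: {actor_kind!r}")
    body = {
        "profile": profile,                                       # named, versioned recipe
        "consumer": {"kind": actor_kind, "project": project_id},  # attribution
        "inputs": inputs or {},                                   # repo ref, secret refs, ...
    }
    return {
        "method": "POST",
        "path": "/v1/sandboxes",
        "body": json.dumps(body, sort_keys=True),
    }


request = build_create_request(
    "profile.compose-e2e", "atm", "wise-validator", {"repo_ref": "main"}
)
print(request["method"], request["path"])
print(request["body"])
```

The request names a profile rather than a backend, which keeps the choice of extension on the sand-boxer side of the boundary.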

### Existing Custodian repos (unchanged)

| Concern | Owner |
|---------|--------|
| Workstream, task, progress state | `state-hub` |
| Cron and orchestration | `activity-core` |
| SSH reverse tunnels | `ops-bridge` |
| SSH certificate issuance | `ops-warden` |
| Canon and agent instruction canon | `the-custodian` |
| Capability federation hub | `reuse-surface` |
| Production on Railiance01 | `railiance-apps` / domain repos |
| ADR-001 reconciliation | `state-hub` |

sand-boxer **consumes** ops-bridge and ops-warden; it does not subsume them.

---

## What it is

sand-boxer is a **meta-framework** with four pillars:

### 1. Unified establishment API

One consistent surface for all sandbox variations:

- Create, inspect, extend, snapshot, recreate, destroy
- Profile-driven inputs (repo ref, compose bundle, setup metadata, secret refs)
- Consumer attribution (`adm` / `agt` / `atm` + calling project id)
- Lifecycle states: `requested → provisioning → ready → active → expired → destroyed`

Early versions may expose a subset; the API shape is designed for completeness.
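
The lifecycle chain above can be sketched as a small state table. This assumes transitions are strictly linear, exactly as the arrow notation reads; whether a sandbox may jump straight to `destroyed` from an earlier state is not specified here, so the sketch rejects it. The helper names `next_state` and `advance` are hypothetical.

```python
from typing import Optional

# Lifecycle states in the order given by the charter.
LIFECYCLE = ("requested", "provisioning", "ready", "active", "expired", "destroyed")


def next_state(state: str) -> Optional[str]:
    """Return the state that follows `state`, or None for the terminal state."""
    index = LIFECYCLE.index(state)  # raises ValueError for unknown states
    if index + 1 < len(LIFECYCLE):
        return LIFECYCLE[index + 1]
    return None


def advance(current: str, target: str) -> str:
    """Allow only the single forward step; reject skips and reversals."""
    if next_state(current) != target:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


state = "requested"
for target in LIFECYCLE[1:]:
    state = advance(state, target)
print(state)
```

A real service would likely allow early teardown and failure states; the table form makes such additions a data change rather than a control-flow change.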

### 2. Profile catalog

Named, versioned recipes — not one-off containers:

- Extension binding (`ext.compose-ssh`, `ext.vm-packer`, `ext.e2b`, …)
- Isolation level, network policy, workspace mode (`mirror` | `remote-canonical`)
- Scope default (`agent` | `session` | `shared`)
- TTL, resource limits, placement preference
- Setup metadata (natural-language bootstrap instructions for extensions)
- Registered in `registry/` and federated via reuse-surface

Profiles collect good ideas from OpenClaw (backend/scope/workspace), Hermes
(labeled reuse, resource limits), Blitzy (setup instructions, secret boundary),
and hosted platforms (checkpoint, persistence classes) into **one schema**.
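
One possible shape for such a profile record is sketched below. The workspace modes and scope values are taken from the list above; the field names, the `Profile` class itself, and the example values for isolation, network policy, and limits are assumptions pending the actual profile schema draft.

```python
from dataclasses import dataclass, field

WORKSPACE_MODES = {"mirror", "remote-canonical"}
SCOPES = {"agent", "session", "shared"}


@dataclass(frozen=True)
class Profile:
    """Hypothetical named, versioned sandbox recipe."""
    name: str
    version: str
    extension: str        # extension binding, e.g. "ext.compose-ssh"
    isolation: str        # isolation level label (values not defined here)
    network_policy: str   # e.g. "default-deny" (illustrative)
    workspace_mode: str
    scope: str
    ttl_seconds: int
    resource_limits: dict = field(default_factory=dict)
    placement: str = ""   # placement preference, e.g. a host name
    setup: str = ""       # natural-language bootstrap instructions

    def __post_init__(self) -> None:
        # Reject values outside the enumerations listed in the charter.
        if self.workspace_mode not in WORKSPACE_MODES:
            raise ValueError(f"unknown workspace mode: {self.workspace_mode!r}")
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope!r}")
        if self.ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be positive")


compose_e2e = Profile(
    name="profile.compose-e2e",
    version="0.1.0",               # illustrative version
    extension="ext.compose-ssh",
    isolation="container",         # illustrative value
    network_policy="default-deny",
    workspace_mode="mirror",
    scope="session",
    ttl_seconds=3600,
    resource_limits={"cpus": 2, "memory_mb": 4096},
    placement="sandboxer01",
)
print(compose_e2e.name, compose_e2e.version)
```

Validating at construction time means a malformed profile fails when it is registered, not when a consumer first requests it.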

### 3. Extension platform

Extensions **delegate** to sandbox systems and services:

| Class | Examples | Billing |
|-------|----------|---------|
| **Self-hosted** | compose-ssh, vm-packer, Daytona OSS, OpenShell | Infra allocation |
| **SaaS consumption** | E2B, Modal, Daytona cloud, future providers | Payments layer |

Each extension implements a provision / ready / teardown contract (optional
snapshot / cost estimate). Extensions ship as plugins; third-party and
Coulomb-native backends use the same interface.
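
A minimal sketch of the provision / ready / teardown contract, written as a structural interface. The method signatures and the `InMemoryExtension` stand-in are assumptions for illustration; the optional snapshot and cost-estimate hooks are left out, and the real extension SDK may look quite different.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Extension(Protocol):
    """Hypothetical contract every backend adapter would satisfy."""

    def provision(self, profile: str, inputs: dict) -> str:
        """Start a sandbox and return its identifier."""
        ...

    def ready(self, sandbox_id: str) -> bool:
        """Report whether the sandbox can accept work."""
        ...

    def teardown(self, sandbox_id: str) -> None:
        """Destroy the sandbox and release its resources."""
        ...


class InMemoryExtension:
    """Toy backend that tracks sandboxes in a set instead of starting anything."""

    def __init__(self) -> None:
        self._live = set()
        self._counter = 0

    def provision(self, profile: str, inputs: dict) -> str:
        self._counter += 1
        sandbox_id = f"mem-{self._counter}"
        self._live.add(sandbox_id)
        return sandbox_id

    def ready(self, sandbox_id: str) -> bool:
        return sandbox_id in self._live

    def teardown(self, sandbox_id: str) -> None:
        self._live.discard(sandbox_id)  # idempotent: unknown ids are ignored


ext = InMemoryExtension()
sid = ext.provision("profile.compose-e2e", {})
print(sid, ext.ready(sid))
ext.teardown(sid)
print(ext.ready(sid))
```

Because the interface is structural, a third-party backend would not need to inherit from a sand-boxer base class to qualify as an extension.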

### 4. Payments and metering

For metered SaaS extensions:

- Org/workspace credits and usage accounting
- Pre-create cost estimates; post-destroy actuals
- BYOK for provider API keys where supported
- Export to domain billing systems — sand-boxer meters sandbox consumption,
  not general payments

Self-hosted extensions record **allocation** (host, duration), not external spend.
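
The estimate-then-actuals flow for metered extensions could be modelled as a reserve-and-settle ledger, sketched below. The `CreditLedger` class, its method names, and the use of a single floating-point credit unit are all assumptions; the charter does not specify an accounting model.

```python
from dataclasses import dataclass, field


@dataclass
class CreditLedger:
    """Hypothetical per-workspace credit account for metered sandboxes."""
    balance: float
    reserved: dict = field(default_factory=dict)

    @property
    def available(self) -> float:
        """Credits not yet committed to a running sandbox."""
        return self.balance - sum(self.reserved.values())

    def reserve(self, sandbox_id: str, estimate: float) -> None:
        """Hold the pre-create estimate; refuse if it exceeds available credits."""
        if estimate > self.available:
            raise ValueError("insufficient credits for estimate")
        self.reserved[sandbox_id] = estimate

    def settle(self, sandbox_id: str, actual: float) -> None:
        """Release the hold and charge the post-destroy actual."""
        self.reserved.pop(sandbox_id)
        self.balance -= actual


ledger = CreditLedger(balance=10.0)
ledger.reserve("sbx-1", 4.0)   # estimate before create
ledger.settle("sbx-1", 3.5)    # actual after destroy
print(ledger.balance)          # 6.5
```

Under this model a self-hosted extension would bypass the ledger and write an allocation record (host, duration) instead.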

---

## What it is not

| Concern | Owner | sand-boxer role |
|---------|--------|-----------------|
| Agent gateway, tools, memory, channels | **glas-harness** | Customer API |
| E2e tests, health checks, validation | **wise-validator** | Customer API |
| Code generation, tech specs, AAP | **snuggle-inventor** | Customer API |
| When work runs | `activity-core` | None |
| What tasks exist | `state-hub` | Registers lifecycle only |
| Tunnels | `ops-bridge` | Consumer |
| Certs | `ops-warden` | Consumer |
| Intent-aware egress / prompt security | Research frontier | Document limits only |

sand-boxer provides **blast-radius isolation and governed reachability**. It does
not protect against a compromised agent abusing **allowed** egress paths (git,
npm, curl to allowlisted hosts). Security runbooks must state this explicitly.

---

## Strategic context

### Workstation automation is interim

Local timers and laptop scripts bootstrapped ADR-001 sync. Railiance01
activity-core schedules are the direction. Workstation paths remain only where no
sandbox alternative exists yet.

### Host topology

| Layer | Role |
|-------|------|
| **Railiance01** | Production k3s, activity-core, Temporal — **not** agent dev runtime |
| **sandboxer01** | Dedicated sandbox host — preferred blast-radius isolation |
| **CoulombCore** | Interim sandbox host during migration |
| **Workstation (WSL)** | Control-plane anchor today — **not** target execution surface |
| **SaaS extensions** | Burst / capability gap (GPU, desktop) via payments layer |

### Lineage

sand-boxer generalizes patterns split across `the-custodian`:

| Legacy | sand-boxer | Sibling |
|--------|------------|---------|
| `e2e-framework/` provision/teardown | `ext.compose-ssh` | wise-validator owns test run |
| `e2e-framework/` health + test + report | — | wise-validator |
| `infra/build-machines/` | `ext.vm-packer` | — |
| Agent sandbox config (future) | API consumer | glas-harness |

`the-custodian` stays governance-focused; sand-boxer becomes the execution
venue catalog.

### Phase 5: Coulomb-native runtime (later)

After operating extensions in production — observing latency, cost, failure
modes, isolation gaps — sand-boxer may ship an owned **best-of-brands**
sandboxing solution combining:

- Persistent labeled workspaces (Hermes pattern)
- Default-deny policy layer (OpenShell lessons)
- Fast resume / checkpoint (industry baseline)
- Self-hosted economics (Daytona/OpenSandbox lessons)

This is **not** v1 scope. Extensions and payments come first; native runtime
follows evidence.

---

## Intended users

- **Human operators (`adm`)** — profiles, hosts, extensions, credits, lifecycle
- **LLM agents (`agt`)** — via glas-harness, snuggle-inventor, or direct API
- **Deterministic automations (`atm`)** — via wise-validator, activity-core, CI
- **Extension authors** — implement backend adapters against the extension contract
- **Platform integrators** — register capabilities, federate via reuse-surface

---

## Design principles

- **Meta-framework, not monolith** — one API; many extensions; optional native runtime later
- **Profiles over one-offs** — every sandbox type is named, versioned, registered
- **Prefer self-hosted** — SaaS via explicit routing policy, not silent default
- **Blast-radius isolation** — dedicated hosts; never jeopardize Railiance01 production
- **Reachability, not ownership** — sand-boxer consumes ops-bridge and ops-warden; it does not own tunnels or certificates
- **Secrets at the boundary** — resolve at provision; never in agent-visible workspace
- **Observable lifecycle** — every state transition attributable and queryable
- **Disposable by default** — TTL-bound; persistence and checkpoint are explicit
- **Honest security** — sandboxing limits blast radius; it is not intent enforcement
- **Registry-first reuse** — capabilities in `registry/` before ad hoc duplication
- **Payments transparency** — estimate before create; meter on destroy for SaaS
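
The "prefer self-hosted" principle can be illustrated with a small routing function: self-hosted candidates win whenever one is available, and SaaS is considered only when the caller opts in. The `Candidate` record, the `route` function, and the `allow_saas` flag are hypothetical names; the actual routing policy format is not defined in this charter.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical routing candidate for one extension."""
    extension: str
    self_hosted: bool
    available: bool


def route(candidates: Iterable[Candidate], allow_saas: bool = False) -> str:
    """Pick an extension: self-hosted first, SaaS only by explicit policy."""
    pool = [c for c in candidates if c.available]
    for candidate in pool:
        if candidate.self_hosted:
            return candidate.extension
    if allow_saas:
        for candidate in pool:
            if not candidate.self_hosted:
                return candidate.extension
    raise LookupError("no extension satisfies the routing policy")


candidates = [
    Candidate("ext.e2b", self_hosted=False, available=True),
    Candidate("ext.compose-ssh", self_hosted=True, available=True),
]
print(route(candidates))
```

Raising when only SaaS capacity remains and `allow_saas` is false keeps external spend from becoming a silent default.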

---

## Near-term outcomes

1. **Charter and research** — `INTENT.md`, `research/`, profile schema draft
2. **First self-hosted extension** — `ext.compose-ssh` from e2e-framework lineage
3. **Unified API v0** — create / get / destroy / recreate + State Hub registration
4. **First profile** — `profile.compose-e2e` for wise-validator migration
5. **Registry entry** — `capability.execution.sandbox-provision` via reuse-surface
6. **Extension SDK sketch** — contract for P1 backends (vm-packer, Daytona OSS)
7. **Sibling integration notes** — glas-harness, wise-validator, snuggle-inventor API expectations documented

---

## Maturity target

A mature sand-boxer is Coulomb's **default way to establish any sandbox**:

- glas-harness requests agent dev sandboxes without choosing Docker vs Modal vs SSH
- wise-validator requests validation environments without owning provisioners
- snuggle-inventor requests build sandboxes with setup metadata and secret refs
- activity-core and CI request bounded venues with consistent lifecycle visibility
- Operators route spend across self-hosted and SaaS with one credits model
- A Coulomb-native runtime — if warranted — wins on ops data, not speculation

The workstation is optional. The harness is not sand-boxer. The validator is not
sand-boxer. The inventor is not sand-boxer. **Establishing the box is.**