# Meta-framework synthesis

Design notes distilled from landscape research for sand-boxer's unified sandbox
API, extension model, payments layer, and Coulomb project boundaries.


---

## Core thesis

sand-boxer is a **meta-framework for establishing sandboxes**, in the way that
OpenRouter is a meta-framework for accessing LLMs:

- One **consistent API** for consumers (`adm`, `agt`, `atm`, domain services)
- Many **extensions** that delegate to self-hosted or SaaS sandbox systems
- **Integrated payments** when consuming metered external services
- **Registry-first** profiles and capabilities via reuse-surface
- **Later:** a Coulomb-native "best of brands" runtime built from operational
  experience, not on day one

sand-boxer provisions **where and how code runs**. It does not provision **how
agents think**, **what tests mean**, or **what code gets written**.


---

## Coulomb project boundaries

These sibling projects are **planned Coulomb repos** with an explicit split of
authority. sand-boxer must not absorb their concerns.

```mermaid
flowchart LR
  subgraph establish [sand-boxer]
    SB[Establish sandbox]
  end

  subgraph harness [glas-harness]
    GH[Agent harness: gateway tools memory channels]
  end

  subgraph validate [wise-validator]
    WV[E2E tests health checks validation orchestration]
  end

  subgraph generate [snuggle-inventor]
    SI[Code generation modernization]
  end

  GH -->|request sandbox| SB
  WV -->|request sandbox| SB
  SI -->|request sandbox| SB
  WV -.->|runs tests in| SB
  GH -.->|executes tools in| SB
  SI -.->|validates output in| SB
```

| Project | Owns | Does not own |
|---------|------|--------------|
| **sand-boxer** | Sandbox profiles, provision/teardown, extension routing, placement, lifecycle registration, payments for sandbox consumption | Agent memory, channels, tool policies, test definitions, code generation |
| **glas-harness** | Agent gateway, harness, skills, subagents, tool orchestration, channel bridges | Sandbox runtime, isolation enforcement, host placement |
| **wise-validator** | E2E test orchestration, health check semantics, validation workflows, result reporting | Sandbox provisioning, agent conversation state |
| **snuggle-inventor** | Code generation, tech specs, AAP-style planning, PR-oriented output | Sandbox infrastructure, test harness canon |

### Integration contracts (intended)

**glas-harness → sand-boxer**

```
POST /v1/sandboxes
  profile: "profile.agent-dev"
  scope: session | agent | shared
  workspace: { mode: mirror | remote, access: none | ro | rw }
  consumer: { actor: agt, harness: glas-harness, session_id }
```

The harness receives a `sandbox_id`, a reachability descriptor (SSH endpoint,
tunnel ref), and a lifecycle webhook or poll URL. It executes tools **inside**
the sandbox via an agreed exec channel; sand-boxer does not parse tool calls.

**wise-validator → sand-boxer**

```
POST /v1/sandboxes
  profile: "profile.compose-e2e"
  inputs: { repo_ref, compose_bundle_ref }
  ttl: 2h
  consumer: { actor: atm, harness: wise-validator, run_id }
```

wise-validator owns `e2e.yml` semantics, health check definitions, test commands,
and pass/fail interpretation. sand-boxer delivers an environment; wise-validator
runs the validation story **on top**.

**snuggle-inventor → sand-boxer**

```
POST /v1/sandboxes
  profile: "profile.build"
  setup_metadata: { instructions_ref, secret_refs }
  consumer: { actor: agt, harness: snuggle-inventor, job_id }
```

snuggle-inventor may attach Blitzy-style setup instructions as profile inputs.
sand-boxer resolves secrets at the provision boundary; generated code never
flows through sand-boxer APIs.

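
The three request shapes above share a `profile` plus a `consumer` envelope. As a minimal sketch of how a harness might assemble such a body, assuming the enumerations listed in the examples are exhaustive: `Consumer`, `build_create_request`, and the `correlation` field are hypothetical names for illustration, not part of any agreed contract.

```python
from dataclasses import dataclass, field
from typing import Any

# Enumerations taken from the request examples; treating them as closed sets is an assumption.
SCOPES = {"session", "agent", "shared"}
ACCESS = {"none", "ro", "rw"}
ACTORS = {"adm", "agt", "atm"}


@dataclass
class Consumer:
    actor: str       # adm | agt | atm
    harness: str     # e.g. "glas-harness"
    # Hypothetical holder for session_id, run_id, or job_id.
    correlation: dict[str, str] = field(default_factory=dict)


def build_create_request(profile: str, consumer: Consumer, **fields: Any) -> dict[str, Any]:
    """Assemble a body for POST /v1/sandboxes, rejecting values outside the listed enums."""
    if consumer.actor not in ACTORS:
        raise ValueError(f"unknown actor: {consumer.actor!r}")
    scope = fields.get("scope")
    if scope is not None and scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    access = (fields.get("workspace") or {}).get("access")
    if access is not None and access not in ACCESS:
        raise ValueError(f"unknown workspace access: {access!r}")
    return {
        "profile": profile,
        **fields,
        "consumer": {
            "actor": consumer.actor,
            "harness": consumer.harness,
            **consumer.correlation,
        },
    }


request = build_create_request(
    "profile.agent-dev",
    Consumer("agt", "glas-harness", {"session_id": "s-1"}),
    scope="session",
    workspace={"mode": "mirror", "access": "ro"},
)
```

Validating on the consumer side is only a convenience here; the API itself would still need to enforce the same rules.
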
### Migration from the-custodian

| Legacy | New owner |
|--------|-----------|
| `e2e-framework/` provision/teardown | sand-boxer `ext.compose-ssh` |
| `e2e-framework/` test run + report | wise-validator (calls sand-boxer) |
| Agent tool sandbox config | glas-harness (calls sand-boxer) |
| `infra/build-machines/` | sand-boxer `ext.vm-packer` |


---

## Meta-framework API (conceptual)

### Resources

| Resource | Description |
|----------|-------------|
| `Profile` | Named, versioned sandbox recipe (image, isolation, network, TTL, extension) |
| `Extension` | Backend adapter (self-hosted or SaaS) |
| `Host` | Registered placement target for self-hosted extensions |
| `Sandbox` | Running instance of a profile |
| `Snapshot` | Point-in-time workspace checkpoint (optional) |
| `Route` | Extension selection policy (cost, latency, capability) |
| `Meter` | Usage record for payments layer |

### Sandbox lifecycle states

```
requested → provisioning → ready → active → { expired | failed } → destroying → destroyed
```

All transitions emit State Hub events. `ready` means the reachability probe
succeeded.

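
The state line above can be sketched as a small transition table. This is a minimal illustration that reads the diagram literally (for example, `failed` is only reachable from `active`); a real implementation would probably need extra edges such as a failure during `provisioning`. The `SandboxLifecycle` class and the event shape are hypothetical, and `emit` stands in for whatever publishes State Hub events.

```python
from typing import Callable

# Allowed transitions, read literally from the lifecycle line (an assumption).
TRANSITIONS: dict[str, set[str]] = {
    "requested": {"provisioning"},
    "provisioning": {"ready"},
    "ready": {"active"},
    "active": {"expired", "failed"},
    "expired": {"destroying"},
    "failed": {"destroying"},
    "destroying": {"destroyed"},
    "destroyed": set(),
}


class SandboxLifecycle:
    """Tracks one sandbox's state and emits an event on every transition."""

    def __init__(self, sandbox_id: str, emit: Callable[[dict], None]) -> None:
        self.sandbox_id = sandbox_id
        self.state = "requested"
        self._emit = emit  # placeholder for a State Hub publisher

    def advance(self, new_state: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        event = {"sandbox_id": self.sandbox_id, "from": self.state, "to": new_state}
        self.state = new_state
        self._emit(event)


events: list[dict] = []
lifecycle = SandboxLifecycle("sbx-1", events.append)
for step in ("provisioning", "ready", "active", "expired", "destroying", "destroyed"):
    lifecycle.advance(step)
```
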
### Core operations

| Operation | Description |
|-----------|-------------|
| `create` | Provision from profile + inputs |
| `get` / `list` | Inspect status |
| `exec` | Run command in sandbox (optional; may be harness-owned) |
| `extend_ttl` | Explicit persistence extension |
| `snapshot` / `restore` | Checkpoint workspace |
| `recreate` | Destroy and reprovision from seed |
| `destroy` | Idempotent teardown |

Early versions may expose only `create`, `get`, `destroy`, and `recreate`;
harnesses can own `exec` via SSH/tunnel without sand-boxer proxying every command.

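
To make the minimal operation set concrete, here is an in-memory sketch of `create`, `get`, `destroy`, and `recreate`. It is not the intended service: the class name, the record shape, and the choice that `recreate` issues a new `sandbox_id` are all assumptions. The point it illustrates is idempotent teardown, where destroying an unknown or already-destroyed sandbox is not an error.

```python
import itertools
from typing import Optional


class InMemorySandboxes:
    """Toy stand-in for the sandbox API; nothing is actually provisioned."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._sandboxes: dict[str, dict] = {}

    def create(self, profile: str, inputs: Optional[dict] = None) -> dict:
        sandbox_id = f"sbx-{next(self._ids)}"
        record = {
            "sandbox_id": sandbox_id,
            "profile": profile,
            "inputs": dict(inputs or {}),
            "state": "ready",
        }
        self._sandboxes[sandbox_id] = record
        return record

    def get(self, sandbox_id: str) -> dict:
        return self._sandboxes[sandbox_id]  # raises KeyError if unknown

    def destroy(self, sandbox_id: str) -> dict:
        # Idempotent: repeated or unknown destroys report "destroyed" instead of failing.
        record = self._sandboxes.get(sandbox_id)
        if record is None:
            return {"sandbox_id": sandbox_id, "state": "destroyed"}
        record["state"] = "destroyed"
        return record

    def recreate(self, sandbox_id: str) -> dict:
        # Destroy, then reprovision from the same profile and inputs (the "seed").
        old = self.get(sandbox_id)
        self.destroy(sandbox_id)
        return self.create(old["profile"], old["inputs"])
```
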
### Profile schema (minimum)

```yaml
id: profile.compose-e2e
version: "1.0.0"
extension: ext.compose-ssh
isolation:
  level: container            # container | microvm | policy
network:
  default: deny
  egress: []                  # extension interprets
workspace:
  mode: remote-canonical      # mirror | remote-canonical
  access: rw
scope_default: session
ttl:
  default: 4h
  max: 24h
  idle_reap: null
resources:
  cpu: null
  memory_mb: null
setup:
  instructions: ""            # Blitzy-style natural language for extension bootstrap
  secret_refs: []             # resolved at provision; never in agent context
placement:
  prefer: [sandboxer01]
  fallback: [coulombcore]
reachability:
  tunnel: ops-bridge
  identity: ops-warden
metadata:
  cost_class: self-hosted     # self-hosted | saas-metered
  latency_class: standard
```

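
The `ttl` block implies that a requested TTL (such as the `ttl: 2h` in the wise-validator request) has to be reconciled with the profile's `default` and `max`. A minimal sketch follows, assuming durations are a whole number followed by one of `s`, `m`, `h`, `d`, and assuming over-long requests are capped at `max` and not rejected. Both are illustrative choices, and `parse_duration` and `resolve_ttl` are hypothetical helpers.

```python
import re
from typing import Optional

# Assumed duration grammar: "<integer><unit>", e.g. "4h" or "30m".
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_DURATION = re.compile(r"(\d+)([smhd])")


def parse_duration(text: str) -> int:
    """Convert a duration such as '4h' into seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def resolve_ttl(ttl: dict, requested: Optional[str] = None) -> int:
    """Effective TTL in seconds: the request if given, else the default, capped at max."""
    wanted = parse_duration(requested if requested is not None else ttl["default"])
    return min(wanted, parse_duration(ttl["max"]))


profile_ttl = {"default": "4h", "max": "24h", "idle_reap": None}
effective = resolve_ttl(profile_ttl, "2h")  # 7200 seconds
```
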
### Extension interface (contract)

Each extension implements:

```text
provision(profile, inputs, placement) → sandbox_handle
wait_ready(sandbox_handle) → reachability
teardown(sandbox_handle) → cleanup_report
snapshot?(sandbox_handle) → snapshot_id
restore?(snapshot_id) → sandbox_handle
estimate_cost?(profile, duration) → meter_quote
```

Extensions register in `registry/` with capability vectors (isolation level,
regions, GPU, persistence, pricing model).

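
One way to express this contract in code is an abstract base class where the three required methods are abstract and the `?`-marked ones default to "unsupported". This is a sketch only: the argument and return types (plain `dict` and `str`) are assumptions, and `FakeExtension` is a hypothetical test double, not one of the planned extensions.

```python
from abc import ABC, abstractmethod


class Extension(ABC):
    """Sketch of the extension contract; required methods are abstract."""

    @abstractmethod
    def provision(self, profile: dict, inputs: dict, placement: dict) -> str:
        """Return a sandbox handle."""

    @abstractmethod
    def wait_ready(self, handle: str) -> dict:
        """Block until reachable and return a reachability descriptor."""

    @abstractmethod
    def teardown(self, handle: str) -> dict:
        """Return a cleanup report."""

    # Optional operations: subclasses override only what they support.
    def snapshot(self, handle: str) -> str:
        raise NotImplementedError("snapshot not supported")

    def restore(self, snapshot_id: str) -> str:
        raise NotImplementedError("restore not supported")

    def estimate_cost(self, profile: dict, duration_s: int) -> dict:
        raise NotImplementedError("estimate_cost not supported")

    def supports(self, operation: str) -> bool:
        """True if the subclass overrides the named operation."""
        return getattr(type(self), operation) is not getattr(Extension, operation)


class FakeExtension(Extension):
    """Hypothetical in-memory extension for exercising the interface."""

    def provision(self, profile: dict, inputs: dict, placement: dict) -> str:
        return f"fake-{profile['id']}"

    def wait_ready(self, handle: str) -> dict:
        return {"handle": handle, "ssh": "localhost:2222"}  # made-up endpoint

    def teardown(self, handle: str) -> dict:
        return {"handle": handle, "removed": True}
```

A `supports` check like this could feed the capability vector an extension registers, so routing never selects an extension for an operation it does not implement.
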
**Bundled extensions (roadmap):**

| Priority | Extension | Type |
|----------|-----------|------|
| P0 | `ext.compose-ssh` | Self-hosted (e2e-framework lineage) |
| P1 | `ext.vm-packer` | Self-hosted (build-machines lineage) |
| P2 | `ext.daytona-self` | Self-hosted OSS |
| P3 | `ext.e2b`, `ext.modal`, `ext.daytona` | SaaS + payments |
| P4 | `ext.openshell` | Policy runtime wrapper |


---

## Payments layer

For SaaS extensions, sand-boxer provides an **integrated payments and metering
layer** analogous to OpenRouter credits:

| Concern | sand-boxer approach |
|---------|---------------------|
| Account credits | Org/workspace balance for sandbox consumption |
| Metering | Per-second, per-creation, and GPU surcharge, as quoted by each extension |
| Provider keys | BYOK optional; platform keys for convenience |
| Cost visibility | `estimate_cost` before create; actuals on destroy |
| Billing events | Export to fin-hub / external billing (consumer, not owner) |

Self-hosted extensions bill **infra cost only** (host allocation); there is no
SaaS meter.

Payments is a **facility inside sand-boxer**, not a general payment processor.
Domain billing authority remains elsewhere.

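
The metering row above names three components: a per-creation charge, a per-second charge, and a GPU surcharge. A minimal sketch of how an `estimate_cost` quote could be turned into a figure is shown below, assuming the components simply add and the GPU surcharge scales with GPU count and duration. `MeterQuote` and every rate used here are made up for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class MeterQuote:
    """Hypothetical quote shape returned by an extension's estimate_cost."""
    per_creation: Decimal
    per_second: Decimal
    gpu_per_second: Decimal = Decimal("0")


def estimate_cost(quote: MeterQuote, duration_s: int, gpus: int = 0) -> Decimal:
    """Sum the three metering components for a planned run."""
    if duration_s < 0 or gpus < 0:
        raise ValueError("duration_s and gpus must be non-negative")
    running = quote.per_second * duration_s
    gpu_surcharge = quote.gpu_per_second * duration_s * gpus
    return quote.per_creation + running + gpu_surcharge


# Illustrative rates only; no provider's real pricing is implied.
quote = MeterQuote(
    per_creation=Decimal("0.01"),
    per_second=Decimal("0.0001"),
    gpu_per_second=Decimal("0.0005"),
)
two_hours_one_gpu = estimate_cost(quote, duration_s=7200, gpus=1)
```

`Decimal` is used so that credit balances do not accumulate binary floating-point rounding error.
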

---

## Routing policy (OpenRouter-style)

When multiple extensions satisfy a profile capability:

```yaml
route:
  strategy: prefer-self-hosted | lowest-cost | lowest-latency | explicit
  fallback: [ext.compose-ssh, ext.daytona]
  constraints:
    max_cost_per_hour: null
    require_isolation: microvm
    region: eu
```

Default Coulomb posture: **prefer-self-hosted** on sandboxer01; SaaS for burst
or capability gaps (GPU, desktop) once extensions exist.

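
The routing block can be read as "filter by constraints, then rank by strategy". The sketch below follows that reading under several assumptions: `require_isolation` is an exact match, `prefer-self-hosted` breaks ties by cost, and the `fallback` list is left out. `Candidate`, `select_extension`, and all capability values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical capability vector for one registered extension."""
    name: str
    self_hosted: bool
    isolation: str
    regions: frozenset
    cost_per_hour: float
    latency_ms: float


def select_extension(
    candidates: list,
    strategy: str,
    *,
    require_isolation: Optional[str] = None,
    region: Optional[str] = None,
    max_cost_per_hour: Optional[float] = None,
    explicit: Optional[str] = None,
) -> Optional[Candidate]:
    """Filter candidates by constraints, then pick one according to the strategy."""
    eligible = [
        c for c in candidates
        if (require_isolation is None or c.isolation == require_isolation)
        and (region is None or region in c.regions)
        and (max_cost_per_hour is None or c.cost_per_hour <= max_cost_per_hour)
    ]
    if not eligible:
        return None
    if strategy == "explicit":
        return next((c for c in eligible if c.name == explicit), None)
    if strategy == "lowest-cost":
        return min(eligible, key=lambda c: c.cost_per_hour)
    if strategy == "lowest-latency":
        return min(eligible, key=lambda c: c.latency_ms)
    if strategy == "prefer-self-hosted":
        # False sorts before True, so self-hosted candidates win; cost breaks ties.
        return min(eligible, key=lambda c: (not c.self_hosted, c.cost_per_hour))
    raise ValueError(f"unknown strategy: {strategy!r}")


# Made-up capability values, for illustration only.
CANDIDATES = [
    Candidate("ext.compose-ssh", True, "container", frozenset({"eu"}), 0.02, 40.0),
    Candidate("ext.e2b", False, "microvm", frozenset({"eu", "us"}), 0.10, 120.0),
    Candidate("ext.daytona", False, "container", frozenset({"us"}), 0.05, 90.0),
]
chosen = select_extension(CANDIDATES, "prefer-self-hosted", region="eu")
```
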

---

## Security posture (documented limits)

sand-boxer commits to:

1. Default-deny network unless profile explicitly allows egress
2. Secrets resolved at provision boundary via ops-warden / secret refs
3. Blast-radius isolation on dedicated hosts away from Railiance01 production
4. Observable lifecycle and attributable actors (`adm` / `agt` / `atm`)
5. Honest documentation: **allowed tool paths can be abused by compromised agents**

sand-boxer does **not** commit to intent-aware egress filtering in v1.


---

## Phased maturity

| Phase | Deliverable |
|-------|-------------|
| **0** | Charter, research, profile schema, `ext.compose-ssh` design |
| **1** | Unified API + self-hosted compose-ssh + State Hub registration |
| **2** | Extension SDK + vm-packer + registry entries + routing |
| **3** | SaaS extensions + payments layer |
| **4** | Snapshot/restore + checkpoint profiles |
| **5** | Coulomb-native runtime ("best of brands") informed by extension ops data |

Phase 5 is explicitly **later**: learn from routing, billing, failure modes, and
latency before building an owned microVM runtime or control plane.


---

## Open questions (for workplans)

1. Does `exec` live in sand-boxer API or only in glas-harness via SSH?
2. Payments: integrate with existing fin-hub or standalone credits first?
3. Profile authorship: repo-local YAML vs hub-managed catalog?
4. wise-validator: fork e2e-framework reporter or new contract from day one?

These belong in SAND-WP-0002+ design workplans, not INTENT.md.