Reference frameworks and platforms
Deep dives on systems sand-boxer should learn from — especially OpenClaw, Hermes Agent, Blitzy, and OpenShell — plus hosted platforms as extension targets.
OpenClaw
What it is: Personal AI assistant with optional tool sandboxing. Docs: https://docs.openclaw.ai/gateway/sandboxing
Role in the stack
OpenClaw is an agent harness (gateway, channels, skills, memory). Sandboxing is optional configuration on tool execution — not the product core. This is the same boundary sand-boxer draws vs glas-harness.
Sandbox architecture
What gets sandboxed: exec, read, write, edit, apply_patch, process, optional sandboxed browser. Gateway stays on host.
Backends:
| Backend | Where | Workspace model |
|---|---|---|
| `docker` | Local container | Bind-mount or copy; default `network: "none"` |
| `ssh` | Remote SSH host | Remote-canonical: seed once, exec remotely |
| `openshell` | OpenShell-managed | `mirror` (local canonical) or `remote` (remote canonical) |
Scope: agent (default) | session | shared — controls container count.
Mode: off | non-main | all — when sandboxing applies.
Workspace access: none | ro | rw — what tools can see.
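The three knobs above translate naturally into a typed profile. The following is a minimal sketch of how sand-boxer might model them, not OpenClaw's actual schema: `SandboxProfile`, `container_key`, and `sandbox_applies` are hypothetical names, and the reading of `non-main` as "every session except the main one" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    AGENT = "agent"      # default per the notes above
    SESSION = "session"
    SHARED = "shared"


class Mode(Enum):
    OFF = "off"
    NON_MAIN = "non-main"
    ALL = "all"


class WorkspaceAccess(Enum):
    NONE = "none"
    RO = "ro"
    RW = "rw"


@dataclass(frozen=True)
class SandboxProfile:
    backend: str                       # "docker", "ssh", or "openshell"
    mode: Mode
    workspace_access: WorkspaceAccess
    scope: Scope = Scope.AGENT


def container_key(profile: SandboxProfile, agent_id: str, session_id: str) -> str:
    """Hypothetical helper: requests with the same key reuse one container."""
    if profile.scope is Scope.SHARED:
        return f"{profile.backend}:shared"
    if profile.scope is Scope.SESSION:
        return f"{profile.backend}:session:{session_id}"
    return f"{profile.backend}:agent:{agent_id}"


def sandbox_applies(mode: Mode, is_main_session: bool) -> bool:
    """Assumed semantics: "non-main" sandboxes everything but the main session."""
    if mode is Mode.OFF:
        return False
    if mode is Mode.ALL:
        return True
    return not is_main_session
```

Keeping scope as a key-derivation rule makes "controls container count" concrete: the number of live containers equals the number of distinct keys.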
Security patterns worth copying
- Default Docker network none
- Bind-mount blocklist: `docker.sock`, `/etc`, `~/.ssh`, `~/.aws`, credential roots
- Symlink-aware path validation before bind approval
- `tools.elevated` as explicit sandbox bypass (audited escape hatch)
- Honest disclaimer: reduces blast radius, not perfect boundary
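The blocklist and symlink check can be sketched together. This is an illustration of the pattern, not OpenClaw's implementation: `is_bind_allowed` and the two constants are hypothetical, and the blocklist covers only the entries named above. The key step is resolving symlinks before comparing, so a link pointing into a blocked root is rejected too.

```python
import os
from pathlib import Path

# Assumed blocklist, taken from the entries listed above.
BLOCKED_ROOTS = ("/etc", "~/.ssh", "~/.aws")
BLOCKED_NAMES = ("docker.sock",)


def _resolve(path: str) -> Path:
    # expanduser handles "~"; realpath follows symlinks, even for
    # paths whose final components do not exist yet.
    return Path(os.path.realpath(os.path.expanduser(path)))


def is_bind_allowed(source: str,
                    blocked_roots=BLOCKED_ROOTS,
                    blocked_names=BLOCKED_NAMES) -> bool:
    """Return False if the resolved source is, or lies under, a blocked root."""
    resolved = _resolve(source)
    if resolved.name in blocked_names:
        return False
    for root in blocked_roots:
        root_resolved = _resolve(root)
        if resolved == root_resolved or root_resolved in resolved.parents:
            return False
    return True
```

The sketch does not reject a bind source that is a parent of a blocked root (for example `/`), and it cannot close the race between validation and mount; a real validator would need to handle both.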
sand-boxer lessons
- Backend / scope / workspaceAccess vocabulary is proven — adopt in profile schema
- SSH remote-canonical matches Custodian e2e-framework evolution path
- mirror vs remote workspace modes belong in meta-framework API
- OpenClaw integrates OpenShell as extension — validates extension-delegation model
Hermes Agent
What it is: Agent harness from Nous Research with multi-backend terminal execution. Repo: https://github.com/NousResearch/hermes-agent
Terminal backends (six)
| Backend | Isolation | Persistence |
|---|---|---|
| `local` | None | - |
| `docker` | Cap-drop ALL, pids-limit, tmpfs | Single long-lived labeled container |
| `ssh` | Network boundary | Persistent remote shell |
| `modal` | Cloud VM | Filesystem snapshots |
| `daytona` | Cloud container | Stop/resume |
| `singularity` | HPC namespaces | Writable overlay |
Docker backend highlights
- One container per task, reused across sessions and Hermes process restarts
- Labels: `hermes-agent=1`, `hermes-task-id`, `hermes-profile`
- `docker_persist_across_processes: true` (default), so the container survives process exit
- Resource limits: CPU, memory, disk, `lifetime_seconds` idle reaper
- `docker_forward_env`: secrets come from the host `.env`, not config YAML
- Parallel subagents share container unless per-task image override
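Labeled reuse comes down to two Docker CLI invocations: look for an existing container by label, and create one with the same labels only if none is found. The sketch below builds the argument lists without running them. It is not Hermes code: the function names, the default resource values, and the `sleep infinity` keep-alive command are assumptions; the flags themselves are standard `docker run` and `docker ps` options.

```python
def docker_find_args(task_id: str) -> list[str]:
    """Arguments that list container IDs (running or stopped) for one task."""
    return [
        "docker", "ps", "--all", "--quiet",
        "--filter", "label=hermes-agent=1",
        "--filter", f"label=hermes-task-id={task_id}",
    ]


def docker_run_args(task_id: str, profile: str, image: str, *,
                    memory: str = "2g", cpus: str = "2",
                    pids_limit: int = 256) -> list[str]:
    """Arguments that start a hardened, labeled, long-lived container."""
    labels = {
        "hermes-agent": "1",
        "hermes-task-id": task_id,
        "hermes-profile": profile,
    }
    args = [
        "docker", "run", "--detach",
        "--cap-drop", "ALL",               # drop every Linux capability
        "--pids-limit", str(pids_limit),   # bound process count
        "--tmpfs", "/tmp",                 # scratch space in memory
        "--memory", memory,                # placeholder limits, tune per profile
        "--cpus", cpus,
    ]
    for key, value in labels.items():
        args += ["--label", f"{key}={value}"]
    # Keep-alive command is an assumption; tools would attach via `docker exec`.
    args += [image, "sleep", "infinity"]
    return args
```

A caller would pass `docker_find_args(...)` to `subprocess.run` first and fall back to `docker_run_args(...)` only when the output is empty, which is what lets a container outlive the process that created it.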
sand-boxer lessons
- Labeled reuse beats cold provision per tool call for agent coding efficiency
- Resource limits and idle reaper are profile-level concerns
- Modal/Daytona as extension backends — Hermes consumes, does not own
- Credential forwarding policy belongs in extension contract, not agent config
NVIDIA OpenShell + NemoClaw (Hermes deployment)
OpenShell: Policy runtime for agent sandboxes — Landlock, seccomp, OPA egress. NemoClaw: Reference stack deploying Hermes inside OpenShell.
Three-layer model (industry pattern)
| Layer | Component | Responsibility |
|---|---|---|
| Model | LLM provider | Reasoning |
| Harness | Hermes | Skills, memory, bridges, scheduling |
| Runtime | OpenShell | Filesystem/network policy, credential brokering |
sand-boxer maps to runtime only. glas-harness maps to harness.
Policy model
Declarative YAML: allowed hosts, ports, HTTP methods, binary-scoped rules (e.g. only curl may reach api.github.com). Credentials are injected at the egress proxy, so the agent never sees Slack/Outlook tokens.
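A binary-scoped egress rule can be modelled as a default-deny match on four fields. The sketch below is an assumption-laden illustration of that idea, not OpenShell's policy schema or its OPA rules: `EgressRule` and `is_allowed` are hypothetical names, and host matching is exact rather than wildcard-based.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    host: str
    port: int
    methods: frozenset            # allowed HTTP methods, upper-case
    binaries: frozenset = frozenset()  # empty means "any binary"


def is_allowed(rules, *, binary: str, host: str, port: int, method: str) -> bool:
    """Default deny: a request passes only if some rule matches every field."""
    for rule in rules:
        if (rule.host == host
                and rule.port == port
                and method.upper() in rule.methods
                and (not rule.binaries or binary in rule.binaries)):
            return True
    return False


# Example policy: only curl may issue GET requests to api.github.com.
GITHUB_READ_ONLY = [
    EgressRule(host="api.github.com", port=443,
               methods=frozenset({"GET"}), binaries=frozenset({"curl"})),
]
```

A match here says only that the request fits the policy's shape; it says nothing about what the request is for, which is the limit the security research below exposes.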
Snapshot / restore
NemoClaw ships snapshot.sh / restore.sh for agent state (skills, memories, sessions) across redeploys. Credential filter excludes secrets from tarballs.
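The credential filter idea carries over to any snapshot tool. NemoClaw's scripts are shell; the following is a Python sketch of the same pattern using the standard `tarfile` filter hook. The `SECRET_PATTERNS` list, the `state` archive prefix, and the function names are assumptions, not NemoClaw's actual exclusion rules.

```python
import tarfile
from fnmatch import fnmatch

# Assumed filename patterns that mark a file as credential material.
SECRET_PATTERNS = ("*.env", "*.pem", "*.key", "*credentials*", "*token*")


def drop_secrets(member: tarfile.TarInfo):
    """tarfile filter: returning None leaves the member out of the archive."""
    basename = member.name.rsplit("/", 1)[-1]
    if any(fnmatch(basename, pattern) for pattern in SECRET_PATTERNS):
        return None
    return member


def snapshot(state_dir: str, out_path: str) -> None:
    """Archive agent state under the 'state/' prefix, skipping secret files."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(state_dir, arcname="state", filter=drop_secrets)
```

Filename matching is a coarse heuristic: a secret pasted into a memory file passes straight through, so a filter like this complements keeping credentials out of agent state rather than replacing it.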
Security research (Lasso, Apr 2026)
Demonstrated exfiltration via policy-permitted paths (git PR, npm postinstall → Discord). Policies enforced correctly; intent not evaluated.
sand-boxer lesson: OpenShell-class extensions should be offered; security runbooks must state limits of egress allowlisting.
Blitzy
What it is: AI-native code generation platform — not a sandbox runtime.
"Blitzy Sandbox" GitHub org
Public demo repos for Explore members. Not execution infrastructure.
Real isolation model: Environments
https://docs.blitzy.com/administration/environments
- Natural-language setup instructions (toolchain, build, run, test)
- Variables (plaintext) vs Secrets (encrypted, masked, never sent to AI)
- Multi-environment priority merge (base + project override)
- Validation in configured environment after code generation
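The variable/secret split and the priority merge can be sketched in a few lines. The Blitzy docs are not quoted here, so the merge rule (later layers override earlier ones, key by key) and the `***` mask are assumptions, and `merge_environments` and `masked_view` are hypothetical names.

```python
def merge_environments(*layers: dict) -> dict:
    """Merge environment layers in priority order; later layers win per key."""
    merged = {"variables": {}, "secrets": {}}
    for layer in layers:
        merged["variables"].update(layer.get("variables", {}))
        merged["secrets"].update(layer.get("secrets", {}))
    return merged


def masked_view(env: dict) -> dict:
    """The only view safe to log or hand to a model: secret values are masked."""
    return {
        "variables": dict(env["variables"]),
        "secrets": {name: "***" for name in env["secrets"]},
    }
```

For example, merging a base layer with `PORT=3000` and a project layer with `PORT=8080` yields `PORT=8080`, while every other base variable is inherited unchanged.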
sand-boxer lessons (environment metadata, not runtime)
| Blitzy pattern | sand-boxer mapping |
|---|---|
| Environment config | Profile setup metadata block |
| Secrets never to AI | secret_refs resolved at provision boundary |
| Setup instructions | Profile runbook for extension bootstrap |
| Human review gates | Out of scope — snuggle-inventor / PR workflow |
Blitzy validates that describing how to boot an environment is as important as where it runs. sand-boxer profiles carry both.
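A profile that carries both the "how" and the "where" might look like the sketch below. The field names, `describe`, and `provision_env` are hypothetical, and the caller-supplied `resolve` function stands in for whatever secret store sand-boxer ends up using. The point is that `secret_refs` holds names only, and values appear solely in the dictionary built at provision time.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Profile:
    name: str
    extension: str           # where it runs, e.g. "ext.compose-ssh"
    setup_runbook: tuple     # how it boots: ordered setup steps
    secret_refs: tuple = ()  # secret names only, never values


def describe(profile: Profile) -> dict:
    """Metadata that is safe to log or show to an agent."""
    return asdict(profile)


def provision_env(profile: Profile, resolve) -> dict:
    """Resolve secret_refs at the provision boundary via `resolve(name)`."""
    env = {}
    for ref in profile.secret_refs:
        value = resolve(ref)
        if value is None:
            raise KeyError(f"unresolved secret ref: {ref}")
        env[ref] = value
    return env
```

Failing loudly on an unresolved reference keeps a half-configured sandbox from starting, which is cheaper to debug than a build that fails later for lack of a token.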
Hosted platforms as extension targets
sand-boxer extensions may delegate to SaaS providers. Initial extension candidates:
| Extension id | Provider | Self-host alt | Payments |
|---|---|---|---|
| `ext.e2b` | E2B | - | Per-second SaaS |
| `ext.modal` | Modal | - | Per-second + GPU |
| `ext.daytona` | Daytona cloud | `ext.daytona-self` (OSS) | SaaS or infra cost |
| `ext.openshell` | - | OpenShell local/k3s | Infra cost |
| `ext.compose-ssh` | - | sandboxer01 / CoulombCore | Infra cost |
| `ext.vm-packer` | - | build-machines lineage | Infra cost |
ComputeSDK (https://github.com/computesdk/computesdk) is a useful reference for normalizing provider differences behind one client API.
OpenRouter analogy
| OpenRouter | sand-boxer |
|---|---|
| Unified LLM API | Unified sandbox API |
| Routes to OpenAI, Anthropic, … | Routes to E2B, Modal, self-hosted compose, … |
| API keys / credits / billing | Payments layer for SaaS consumption |
| Model metadata (context, price) | Profile metadata (isolation, cost, latency) |
| Fallback / routing policy | Host placement + extension fallback |
sand-boxer does not run inference; it runs isolation. The routing and payments patterns transfer directly.
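The routing half of the analogy can be sketched as a filter-then-rank step over extension metadata. Nothing below is a defined sand-boxer API: `ExtensionInfo`, `route`, and the fields are hypothetical, and any isolation labels or prices fed into it would be placeholders rather than real provider data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionInfo:
    ext_id: str              # e.g. "ext.e2b"
    isolation: str           # free-form label, e.g. "container" or "vm"
    cost_per_hour: float     # placeholder unit; 0.0 for self-hosted infra
    available: bool = True


def route(candidates, *, required_isolation: str,
          max_cost_per_hour: float) -> list[str]:
    """Return eligible extension ids, cheapest first, as a fallback order."""
    eligible = [
        c for c in candidates
        if c.available
        and c.isolation == required_isolation
        and c.cost_per_hour <= max_cost_per_hour
    ]
    # sorted() is stable, so ties keep the caller's preference order.
    return [c.ext_id for c in sorted(eligible, key=lambda c: c.cost_per_hour)]
```

A provisioner would try the first id and move down the list on failure; an empty list means no extension satisfies the profile, which should surface as an error rather than a silent downgrade in isolation.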
Anti-patterns to avoid
| Anti-pattern | Why |
|---|---|
| Rebuild OpenClaw/Hermes gateway in sand-boxer | glas-harness scope |
| Embed e2e test orchestration in provisioner | wise-validator scope |
| Generate code inside sandbox API | snuggle-inventor scope |
| Own SSH tunnels or CA | ops-bridge / ops-warden scope |
| Claim sandbox = safe from prompt injection | Research disproves |
Related reading
- 01-agent-sandbox-landscape.md
- 03-meta-framework-synthesis.md
- `INTENT.md`: normative charter