
tegwick f33cff5363 docs: charter meta-framework vision, research, and SAND-WP-0002

Rewrite INTENT.md as the sand-boxer meta-framework charter (OpenRouter-style
sandbox API, extensions, payments, Coulomb sibling boundaries). Add research
under research/, update SCOPE.md, bootstrap workplans SAND-WP-0001/0002, and
State Hub integration files from the bootstrap pass.

2026-06-22 21:32:32 +02:00


Reference frameworks and platforms

Deep dives on systems sand-boxer should learn from — especially OpenClaw, Hermes Agent, Blitzy, and OpenShell — plus hosted platforms as extension targets.

OpenClaw

What it is: Personal AI assistant with optional tool sandboxing. Docs: https://docs.openclaw.ai/gateway/sandboxing

Role in the stack

OpenClaw is an agent harness (gateway, channels, skills, memory). Sandboxing is an optional configuration on tool execution, not the product core. sand-boxer draws the same boundary against glas-harness.

Sandbox architecture

What gets sandboxed: exec, read, write, edit, apply_patch, process, optional sandboxed browser. Gateway stays on host.

Backends:

| Backend | Where | Workspace model |
| --- | --- | --- |
| `docker` | Local container | Bind-mount or copy; default `network: "none"` |
| `ssh` | Remote SSH host | Remote-canonical: seed once, exec remotely |
| `openshell` | OpenShell-managed | `mirror` (local canonical) or `remote` (remote canonical) |

Scope: agent (default) | session | shared — controls container count.

Mode: off | non-main | all — when sandboxing applies.

Workspace access: none | ro | rw — what tools can see.
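The backend, scope, mode, and workspace-access vocabulary above can be sketched as a typed profile record. This is a minimal sketch for a hypothetical sand-boxer profile, not OpenClaw's actual config schema: `SandboxProfile`, `parse_profile`, `container_key`, and the `workspaceAccess` key name are assumptions, and the mapping from scope to container identity is one plausible reading of "controls container count".

```python
from dataclasses import dataclass
from enum import Enum


class Backend(str, Enum):
    DOCKER = "docker"
    SSH = "ssh"
    OPENSHELL = "openshell"


class Scope(str, Enum):
    AGENT = "agent"  # documented default
    SESSION = "session"
    SHARED = "shared"


class Mode(str, Enum):
    OFF = "off"
    NON_MAIN = "non-main"
    ALL = "all"


class WorkspaceAccess(str, Enum):
    NONE = "none"
    RO = "ro"
    RW = "rw"


@dataclass(frozen=True)
class SandboxProfile:
    """Hypothetical sand-boxer profile record; not OpenClaw's config schema."""
    backend: Backend
    mode: Mode
    workspace_access: WorkspaceAccess
    scope: Scope = Scope.AGENT


def parse_profile(raw: dict) -> SandboxProfile:
    """Build a profile from a plain dict; unknown enum values raise ValueError."""
    return SandboxProfile(
        backend=Backend(raw["backend"]),
        mode=Mode(raw["mode"]),
        workspace_access=WorkspaceAccess(raw["workspaceAccess"]),
        scope=Scope(raw.get("scope", "agent")),
    )


def container_key(profile: SandboxProfile, agent_id: str, session_id: str) -> str:
    """Assumed reading of scope: one container per agent, per session, or one shared."""
    if profile.scope is Scope.AGENT:
        return f"agent:{agent_id}"
    if profile.scope is Scope.SESSION:
        return f"session:{session_id}"
    return "shared"
```

Using enums means a typo such as `mode: "sometimes"` fails at parse time rather than silently disabling sandboxing.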

Security patterns worth copying

- Default Docker network `none`
- Bind-mount blocklist: `docker.sock`, `/etc`, `~/.ssh`, `~/.aws`, credential roots
- Symlink-aware path validation before bind approval
- `tools.elevated` as explicit sandbox bypass (audited escape hatch)
- Honest disclaimer: reduces blast radius, not perfect boundary
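The blocklist and symlink-aware validation can be combined in a single check that resolves the requested path before comparing it. The following is a rough sketch under stated assumptions, not OpenClaw's implementation: `blocked_roots` and `is_bind_allowed` are hypothetical names, the blocklist is deliberately short, and rejecting parents of blocked roots is an extra precaution added here.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import List, Optional


def blocked_roots(home: Path) -> List[Path]:
    """Illustrative blocklist; a real one would cover more credential roots."""
    return [
        Path("/etc"),
        Path("/var/run/docker.sock"),
        Path("/run/docker.sock"),
        home / ".ssh",
        home / ".aws",
    ]


def is_bind_allowed(requested: str, home: Optional[str] = None) -> bool:
    """Return False if the resolved path touches a blocked root.

    os.path.realpath follows symlinks, so a link inside the workspace that
    points at ~/.ssh is judged by its target, not by where the link lives.
    """
    home_path = Path(home) if home else Path.home()
    real = Path(os.path.realpath(requested))
    for root in blocked_roots(home_path):
        root_real = Path(os.path.realpath(root))
        # The request is a blocked root or lives inside one.
        if real == root_real or root_real in real.parents:
            return False
        # The request is a parent of a blocked root, which would expose it.
        if real in root_real.parents:
            return False
    return True
```

A check like this runs at approval time only; a path that is swapped for a symlink afterwards is outside what it can catch.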

sand-boxer lessons

- Backend / scope / workspaceAccess vocabulary is proven; adopt it in the profile schema
- SSH remote-canonical matches Custodian e2e-framework evolution path
- `mirror` vs `remote` workspace modes belong in the meta-framework API
- OpenClaw integrates OpenShell as an extension, which validates the extension-delegation model

Hermes Agent

What it is: Agent harness from Nous Research with multi-backend terminal execution. Repo: https://github.com/NousResearch/hermes-agent

Terminal backends (six)

| Backend | Isolation | Persistence |
| --- | --- | --- |
| `local` | None | - |
| `docker` | Cap-drop ALL, pids-limit, tmpfs | Single long-lived labeled container |
| `ssh` | Network boundary | Persistent remote shell |
| `modal` | Cloud VM | Filesystem snapshots |
| `daytona` | Cloud container | Stop/resume |
| `singularity` | HPC namespaces | Writable overlay |

Docker backend highlights

- One container per task, reused across sessions and Hermes process restarts
- Labels: `hermes-agent=1`, `hermes-task-id`, `hermes-profile`
- `docker_persist_across_processes: true` (default): container survives process exit
- Resource limits: CPU, memory, disk, `lifetime_seconds` idle reaper
- `docker_forward_env`: secrets from host `.env`, not config YAML
- Parallel subagents share container unless per-task image override
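Labeled reuse can be illustrated as "look up by label, create only on a miss". The sketch below is not Hermes's code: the function names, the default limits, and the `sleep infinity` keep-alive command are assumptions. Only the label keys and the hardening ideas (cap-drop ALL, pids limit, tmpfs) come from the notes above, and the flags used are standard `docker run` and `docker ps` options.

```python
import subprocess
from typing import Callable, List


def _run(args: List[str]) -> str:
    """Run a command and return its stdout; raises on non-zero exit."""
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


def find_args(task_id: str) -> List[str]:
    """List IDs of running containers that carry this task's labels."""
    return ["docker", "ps", "-q",
            "--filter", "label=hermes-agent=1",
            "--filter", f"label=hermes-task-id={task_id}"]


def run_args(image: str, task_id: str, profile: str,
             cpus: str = "1", memory: str = "1g", pids_limit: int = 256) -> List[str]:
    """Build a hardened, labeled `docker run` command (limits are placeholders)."""
    return ["docker", "run", "-d",
            "--cap-drop", "ALL",
            "--pids-limit", str(pids_limit),
            "--tmpfs", "/tmp",
            "--cpus", cpus, "--memory", memory,
            "--label", "hermes-agent=1",
            "--label", f"hermes-task-id={task_id}",
            "--label", f"hermes-profile={profile}",
            image, "sleep", "infinity"]


def ensure_container(image: str, task_id: str, profile: str,
                     run: Callable[[List[str]], str] = _run) -> str:
    """Reuse the task's container if one is running, else start a new one."""
    existing = run(find_args(task_id)).split()
    if existing:
        return existing[0]
    return run(run_args(image, task_id, profile)).strip()
```

Because the lookup key lives on the container rather than in process memory, a restarted harness process finds the same container again, which is the property the persistence setting describes.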

sand-boxer lessons

- Labeled reuse beats cold provision per tool call for agent coding efficiency
- Resource limits and idle reaper are profile-level concerns
- Modal/Daytona as extension backends: Hermes consumes, does not own
- Credential forwarding policy belongs in extension contract, not agent config

NVIDIA OpenShell + NemoClaw (Hermes deployment)

OpenShell: Policy runtime for agent sandboxes (Landlock, seccomp, OPA egress).

NemoClaw: Reference stack deploying Hermes inside OpenShell.

Three-layer model (industry pattern)

| Layer | Component | Responsibility |
| --- | --- | --- |
| Model | LLM provider | Reasoning |
| Harness | Hermes | Skills, memory, bridges, scheduling |
| Runtime | OpenShell | Filesystem/network policy, credential brokering |

sand-boxer maps to runtime only. glas-harness maps to harness.

Policy model

Declarative YAML: allowed hosts, ports, HTTP methods, binary-scoped rules (e.g. only curl may reach api.github.com). Credentials injected at egress proxy — agent never sees Slack/Outlook tokens.
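A binary-scoped egress rule of the kind described above can be modeled as a default-deny match on host, port, method, and calling binary. This is only an illustrative sketch: `EgressRule` and `is_allowed` are hypothetical names, and the field layout is an assumption rather than OpenShell's actual policy schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class EgressRule:
    """One allow rule; the fields are assumptions, not OpenShell's schema."""
    host: str
    port: int
    methods: FrozenSet[str]
    binaries: FrozenSet[str] = frozenset()  # empty set means any binary


def is_allowed(rules: Iterable[EgressRule], binary: str,
               host: str, port: int, method: str) -> bool:
    """Default deny: a request passes only if some rule matches every field."""
    for rule in rules:
        if rule.host != host or rule.port != port:
            continue
        if method.upper() not in rule.methods:
            continue
        if rule.binaries and binary not in rule.binaries:
            continue
        return True
    return False


# Example from the text: only curl may reach api.github.com.
GITHUB_ONLY_CURL = [
    EgressRule("api.github.com", 443, frozenset({"GET", "POST"}), frozenset({"curl"})),
]
```

A check like this evaluates where a request goes and which binary sent it. It says nothing about what the payload contains, so an allowed destination remains a possible exfiltration path.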

Snapshot / restore

NemoClaw ships snapshot.sh / restore.sh for agent state (skills, memories, sessions) across redeploys. Credential filter excludes secrets from tarballs.
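The credential filter idea can be sketched with Python's standard `tarfile` module in place of the shell scripts NemoClaw ships. Everything specific here is an assumption for illustration: the `SECRET_PATTERNS` list, the `snapshot` function name, and the `state/` archive prefix are not taken from NemoClaw.

```python
import fnmatch
import tarfile
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical filename patterns treated as credentials.
SECRET_PATTERNS = (".env", "*.pem", "*.key", "credentials*", "*token*")


def _credential_filter(info: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
    """Return None to drop an entry; a dropped directory is skipped entirely."""
    name = PurePosixPath(info.name).name
    if any(fnmatch.fnmatchcase(name, pattern) for pattern in SECRET_PATTERNS):
        return None
    return info


def snapshot(state_dir: str, out_path: str) -> None:
    """Write a gzip tarball of state_dir, excluding entries that look like secrets."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(state_dir, arcname="state", filter=_credential_filter)
```

Name-based filtering only catches secrets stored in conventionally named files; a token pasted into a memory file would still be archived.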

Security research (Lasso, Apr 2026)

Lasso demonstrated exfiltration via policy-permitted paths (git PR, npm postinstall → Discord). The policies were enforced correctly; they do not evaluate intent.

sand-boxer lesson: OpenShell-class extensions should be offered; security runbooks must state limits of egress allowlisting.

Blitzy

What it is: AI-native code generation platform — not a sandbox runtime.

"Blitzy Sandbox" GitHub org

Public demo repos for Explore members. Not execution infrastructure.

Real isolation model: Environments

https://docs.blitzy.com/administration/environments

- Natural-language setup instructions (toolchain, build, run, test)
- Variables (plaintext) vs Secrets (encrypted, masked, never sent to AI)
- Multi-environment priority merge (base + project override)
- Validation in configured environment after code generation
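The priority merge and the variables-versus-secrets split can be sketched together. This is a hedged illustration rather than Blitzy's implementation: the `Environment` type, the rule that a higher `priority` wins, and the `prompt_context` and `masked_view` helpers are all assumptions. Secret values stand in for references that would be resolved elsewhere.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class Environment:
    name: str
    priority: int  # assumption: higher priority overrides lower
    variables: Dict[str, str] = field(default_factory=dict)
    secrets: Dict[str, str] = field(default_factory=dict)


def merge_environments(envs: Iterable[Environment]) -> Environment:
    """Apply environments from lowest to highest priority; later ones win."""
    merged = Environment(name="merged", priority=0)
    for env in sorted(envs, key=lambda e: e.priority):
        for key, value in env.variables.items():
            merged.secrets.pop(key, None)  # a key is never both kinds at once
            merged.variables[key] = value
        for key, value in env.secrets.items():
            merged.variables.pop(key, None)
            merged.secrets[key] = value
        merged.priority = env.priority
    return merged


def prompt_context(env: Environment) -> Dict[str, str]:
    """What may be shown to a model: plaintext variables only."""
    return dict(env.variables)


def masked_view(env: Environment) -> Dict[str, str]:
    """What may be shown in logs or UI: secret names with masked values."""
    view = dict(env.variables)
    view.update({key: "***" for key in env.secrets})
    return view
```

Removing a key from `variables` when an override declares it as a secret prevents a stale plaintext value from leaking through the merged result.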

sand-boxer lessons (environment metadata, not runtime)

| Blitzy pattern | sand-boxer mapping |
| --- | --- |
| Environment config | Profile `setup` metadata block |
| Secrets never to AI | `secret_refs` resolved at provision boundary |
| Setup instructions | Profile runbook for extension bootstrap |
| Human review gates | Out of scope (snuggle-inventor / PR workflow) |

Blitzy validates that describing how to boot an environment is as important as where it runs. sand-boxer profiles carry both.

Hosted platforms as extension targets

sand-boxer extensions may delegate to SaaS providers. Initial extension candidates:

| Extension id | Provider | Self-host alt | Payments |
| --- | --- | --- | --- |
| `ext.e2b` | E2B | - | Per-second SaaS |
| `ext.modal` | Modal | - | Per-second + GPU |
| `ext.daytona` | Daytona cloud | `ext.daytona-self` (OSS) | SaaS or infra cost |
| `ext.openshell` | - | OpenShell local/k3s | Infra cost |
| `ext.compose-ssh` | - | sandboxer01 / CoulombCore | Infra cost |
| `ext.vm-packer` | - | build-machines lineage | Infra cost |

ComputeSDK (https://github.com/computesdk/computesdk) is a useful reference for normalizing provider differences behind one client API.

OpenRouter analogy

| OpenRouter | sand-boxer |
| --- | --- |
| Unified LLM API | Unified sandbox API |
| Routes to OpenAI, Anthropic, … | Routes to E2B, Modal, self-hosted compose, … |
| API keys / credits / billing | Payments layer for SaaS consumption |
| Model metadata (context, price) | Profile metadata (isolation, cost, latency) |
| Fallback / routing policy | Host placement + extension fallback |

sand-boxer does not run inference; it runs isolation. The routing and payments patterns transfer directly.
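As a rough sketch of how the routing pattern might transfer, the function below picks the cheapest eligible extension from profile metadata and falls back to the next one when provisioning fails. `ExtensionInfo`, `ProvisionError`, `route`, and the isolation tier labels are hypothetical, and any metadata values passed in are placeholders rather than real provider figures.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


class ProvisionError(Exception):
    """Raised when an extension cannot provide a sandbox."""


@dataclass(frozen=True)
class ExtensionInfo:
    ext_id: str
    isolation: str          # hypothetical tier label, e.g. "tier-a"
    cost_per_second: float  # placeholder metadata
    latency_ms: int         # placeholder metadata


def route(extensions: Iterable[ExtensionInfo], isolation: str,
          provision: Callable[[str], str]) -> Tuple[str, str]:
    """Try eligible extensions cheapest-first; fall back on ProvisionError."""
    eligible = sorted(
        (ext for ext in extensions if ext.isolation == isolation),
        key=lambda ext: (ext.cost_per_second, ext.latency_ms),
    )
    errors: List[str] = []
    for ext in eligible:
        try:
            return ext.ext_id, provision(ext.ext_id)
        except ProvisionError as exc:
            errors.append(f"{ext.ext_id}: {exc}")
    if errors:
        raise ProvisionError("all eligible extensions failed: " + "; ".join(errors))
    raise ProvisionError(f"no extension offers isolation {isolation!r}")
```

Sorting on cost and then latency is one possible policy; a real router would likely make the ordering itself part of the profile.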

Anti-patterns to avoid

| Anti-pattern | Why |
| --- | --- |
| Rebuild OpenClaw/Hermes gateway in sand-boxer | glas-harness scope |
| Embed e2e test orchestration in provisioner | wise-validator scope |
| Generate code inside sandbox API | snuggle-inventor scope |
| Own SSH tunnels or CA | ops-bridge / ops-warden scope |
| Claim sandbox = safe from prompt injection | Research disproves |
