docs: charter meta-framework vision, research, and SAND-WP-0002
Rewrite INTENT.md as the sand-boxer meta-framework charter (OpenRouter-style sandbox API, extensions, payments, Coulomb sibling boundaries). Add research under research/, update SCOPE.md, bootstrap workplans SAND-WP-0001/0002, and State Hub integration files from the bootstrap pass.
204
research/02-reference-frameworks.md
Normal file
@@ -0,0 +1,204 @@
# Reference frameworks and platforms

Deep dives on systems sand-boxer should learn from (especially OpenClaw,
Hermes Agent, Blitzy, and OpenShell), plus hosted platforms as extension
targets.

---

## OpenClaw

**What it is:** Personal AI assistant with optional tool sandboxing.
**Docs:** https://docs.openclaw.ai/gateway/sandboxing

### Role in the stack

OpenClaw is an **agent harness** (gateway, channels, skills, memory). Sandboxing
is optional configuration on tool execution, not the product core. This is the
same boundary sand-boxer draws against **glas-harness**.

### Sandbox architecture

**What gets sandboxed:** `exec`, `read`, `write`, `edit`, `apply_patch`,
`process`, optional sandboxed browser. Gateway stays on host.

**Backends:**

| Backend | Where | Workspace model |
|---------|-------|-----------------|
| `docker` | Local container | Bind-mount or copy; default `network: "none"` |
| `ssh` | Remote SSH host | Remote-canonical: seed once, exec remotely |
| `openshell` | OpenShell-managed | `mirror` (local canonical) or `remote` (remote canonical) |

**Scope:** `agent` (default) | `session` | `shared`. Controls container count.

**Mode:** `off` | `non-main` | `all`. Controls when sandboxing applies.

**Workspace access:** `none` | `ro` | `rw`. Controls what tools can see.
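
The backend / scope / mode / workspace-access vocabulary above could be carried into a sand-boxer profile roughly as follows. This is a minimal sketch, not OpenClaw's configuration loader: `SandboxProfile`, `parse_profile`, and the `workspaceAccess` key spelling are hypothetical, and only the `agent` scope default comes from the text. The defaults chosen for mode and workspace access are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(str, Enum):
    DOCKER = "docker"
    SSH = "ssh"
    OPENSHELL = "openshell"


class Scope(str, Enum):
    AGENT = "agent"  # documented default
    SESSION = "session"
    SHARED = "shared"


class Mode(str, Enum):
    OFF = "off"
    NON_MAIN = "non-main"
    ALL = "all"


class WorkspaceAccess(str, Enum):
    NONE = "none"
    RO = "ro"
    RW = "rw"


@dataclass(frozen=True)
class SandboxProfile:
    """Hypothetical sand-boxer profile; field names are illustrative."""

    backend: Backend
    scope: Scope = Scope.AGENT
    mode: Mode = Mode.ALL  # assumed default, not stated in the source
    workspace_access: WorkspaceAccess = WorkspaceAccess.NONE  # assumed default


def parse_profile(raw: dict) -> SandboxProfile:
    """Validate a raw mapping; unknown values raise ValueError via the enums."""
    return SandboxProfile(
        backend=Backend(raw["backend"]),
        scope=Scope(raw.get("scope", Scope.AGENT.value)),
        mode=Mode(raw.get("mode", Mode.ALL.value)),
        workspace_access=WorkspaceAccess(
            raw.get("workspaceAccess", WorkspaceAccess.NONE.value)
        ),
    )
```

For example, `parse_profile({"backend": "ssh", "workspaceAccess": "ro"})` yields a profile with the `agent` scope, while an unknown backend string is rejected before any provisioning starts.
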

### Security patterns worth copying

- Default Docker network **none**
- Bind-mount blocklist: `docker.sock`, `/etc`, `~/.ssh`, `~/.aws`, credential roots
- Symlink-aware path validation before bind approval
- `tools.elevated` as explicit sandbox bypass (audited escape hatch)
- Honest disclaimer: sandboxing reduces blast radius; it is not a perfect boundary
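
The blocklist and symlink-aware validation can be illustrated with a short sketch. It is not OpenClaw's implementation: `blocked_roots` and `is_bind_allowed` are hypothetical names, the socket paths are assumed common locations, and the rule that a bind may not contain a blocked root (for example binding the whole home directory) is an added assumption. The sketch checks only the bind source itself, not links nested inside it.

```python
import os
from pathlib import Path


def blocked_roots(home: Path) -> list[Path]:
    """Paths that must never be exposed to a sandbox (illustrative list)."""
    return [
        Path("/var/run/docker.sock"),  # assumed socket locations
        Path("/run/docker.sock"),
        Path("/etc"),
        home / ".ssh",
        home / ".aws",
    ]


def is_bind_allowed(source: str, home: Path) -> bool:
    """Resolve symlinks first, then compare against the blocklist."""
    resolved = Path(os.path.realpath(source))
    for root in blocked_roots(home):
        root_resolved = Path(os.path.realpath(root))
        if resolved == root_resolved:
            return False
        if root_resolved in resolved.parents:
            return False  # source lives inside a blocked root
        if resolved in root_resolved.parents:
            return False  # source would expose a blocked root
    return True
```

Resolving with `os.path.realpath` before comparing is what makes the check symlink-aware: a link named `project/keys` that points at `~/.ssh` is rejected even though its literal path looks harmless.
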

### sand-boxer lessons

1. **Backend / scope / workspaceAccess** vocabulary is proven; adopt it in the profile schema
2. **SSH remote-canonical** matches the Custodian e2e-framework evolution path
3. **mirror vs remote** workspace modes belong in the meta-framework API
4. OpenClaw integrates OpenShell as an extension, which validates the extension-delegation model

---

## Hermes Agent

**What it is:** Agent harness from Nous Research with multi-backend terminal execution.
**Repo:** https://github.com/NousResearch/hermes-agent

### Terminal backends (six)

| Backend | Isolation | Persistence |
|---------|-----------|-------------|
| `local` | None | n/a |
| `docker` | Cap-drop ALL, pids-limit, tmpfs | Single long-lived labeled container |
| `ssh` | Network boundary | Persistent remote shell |
| `modal` | Cloud VM | Filesystem snapshots |
| `daytona` | Cloud container | Stop/resume |
| `singularity` | HPC namespaces | Writable overlay |

### Docker backend highlights

- **One container per task**, reused across sessions and Hermes process restarts
- Labels: `hermes-agent=1`, `hermes-task-id`, `hermes-profile`
- `docker_persist_across_processes: true` (default): container survives process exit
- Resource limits: CPU, memory, disk, `lifetime_seconds` idle reaper
- `docker_forward_env`: secrets come from host `.env`, not config YAML
- Parallel subagents **share** a container unless a per-task image override is set
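
To make the labeled-reuse pattern concrete, the sketch below builds the Docker CLI arguments such a backend might use. It is an illustration under assumptions, not Hermes code: the function names, the limit values, and the `sleep infinity` keep-alive command are hypothetical. The label keys come from the list above, and the flags are standard `docker run` / `docker ps` options.

```python
def docker_find_args(task_id: str) -> list[str]:
    """List any container, running or stopped, already labeled for this task."""
    return [
        "docker", "ps", "--all", "--quiet",
        "--filter", "label=hermes-agent=1",
        "--filter", f"label=hermes-task-id={task_id}",
    ]


def docker_run_args(
    task_id: str,
    profile: str,
    image: str,
    *,
    pids_limit: int = 256,  # illustrative limits, not Hermes defaults
    memory: str = "2g",
    cpus: float = 1.0,
) -> list[str]:
    """Start a hardened, labeled, long-lived container for one task."""
    return [
        "docker", "run", "--detach",
        "--label", "hermes-agent=1",
        "--label", f"hermes-task-id={task_id}",
        "--label", f"hermes-profile={profile}",
        "--cap-drop", "ALL",
        "--pids-limit", str(pids_limit),
        "--memory", memory,
        "--cpus", str(cpus),
        "--tmpfs", "/tmp",
        image, "sleep", "infinity",  # assumed keep-alive command
    ]


def plan_container(
    existing_ids: list[str], task_id: str, profile: str, image: str
) -> list[str]:
    """Reuse the first labeled container if one exists, else provision."""
    if existing_ids:
        return ["docker", "start", existing_ids[0]]
    return docker_run_args(task_id, profile, image)
```

A caller would run `docker_find_args(task_id)` first, pass the returned ids to `plan_container`, and execute the resulting command. Keeping the functions pure (they only build argument lists) makes the reuse decision easy to test without a Docker daemon.
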

### sand-boxer lessons

1. **Labeled reuse** beats cold provision per tool call for agent coding efficiency
2. Resource limits and idle reaper are profile-level concerns
3. Modal/Daytona as **extension backends**: Hermes consumes them, does not own them
4. Credential forwarding policy belongs in the extension contract, not agent config

---

## NVIDIA OpenShell + NemoClaw (Hermes deployment)

**OpenShell:** Policy runtime for agent sandboxes (Landlock, seccomp, OPA egress).
**NemoClaw:** Reference stack deploying Hermes inside OpenShell.

### Three-layer model (industry pattern)

| Layer | Component | Responsibility |
|-------|-----------|----------------|
| Model | LLM provider | Reasoning |
| Harness | Hermes | Skills, memory, bridges, scheduling |
| Runtime | OpenShell | Filesystem/network policy, credential brokering |

sand-boxer maps to **runtime** only. glas-harness maps to **harness**.

### Policy model

Declarative YAML: allowed hosts, ports, HTTP methods, **binary-scoped** rules
(e.g. only `curl` may reach `api.github.com`). Credentials are injected at the
egress proxy, so the agent never sees Slack/Outlook tokens.
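
A binary-scoped rule can be pictured as a default-deny match over (binary, host, port, method). The sketch below is an assumption-laden illustration of that idea only: `EgressRule` and `is_allowed` are hypothetical names, and the rule shape is not OpenShell's actual policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    """One allow rule; field names are illustrative, not OpenShell's schema."""

    binary: str
    host: str
    port: int
    methods: frozenset[str]


def is_allowed(
    rules: list[EgressRule], binary: str, host: str, port: int, method: str
) -> bool:
    """Default deny: a request passes only if some rule matches every field."""
    return any(
        rule.binary == binary
        and rule.host == host
        and rule.port == port
        and method.upper() in rule.methods
        for rule in rules
    )
```

With a single rule allowing `curl` to `GET` from `api.github.com` on port 443, the same request issued by any other binary is denied. The check says nothing about why the request is made, only whether it matches a rule.
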

### Snapshot / restore

NemoClaw ships `snapshot.sh` / `restore.sh` for agent state (skills, memories,
sessions) across redeploys. Credential filter excludes secrets from tarballs.

### Security research (Lasso, Apr 2026)

Demonstrated exfiltration via **policy-permitted** paths (git PR, npm postinstall
→ Discord). Policies were enforced correctly; intent was not evaluated.

**sand-boxer lesson:** OpenShell-class extensions should be offered; security
runbooks must state the limits of egress allowlisting.

---

## Blitzy

**What it is:** AI-native code generation platform, **not** a sandbox runtime.

### "Blitzy Sandbox" GitHub org

Public demo repos for Explore members. Not execution infrastructure.

### Real isolation model: Environments

https://docs.blitzy.com/administration/environments

- Natural-language **setup instructions** (toolchain, build, run, test)
- **Variables** (plaintext) vs **Secrets** (encrypted, masked, **never sent to AI**)
- Multi-environment priority merge (base + project override)
- Validation in configured environment after code generation

### sand-boxer lessons (environment metadata, not runtime)

| Blitzy pattern | sand-boxer mapping |
|----------------|-------------------|
| Environment config | Profile `setup` metadata block |
| Secrets never to AI | `secret_refs` resolved at provision boundary |
| Setup instructions | Profile runbook for extension bootstrap |
| Human review gates | Out of scope: **snuggle-inventor** / PR workflow |

Blitzy validates that **describing how to boot an environment** is as important
as **where it runs**. sand-boxer profiles carry both.
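
One way to read the `setup` / `secret_refs` mapping is sketched below. Everything beyond those two names is an assumption: `SetupBlock`, `agent_view`, `provision_env`, and the `vault://` reference format are hypothetical. The point is the split between what an agent may see (secret names only) and what the provisioner resolves.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class SetupBlock:
    """Hypothetical profile `setup` metadata."""

    instructions: str
    variables: dict[str, str] = field(default_factory=dict)
    secret_refs: dict[str, str] = field(default_factory=dict)  # env name -> ref


def agent_view(setup: SetupBlock) -> dict:
    """What may be shown to a model: secret names only, never refs or values."""
    return {
        "instructions": setup.instructions,
        "variables": dict(setup.variables),
        "secrets": sorted(setup.secret_refs),
    }


def provision_env(
    setup: SetupBlock, resolve: Callable[[str], str]
) -> dict[str, str]:
    """Resolve secret refs at the provision boundary, outside agent context."""
    env = dict(setup.variables)
    for name, ref in setup.secret_refs.items():
        env[name] = resolve(ref)
    return env
```

The `resolve` callable stands in for whatever secret store an extension uses; passing it in keeps the profile itself free of secret values.
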

---

## Hosted platforms as extension targets

sand-boxer extensions may delegate to SaaS providers. Initial extension candidates:

| Extension id | Provider | Self-host alt | Payments |
|--------------|----------|---------------|----------|
| `ext.e2b` | E2B | n/a | Per-second SaaS |
| `ext.modal` | Modal | n/a | Per-second + GPU |
| `ext.daytona` | Daytona cloud | `ext.daytona-self` (OSS) | SaaS or infra cost |
| `ext.openshell` | n/a | OpenShell local/k3s | Infra cost |
| `ext.compose-ssh` | n/a | sandboxer01 / CoulombCore | Infra cost |
| `ext.vm-packer` | n/a | build-machines lineage | Infra cost |

ComputeSDK (https://github.com/computesdk/computesdk) is a useful reference for
normalizing provider differences behind one client API.
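
Normalizing providers behind one client API usually means a small shared contract plus one adapter per extension. The sketch below shows that shape only. It does not reflect ComputeSDK's API or any provider SDK: `SandboxExtension`, `FakeExtension`, and `run_once` are hypothetical, and the fake adapter merely tracks ids in memory.

```python
from typing import Protocol


class SandboxExtension(Protocol):
    """Hypothetical contract every extension adapter would satisfy."""

    ext_id: str

    def create(self, image: str) -> str: ...
    def run(self, sandbox_id: str, command: list[str]) -> int: ...
    def destroy(self, sandbox_id: str) -> None: ...


class FakeExtension:
    """In-memory stand-in; a real adapter would call the provider here."""

    def __init__(self, ext_id: str) -> None:
        self.ext_id = ext_id
        self.live: set[str] = set()
        self._counter = 0

    def create(self, image: str) -> str:
        self._counter += 1
        sandbox_id = f"{self.ext_id}-{self._counter}"
        self.live.add(sandbox_id)
        return sandbox_id

    def run(self, sandbox_id: str, command: list[str]) -> int:
        if sandbox_id not in self.live:
            raise KeyError(sandbox_id)
        return 0  # pretend the command succeeded

    def destroy(self, sandbox_id: str) -> None:
        self.live.discard(sandbox_id)


def run_once(ext: SandboxExtension, image: str, command: list[str]) -> int:
    """Provider-agnostic caller: create, run, and always clean up."""
    sandbox_id = ext.create(image)
    try:
        return ext.run(sandbox_id, command)
    finally:
        ext.destroy(sandbox_id)
```

Callers such as `run_once` depend only on the contract, so swapping `ext.e2b` for `ext.compose-ssh` would change the adapter passed in, not the calling code.
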

---

## OpenRouter analogy

| OpenRouter | sand-boxer |
|------------|------------|
| Unified LLM API | Unified sandbox API |
| Routes to OpenAI, Anthropic, … | Routes to E2B, Modal, self-hosted compose, … |
| API keys / credits / billing | Payments layer for SaaS consumption |
| Model metadata (context, price) | Profile metadata (isolation, cost, latency) |
| Fallback / routing policy | Host placement + extension fallback |

sand-boxer does not run inference; it runs **isolation**. The routing and
payments patterns transfer directly.
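
The routing half of the analogy can be sketched as a filter-and-rank over extension metadata, returning a primary choice followed by fallbacks. This is a hypothetical illustration: `ExtensionInfo`, `route`, the isolation labels, and every cost figure in the usage example are placeholders, not real provider data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtensionInfo:
    """Routing metadata for one extension (all values are placeholders)."""

    ext_id: str
    isolation: str  # illustrative label such as "container" or "microvm"
    cost_per_hour: float
    available: bool = True


def route(
    candidates: list[ExtensionInfo], *, isolation: str, max_cost_per_hour: float
) -> list[str]:
    """Return eligible extension ids, cheapest first; later ids are fallbacks."""
    eligible = [
        c
        for c in candidates
        if c.available
        and c.isolation == isolation
        and c.cost_per_hour <= max_cost_per_hour
    ]
    return [c.ext_id for c in sorted(eligible, key=lambda c: c.cost_per_hour)]
```

A caller would try the first id and move down the list on provisioning failure, much as an LLM router falls back to another model provider. An empty list means no extension satisfies the profile and the request should fail fast.
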

---

## Anti-patterns to avoid

| Anti-pattern | Why |
|--------------|-----|
| Rebuild OpenClaw/Hermes gateway in sand-boxer | glas-harness scope |
| Embed e2e test orchestration in provisioner | wise-validator scope |
| Generate code inside sandbox API | snuggle-inventor scope |
| Own SSH tunnels or CA | ops-bridge / ops-warden scope |
| Claim sandbox = safe from prompt injection | Lasso research shows exfiltration via policy-permitted paths |

## Related reading

- [01-agent-sandbox-landscape.md](01-agent-sandbox-landscape.md)
- [03-meta-framework-synthesis.md](03-meta-framework-synthesis.md)
- `INTENT.md`: normative charter