# Agent sandbox landscape (2026)

Survey of modern sandbox infrastructure for agentic coding: isolation technologies, provider models, and industry convergence patterns relevant to sand-boxer.

## Market definition

**AI agent sandboxes** are isolated execution environments for running AI-generated or agent-requested code safely. They optimize for:

- Fast create / resume / teardown
- Programmatic lifecycle APIs
- Isolation from host and peer workloads
- Developer- and agent-friendly SDKs

This is distinct from general application hosting and from agent harnesses (memory, channels, tool orchestration).

## Provider landscape (summary)

| Platform | Model | Creation | Persist / checkpoint | Isolation | Notes |
|----------|-------|----------|----------------------|-----------|-------|
| **E2B** | Managed SaaS | ~150ms | Pause/resume, snapshots | Firecracker | Scale leader; template + sandbox API |
| **Daytona** | Managed + OSS | ~90ms | Snapshots, fork | Docker/Kata | Open-source self-host path |
| **Modal** | Serverless SaaS | Sub-second | Memory snapshots, volumes | gVisor | Strong GPU; code-defined runtime |
| **Blaxel** | Managed | Sub-25ms resume | Hibernate | microVM | Zero idle compute billing |
| **Vercel Sandbox** | Managed | ms | Snapshots, persistent default | Firecracker | Vercel ecosystem |
| **Cloudflare Sandbox SDK** | Edge | seconds / ms (isolates) | DO state | Containers / V8 | Workers-native |
| **AWS AgentCore** | Managed sessions | n/a | Session ≤8h | microVM | Hyperscaler bundling |
| **Google Agent Sandbox** | Managed preview | Sub-second | TTL ≤14d | Hardened containers | Gemini Enterprise layer |
| **OpenSandbox** | Self-hosted OSS | Pool pre-warm | Pause/resume, PVC | gVisor/Kata/Firecracker | K8s-scale; CNCF Landscape |
| **OpenShell** | Policy runtime | n/a | Long-lived sandboxes | Landlock/seccomp/OPA | Governance layer, not hosted platform |
| **Northflank** | BYOC + managed | ~200ms | Persistent | microVM/gVisor | VPC deployment |
| **Runloop** | Managed | ~100ms exec | Snapshot, branch | Custom hypervisor | SWE-bench / eval focus |
| **Sprites** | Managed | 1–2s | ~300ms checkpoints | Firecracker | Persistent-first |
| **ComputeSDK** | Abstraction | Varies | Varies | Varies | Multi-provider router (9 backends) |

Sources: [Ry Walker research (Jun 2026)](https://rywalker.com/research/ai-agent-sandboxes), provider docs, Modal/E2B marketing materials. Treat vendor claims as directional.

## Isolation technology spectrum

| Technology | Used by | Security level | Performance |
|------------|---------|----------------|-------------|
| **Firecracker** | E2B, Sprites, Vercel | Hardware-level microVM | Fast |
| **gVisor / Kata** | Modal, Northflank, OpenSandbox | Kernel-level | Very fast |
| **Hardened Docker** | Daytona, AIO Sandbox | Container-level | Fastest setup |
| **Landlock / seccomp / OPA** | OpenShell | Kernel policy | Native speed |
| **V8 isolates** | Cloudflare Worker Loader | Process-level | Milliseconds |

**Implication for sand-boxer:** profile metadata must declare `isolation_level` so consumers can reason about blast radius. Extensions map profiles to concrete runtimes; the meta-framework does not mandate one technology.

## Convergence trends (2025 → 2026)

### 1. Ephemeral vs persistent collapsed

Early market split (E2B = ephemeral, Sprites = persistent) has merged. Most platforms now offer:

- Persistent workspace by default or as first-class option
- Checkpoint / snapshot / hibernate for fast resume
- TTL and explicit teardown still expected for cost and security

**sand-boxer takeaway:** profiles should support `persistence: ephemeral | persistent | checkpoint` as a first-class dimension, not a backend detail.

### 2. Checkpointing is table stakes

Sub-second to low-second restore times are becoming baseline for agent coding (workspace state, installed deps, shell history; not always live PIDs).

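The two profile dimensions named in the takeaways above, `isolation_level` and `persistence`, could be declared as typed metadata. The following is a minimal sketch, not sand-boxer's actual schema: the class names, the `name`, `version`, and `ttl_seconds` fields, the `parse_profile` helper, and the enum identifiers are all assumptions made for illustration. Only the `persistence` values and the security-level labels come from this document.

```python
from dataclasses import dataclass
from enum import Enum


class IsolationLevel(Enum):
    # Values mirror the "Security level" column of the isolation table;
    # the identifiers themselves are hypothetical.
    MICROVM = "microvm"
    KERNEL = "kernel"
    CONTAINER = "container"
    KERNEL_POLICY = "kernel-policy"
    PROCESS = "process"


class Persistence(Enum):
    # The three values proposed in the persistence takeaway.
    EPHEMERAL = "ephemeral"
    PERSISTENT = "persistent"
    CHECKPOINT = "checkpoint"


@dataclass(frozen=True)
class Profile:
    name: str            # hypothetical field
    version: str         # hypothetical field
    isolation_level: IsolationLevel
    persistence: Persistence
    ttl_seconds: int     # hypothetical field; a TTL is still expected

    def __post_init__(self):
        # Reject profiles that would never expire.
        if self.ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be positive")


def parse_profile(raw: dict) -> Profile:
    """Build a Profile from plain metadata, rejecting unknown enum values."""
    return Profile(
        name=raw["name"],
        version=raw["version"],
        isolation_level=IsolationLevel(raw["isolation_level"]),
        persistence=Persistence(raw["persistence"]),
        ttl_seconds=int(raw["ttl_seconds"]),
    )
```

Under these assumptions, a profile that declares an unknown `persistence` or `isolation_level` value fails at parse time with a `ValueError`, so a consumer never sees a profile whose blast radius is undeclared.
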

**sand-boxer takeaway:** the lifecycle API needs `snapshot`, `restore`, and `fork` operations, even if early extensions implement only `recreate`.

### 3. Security stress-tests exposed limits

Research on AWS AgentCore and OpenShell/NemoClaw showed that **allowed egress paths** (git, npm, curl, node to allowlisted hosts) can be weaponized for exfiltration when agents are prompt-injected or tricked into installing malicious dependencies. Policy controls *destination*, not *intent*.

**sand-boxer takeaway:** document honestly that sandboxing is blast-radius control, not an agent-behavior guarantee. Default-deny network; per-profile egress allowlists; secrets injected at the boundary, never in the agent-visible workspace.

### 4. Hyperscaler bundling pressures independents

AWS, Google, Cloudflare, and Vercel all entered the category within one quarter. Independents compete on multi-cloud neutrality, price, isolation depth, or open-source self-host.

**sand-boxer takeaway:** OpenRouter-style routing across self-hosted and SaaS backends is a defensible Coulomb position: no single-vendor lock-in.

### 5. Abstraction layers emerging

ComputeSDK routes one TypeScript API across E2B, Modal, Daytona, Runloop, Cloudflare, Vercel, etc.: "Terraform for running other people's code."

**sand-boxer takeaway:** validate the meta-framework API against this pattern; extensions are providers; sand-boxer core is router + policy + billing + registry.

## Architecture patterns (industry)

### Gateway / harness vs runtime (universal split)

```
[Agent gateway / harness]  ──orchestrates──▶  [Sandbox runtime]
 (host or control plane)                          (isolated)
```

OpenClaw and Hermes both keep the gateway on the host and run **tool execution** in the sandbox. sand-boxer owns the runtime side only; **glas-harness** owns the gateway/harness side (see `03-meta-framework-synthesis.md`).

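The runtime side of this split, together with the lifecycle operations named earlier (`snapshot`, `restore`, `fork`, `recreate`), can be sketched as a provider contract that each extension implements. This is an illustrative sketch only: `SandboxProvider`, `InMemoryProvider`, `resume_or_recreate`, and every method signature are assumptions, not the sand-boxer API or any vendor SDK.

```python
from abc import ABC, abstractmethod
from typing import Optional


class SandboxProvider(ABC):
    """Hypothetical runtime-side contract for an extension."""

    @abstractmethod
    def provision(self, profile_name: str) -> str:
        """Create a sandbox from a named profile and return its id."""

    @abstractmethod
    def teardown(self, sandbox_id: str) -> None:
        """Destroy the sandbox."""

    def recreate(self, sandbox_id: str, profile_name: str) -> str:
        # Baseline every provider gets for free: destroy, then provision anew.
        self.teardown(sandbox_id)
        return self.provision(profile_name)

    # Optional operations; a provider overrides the ones it supports.
    def snapshot(self, sandbox_id: str) -> str:
        raise NotImplementedError("snapshot not supported")

    def restore(self, snapshot_id: str) -> str:
        raise NotImplementedError("restore not supported")

    def fork(self, sandbox_id: str) -> str:
        raise NotImplementedError("fork not supported")


class InMemoryProvider(SandboxProvider):
    """Toy provider that supports only provision/teardown (and so recreate)."""

    def __init__(self) -> None:
        self.live: dict = {}
        self._counter = 0

    def provision(self, profile_name: str) -> str:
        self._counter += 1
        sandbox_id = f"sbx-{self._counter}"
        self.live[sandbox_id] = profile_name
        return sandbox_id

    def teardown(self, sandbox_id: str) -> None:
        # Idempotent teardown is an assumption of this sketch.
        self.live.pop(sandbox_id, None)


def resume_or_recreate(
    provider: SandboxProvider,
    sandbox_id: str,
    profile_name: str,
    snapshot_id: Optional[str] = None,
) -> str:
    """Prefer restoring a snapshot; fall back to recreate when unsupported."""
    if snapshot_id is not None:
        try:
            return provider.restore(snapshot_id)
        except NotImplementedError:
            pass
    return provider.recreate(sandbox_id, profile_name)
```

In this design the harness calls only the contract, so a provider that lacks checkpointing degrades to `recreate` without the caller needing to know which backend it is talking to.
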
### Profile + backend + scope (OpenClaw / Hermes consensus)

| Dimension | Examples |
|-----------|----------|
| **Backend** | docker, ssh, openshell, modal, daytona, compose-ssh |
| **Scope** | per-agent, per-session, shared |
| **Workspace** | isolated, ro-mount, rw-mount; mirror vs remote-canonical |
| **Network** | default deny; optional allowlist |
| **TTL** | mandatory; idle reaper optional |

### Credential and reachability boundary

Best practice: credentials brokered at sandbox edge (OpenShell proxy, Blitzy secrets-never-to-AI, ops-warden certs). Agent process never holds production tokens for unrelated systems.

sand-boxer integrates **ops-bridge** (reachability) and **ops-warden** (identity) as consumers; it does not replace them.

## What sand-boxer should adopt vs defer

| Adopt now (meta-framework) | Defer (extension or phase 2) |
|----------------------------|------------------------------|
| Unified provision/teardown API | GPU profiles |
| Named versioned profiles | Browser sandbox profiles |
| Extension plugin interface | Intent-aware egress filtering |
| Self-hosted compose-ssh (e2e lineage) | Native Firecracker control plane |
| State Hub lifecycle registration | Multi-region routing |
| Default-deny network policy | Computer Use / desktop sandboxes |
| Payments routing for SaaS backends | Owned hyperscale sandbox fleet |

## Related reading

- [02-reference-frameworks.md](02-reference-frameworks.md): OpenClaw, Hermes, Blitzy
- [03-meta-framework-synthesis.md](03-meta-framework-synthesis.md): API and extensions