sand-boxer/SCOPE.md
tegwick 20e25726d7 Add SAND-WP-0008: host telemetry and self-canary introspection
Workplan for default sand-boxer self-deploy, before/after host metrics,
stale sandbox inventory, and telemetry export for centralized analysis.
2026-06-23 14:25:05 +02:00

| domain | repo | updated |
| --- | --- | --- |
| infotech | sand-boxer | 2026-06-22 |

SCOPE

This file helps you quickly understand what this repository is about, when it is relevant, and when it is not. It is intentionally lightweight and may be incomplete until implementation lands.


One-liner

Sandbox provisioning and profile catalog for Custodian: isolated execution environments where agents and automations can develop, build, and test without depending on the workstation filesystem or widening its blast radius.


Core Idea

sand-boxer is the execution isolation and provisioning service for agentic development and related workloads in the Custodian ecosystem. It answers where work can run safely, how isolation is enforced, how sandboxes phone home, and what happened during their lifecycle.

A sandbox profile is a named, versioned recipe (compose stack, VM image, future cluster worker) with documented inputs, outputs, host placement, TTL, and teardown guarantees. Operators and agents request a profile; sand-boxer provisions an isolated environment on a registered host, exposes reachability through ops-bridge (without owning tunnels), registers lifecycle state with State Hub, and tears down on expiry or explicit release.
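
As an illustration only, the profile described above could be modelled as a small immutable record. Nothing like this exists in the repo yet; the class names, field names, and the example profile `e2e-compose` are all assumptions made for this sketch, not the catalog's actual schema.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from enum import Enum


class ProfileKind(Enum):
    """Recipe types named in the text; the enum itself is hypothetical."""
    COMPOSE_STACK = "compose-stack"
    VM_IMAGE = "vm-image"
    CLUSTER_WORKER = "cluster-worker"  # listed as a future pattern


@dataclass(frozen=True)
class SandboxProfile:
    """A named, versioned recipe. All field names are assumptions."""
    name: str
    version: str
    kind: ProfileKind
    ttl: timedelta
    allowed_hosts: tuple[str, ...]
    inputs: dict[str, str] = field(default_factory=dict)   # input name -> description
    outputs: dict[str, str] = field(default_factory=dict)  # output name -> description
    teardown_guarantee: str = "idempotent cleanup on expiry or explicit release"

    @property
    def ref(self) -> str:
        """Stable 'name@version' reference a requester could pass around."""
        return f"{self.name}@{self.version}"


# Hypothetical catalog entry for a compose-based e2e stack.
profile = SandboxProfile(
    name="e2e-compose",
    version="0.1.0",
    kind=ProfileKind.COMPOSE_STACK,
    ttl=timedelta(hours=2),
    allowed_hosts=("sandboxer01", "CoulombCore"),
    inputs={"repo": "repository under test"},
    outputs={"ssh_endpoint": "reachability exposed through ops-bridge"},
)
print(profile.ref)
```

Freezing the record reflects the idea that a published profile version should not change after release; a new version would be a new entry.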

The repo consolidates patterns that are currently split across two parts of the-custodian: e2e-framework/ (SSH + compose sandboxes for cross-repo e2e) and infra/build-machines/ (Packer VMs with build-agent self-registration).


In Scope

  • Sandbox profile catalog — versioned definitions for compose-based e2e stacks, VM images, and future worker patterns; inputs, outputs, and teardown contracts documented per profile
  • Provision / wait / teardown lifecycle — TTL, idempotent cleanup, port and network conventions, observable states (create → ready → active → expired → destroyed)
  • Host placement policy — which profiles run on sandboxer01, CoulombCore interim, or other registered hosts; blast-radius isolation from Railiance01 production
  • CLI and/or API — request, inspect, and release sandboxes for operators (adm), agents (agt), and automations (atm)
  • State Hub registration contract — extend the build-agent self-register pattern to generic sandbox identities and lifecycle events
  • Capability registry entries in registry/ for federation via reuse-surface (e.g. capability.execution.sandbox-provision)
  • Runbooks, templates, and tests — Packer/compose bundles, operator runbooks, and automated tests for profile lifecycle
  • Migration path — documented cutover from the-custodian/e2e-framework and infra/build-machines callers to sand-boxer profiles
  • Agent and workplan metadata: INTENT.md, AGENTS.md, workplans/, and State Hub progress/decision logging per ADR-001
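
The observable states listed above (create → ready → active → expired → destroyed) can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the transition table is a guess that adds early teardown (explicit release) to the linear path, and `SandboxState`, `ALLOWED`, and `advance` are hypothetical names, not part of any sand-boxer API.

```python
from enum import Enum


class SandboxState(Enum):
    CREATE = "create"
    READY = "ready"
    ACTIVE = "active"
    EXPIRED = "expired"
    DESTROYED = "destroyed"


# Assumed transitions: the linear path from the scope text, plus a direct
# jump to DESTROYED to model explicit release before the TTL runs out.
ALLOWED = {
    SandboxState.CREATE: {SandboxState.READY, SandboxState.DESTROYED},
    SandboxState.READY: {SandboxState.ACTIVE, SandboxState.EXPIRED, SandboxState.DESTROYED},
    SandboxState.ACTIVE: {SandboxState.EXPIRED, SandboxState.DESTROYED},
    SandboxState.EXPIRED: {SandboxState.DESTROYED},
    SandboxState.DESTROYED: set(),
}


def advance(current: SandboxState, target: SandboxState) -> SandboxState:
    """Return the new state, or raise ValueError for an illegal transition."""
    # Idempotent cleanup: destroying an already destroyed sandbox is a no-op.
    if current is SandboxState.DESTROYED and target is SandboxState.DESTROYED:
        return current
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


# Walk the full path named in the scope text.
state = SandboxState.CREATE
for nxt in (SandboxState.READY, SandboxState.ACTIVE,
            SandboxState.EXPIRED, SandboxState.DESTROYED):
    state = advance(state, nxt)
print(state.value)
```

Treating a repeated destroy as a no-op is one way to express the "idempotent cleanup" requirement: a retried teardown call cannot fail just because an earlier attempt already succeeded.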

Out of Scope

| Concern | Owner |
| --- | --- |
| Workstream, task, and progress state | state-hub |
| Cron and event-triggered orchestration | activity-core |
| SSH reverse tunnels and tunnel health | ops-bridge |
| SSH certificate issuance | ops-warden |
| Canon, charters, agent instruction canon | the-custodian |
| Capability index federation hub | reuse-surface |
| Production service deployment on Railiance01 | railiance-apps / domain repos |
| Railiance01 cluster operations | railiance-cluster / railiance-infra |
| ADR-001 workplan ↔ DB reconciliation | state-hub (consistency_check.py) |

sand-boxer may consume connectivity (ops-bridge) and certificates (ops-warden); it must not duplicate or subsume those authorities.

Additional boundaries:

  • Scheduling — activity-core decides when work runs; sand-boxer decides where isolated execution happens
  • Workstation as runtime — the laptop/WSL anchor is interim control plane, not the target execution surface
  • Irreversible operational decisions — host provisioning, production cutovers, and CA policy changes require human approval

Relevant When

  • An agent or automation needs an isolated environment for coding, building, or testing without laptop filesystem dependence
  • Cross-repo e2e tests need a remote compose sandbox with guaranteed teardown
  • A build or verification workload should run on dedicated hardware (sandboxer01) rather than Railiance01 production or the workstation
  • activity-core or CI needs a bounded execution venue with State Hub visibility
  • Planning reuse of sandbox provisioning across repos (registry-first discovery)

Not Relevant When

  • All work runs locally with acceptable blast radius
  • Only tunnel connectivity is needed (use ops-bridge directly)
  • Only task/workstream state is needed (use state-hub)
  • Only scheduling or rule evaluation is needed (use activity-core)
  • Deploying or operating production services on Railiance01

Current State

  • Status: bootstrap — repo registered with State Hub; charter written; implementation not started
  • Implementation: ~0% — no CLI, API, profiles, provisioner, or tests in tree
  • Docs: INTENT.md (charter, 2026-06-21); README.md (one-liner); AGENTS.md and .custodian-brief.md (State Hub integration, generated)
  • Registry: scaffold present (registry/indexes/capabilities.yaml empty; registry/capabilities/ placeholder); domain in index still helix_forge from scaffold — needs alignment to infotech
  • Workplans: SAND-WP-0001 finished; SAND-WP-0002 active; SAND-WP-0008 proposed (host telemetry / self-canary)
  • Lineage (external, not yet migrated): the-custodian/e2e-framework/ (CUST-WP-0028, completed) and infra/build-machines/ (CUST-WP-0032)

What Is Possible Now

  • Read the charter (INTENT.md) and integration instructions (AGENTS.md)
  • Track bootstrap tasks via workplans/SAND-WP-0001-statehub-bootstrap.md
  • Log progress and decisions to State Hub when the hub is reachable
  • Use interim sandbox execution via the-custodian directly:
    • make e2e REPO=<repo> (e2e-framework on railiance01/CoulombCore)
    • infra/build-machines/ Packer VMs with build-agent registration

Nothing in this repo provisions or manages sandboxes yet.


What Is Not Possible Yet

  • Request a sandbox through sand-boxer CLI or API
  • Select a named, versioned profile from this repo's catalog
  • Register capability.execution.sandbox-provision (index entry absent)
  • Automatic lifecycle registration of generic sandbox identities in State Hub
  • Host placement on sandboxer01 via sand-boxer policy (host may not exist yet)
  • activity-core or agents invoking sand-boxer without workstation repo paths
  • Local install/test/lint/build commands documented for this repo (no package layout yet)

How It Fits

flowchart LR
  AC[activity-core] -->|when| SB[sand-boxer]
  AGT[agents / atm] -->|request sandbox| SB
  SB -->|provision / teardown| HOST[sandboxer01 / interim host]
  SB -->|lifecycle events| SH[state-hub]
  SB -->|reachability| OB[ops-bridge]
  SB -->|SSH identity| OW[ops-warden]
  RS[reuse-surface] -->|federate| REG[registry/]
  TC[the-custodian e2e + build-machines] -.->|migrate from| SB

  • Upstream dependencies: ops-bridge (tunnels), ops-warden (certs, optional), State Hub (registration API), registered sandbox hosts (SSH + Docker/Packer)
  • Downstream consumers: LLM agents, activity-core instructions, CI hooks, cross-repo e2e callers migrating off the-custodian
  • Often used with: activity-core (orchestration), state-hub (visibility), reuse-surface (capability discovery)
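
To make the "lifecycle events" edge toward state-hub more concrete, here is one possible shape for an event record. The State Hub registration contract is not defined in this document, so every field name below, the `lifecycle_event` helper, and the sample identifiers are assumptions. The sketch only builds the payload and does not call any real API.

```python
import json
from datetime import datetime, timezone

# Actor types taken from the scope text: operator, LLM agent, automation.
ACTOR_TYPES = {"adm", "agt", "atm"}


def lifecycle_event(sandbox_id, profile_ref, state, host, actor, at=None):
    """Build a JSON-serialisable lifecycle event (hypothetical schema)."""
    if actor not in ACTOR_TYPES:
        raise ValueError(f"unknown actor type: {actor!r}")
    at = at or datetime.now(timezone.utc)
    return {
        "sandbox_id": sandbox_id,   # hypothetical identifier format
        "profile": profile_ref,     # e.g. "name@version"
        "state": state,             # one of the observable lifecycle states
        "host": host,
        "actor": actor,
        "at": at.isoformat(),
    }


event = lifecycle_event(
    "sbx-0001", "e2e-compose@0.1.0", "ready", "sandboxer01", "agt",
    at=datetime(2026, 6, 23, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(event, indent=2))
```

Keeping payload construction separate from transport would let the same record be logged locally when the hub is unreachable and sent later.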

Terminology

  • Profile — named, versioned sandbox type with provision/teardown contract
  • Sandbox — a running isolated environment instance of a profile
  • Host placement — policy mapping profiles to sandboxer01, CoulombCore, etc.
  • TTL — time-to-live; sandboxes are disposable by default
  • Phone home — reachability and registration via ops-bridge + State Hub
  • Actor types (consumers): adm (operator), agt (LLM agent), atm (automation)
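
The host placement and TTL terms above can be sketched together. This is an assumption-laden illustration: the policy table, the profile names `e2e-compose` and `build-vm`, and the functions `place` and `is_expired` are hypothetical, and the real policy format is still undecided.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy table: profile name -> hosts in preference order.
PLACEMENT = {
    "e2e-compose": ["sandboxer01", "CoulombCore"],
    "build-vm": ["sandboxer01"],
}
# Production is kept out of the sandbox execution surface.
FORBIDDEN_HOSTS = {"Railiance01"}


def place(profile_name, registered_hosts):
    """Pick the first preferred host that is registered and not forbidden."""
    for host in PLACEMENT.get(profile_name, []):
        if host in registered_hosts and host not in FORBIDDEN_HOSTS:
            return host
    raise LookupError(f"no eligible host for profile {profile_name!r}")


def is_expired(created_at, ttl, now=None):
    """True once the sandbox has outlived its time-to-live."""
    now = now or datetime.now(timezone.utc)
    return now >= created_at + ttl


# sandboxer01 may not exist yet, so the interim host is chosen instead.
print(place("e2e-compose", {"CoulombCore"}))

created = datetime(2026, 6, 23, 12, 0, tzinfo=timezone.utc)
print(is_expired(created, timedelta(hours=2), now=created + timedelta(hours=3)))
```

An ordered preference list keeps the interim fallback to CoulombCore explicit, so removing it later is a one-line policy change.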

  • the-custodian — current home of e2e-framework and build-machines; governance canon; sand-boxer extracts reusable execution platform from here
  • ops-bridge — SSH reverse tunnels; sand-boxer orchestrates reachability, does not run tunnel daemons
  • ops-warden — SSH CA and certificate issuance
  • state-hub — workstream/task state and sandbox lifecycle visibility
  • activity-core — schedules work; may request sandboxes as execution venue
  • reuse-surface — federates registry/ capability entries
  • railiance-cluster / railiance-apps — production layer; explicitly not sandbox execution surface

Provided Capabilities

Planned — not yet registered in registry/indexes/capabilities.yaml.

type: execution
title: Sandbox provisioning
description: Isolated execution environments for agentic development, e2e testing, and bounded automations — profile-based provision, TTL teardown, and State Hub lifecycle registration.
keywords: [sandbox, isolation, provision, e2e, agentic, execution, profile]

Target registry id: capability.execution.sandbox-provision (or equivalent per reuse-surface naming).


Getting Oriented

  • Start with: INTENT.md (meta-framework charter)
  • Research: research/ (landscape, reference systems, design synthesis)
  • Agent instructions: AGENTS.md (State Hub session protocol)
  • Offline brief: .custodian-brief.md
  • Workplans: workplans/ (bootstrap: SAND-WP-0001)
  • Registry authoring: registry/README.md
  • Lineage reference (external): the-custodian/e2e-framework/RUNBOOK.md, the-custodian/infra/build-machines/README.md