sand-boxer/docs/runbooks/profile-sandbox-canary.md
tegwick c0a9261cdc Implement SAND-WP-0008: host telemetry and self-canary
Add profile.sandbox-canary, HostSnapshot/inventory/stale schemas, SSH
collectors, before/after provision deltas, telemetry export to State Hub
and local JSON, default `sandboxer create` self-deploy, inspect/reap-stale
CLI, runbook, and CoulombCore verification (26 tests pass).
2026-06-23 19:53:51 +02:00

Runbook: profile.sandbox-canary

Self-deploy sand-boxer to verify host health and return telemetry.

Quick start

export SANDBOXER_HOST=coulombcore
export SANDBOXER_COMPOSE_CMD=podman-compose   # CoulombCore

sandboxer create          # no args — canary self-deploy + IntrospectionReport

What you get on ready

SandboxStatus.telemetry contains:

  • provision_delta — host load/memory/container counts before vs after
  • inventory — sandbox dirs and compose projects on host
  • stale_candidates — orphans and aged sandboxes (dry-run recommendations)

Human summary prints to stderr:

Telemetry: load Δ +0.12, mem avail Δ -48 MB, stale candidates: 0

Artifacts: ~/.local/share/sandboxer/telemetry/<sandbox_id>.json
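
As a rough illustration of how the exported artifact could be read back and turned into the summary line above, here is a minimal Python sketch. The top-level keys provision_delta and stale_candidates come from this runbook; the nested keys load_delta and mem_avail_delta_mb are assumptions made for the example and may not match the real schema.

```python
import json
from pathlib import Path
from typing import Optional

# Default artifact directory, as stated in this runbook.
DEFAULT_DIR = Path.home() / ".local/share/sandboxer/telemetry"


def load_telemetry(sandbox_id: str, base: Optional[Path] = None) -> dict:
    """Load <base>/<sandbox_id>.json and return it as a dict."""
    directory = base if base is not None else DEFAULT_DIR
    with open(directory / f"{sandbox_id}.json", encoding="utf-8") as fh:
        return json.load(fh)


def summarize(telemetry: dict) -> str:
    """Render a one-line summary in the style of the stderr output.

    The keys 'load_delta' and 'mem_avail_delta_mb' are hypothetical;
    adjust them to the actual telemetry schema.
    """
    delta = telemetry.get("provision_delta", {})
    load = delta.get("load_delta", 0.0)          # assumed key name
    mem = delta.get("mem_avail_delta_mb", 0)     # assumed key name
    stale = telemetry.get("stale_candidates", [])
    return (
        f"Telemetry: load Δ {load:+.2f}, "
        f"mem avail Δ {mem:+.0f} MB, "
        f"stale candidates: {len(stale)}"
    )
```

Calling summarize(load_telemetry("<sandbox_id>")) would then print a line comparable to the one sandboxer writes to stderr, provided the assumed key names are mapped to the real ones.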

Inspect without creating

sandboxer inspect host
sandboxer inspect stale --older-than 24
sandboxer reap-stale --dry-run
sandboxer reap-stale --apply --older-than 48   # destructive — review dry-run first
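
The dry-run-then-apply pattern above can be sketched in Python as follows. This is not sandboxer's implementation: the record shape (id, created_at) is hypothetical, and treating the --older-than value as hours is an assumption, since this runbook does not state the unit.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


def select_stale(
    sandboxes: Iterable[dict],
    older_than_hours: float,   # assumption: the CLI value is in hours
    now: Optional[datetime] = None,
) -> List[str]:
    """Return ids of sandboxes created before the cutoff.

    Each record is a hypothetical dict with an 'id' string and a
    timezone-aware 'created_at' datetime.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=older_than_hours)
    return [s["id"] for s in sandboxes if s["created_at"] < cutoff]


def plan_reap(
    sandboxes: Iterable[dict],
    older_than_hours: float,
    apply: bool = False,
    now: Optional[datetime] = None,
) -> List[str]:
    """Describe what a reap would do; nothing is deleted here.

    With apply=False (the default) the plan only reports candidates,
    mirroring the dry-run recommendation in the runbook.
    """
    verb = "remove" if apply else "would remove"
    return [f"{verb} {sid}" for sid in select_stale(sandboxes, older_than_hours, now)]
```

Defaulting apply to False keeps the destructive path opt-in, so a reviewer sees the candidate list before anything is removed.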

Destroy

sandboxer destroy <sandbox_id>

Telemetry from destroy includes destroy_delta (load and memory recovery after teardown).

Verification checklist (SAND-WP-0008-T10)

  1. sandboxer create → ready + telemetry.provision_delta
  2. sandboxer inspect host → metrics consistent with create report
  3. Fake stale dir: ssh host 'mkdir -p /tmp/sandboxer/fake99' → appears in inspect stale
  4. sandboxer destroy → destroy_delta shows load/mem recovery

Optimization notes (activity-core follow-up)

  • Schedule periodic sandboxer create canary on sandboxer01
  • Reap policy: --older-than 24 with human-approved --apply
  • Disk pressure alerts when disk_root_avail_gb < threshold
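
The disk pressure rule in the last note could look roughly like the Python sketch below. The field name disk_root_avail_gb is taken from this runbook; the flat snapshot dict and the 10 GB threshold in the usage note are placeholders, since no threshold is specified here.

```python
def disk_pressure(snapshot: dict, threshold_gb: float) -> bool:
    """Return True when free space on the root disk is below the threshold.

    'snapshot' is assumed to be a flat dict that carries the
    disk_root_avail_gb field; the real HostSnapshot layout may differ.
    """
    avail = snapshot.get("disk_root_avail_gb")
    if avail is None:
        # Fail loudly rather than silently reporting "no pressure".
        raise KeyError("snapshot has no disk_root_avail_gb field")
    return float(avail) < threshold_gb
```

For example, disk_pressure({"disk_root_avail_gb": 8.5}, threshold_gb=10.0) reports pressure, while a snapshot with 50 GB free does not. A missing field raises instead of returning False so that a broken collector is not mistaken for a healthy host.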