Implement SAND-WP-0008: host telemetry and self-canary

Add profile.sandbox-canary, HostSnapshot/inventory/stale schemas, SSH
collectors, before/after provision deltas, telemetry export to State Hub
and local JSON, default `sandboxer create` self-deploy, inspect/reap-stale
CLI, runbook, and CoulombCore verification (26 tests pass).
2026-06-23 19:53:51 +02:00
parent 582c1dd3c6
commit c0a9261cdc
22 changed files with 1047 additions and 26 deletions

docs/host-telemetry.md Normal file

@@ -0,0 +1,95 @@
# Host telemetry contract
Version 0.1 (SAND-WP-0008). Extends the Host resource defined in
`docs/meta-framework.md` with read-only observability. sand-boxer collects and
exports telemetry; it does not own long-term metrics storage.
---
## Types
### HostSnapshot
Point-in-time host metrics collected over SSH within 10 s, using only commands that do not require root.
| Field | Description |
|-------|-------------|
| `load_1m`, `load_5m`, `load_15m` | From `/proc/loadavg` |
| `cpu_count` | Logical CPUs |
| `mem_total_mb`, `mem_available_mb` | From `free -m` |
| `disk_root_used_pct`, `disk_root_avail_gb` | Root filesystem |
| `running_containers` | All running containers (podman/docker) |
| `sandbox_containers` | Containers with `sbx-*` compose project label |
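
A minimal sketch of how the load and memory fields above could be parsed from raw command output. The function names are hypothetical and not taken from the sand-boxer source, and the column positions assume the `free -m` layout of recent procps releases, where `available` is the last column of the `Mem:` row.

```python
from __future__ import annotations


def parse_loadavg(text: str) -> tuple[float, float, float]:
    """Return (load_1m, load_5m, load_15m) from the contents of /proc/loadavg."""
    # The first three whitespace-separated fields are the load averages.
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_free_m(text: str) -> tuple[int, int]:
    """Return (mem_total_mb, mem_available_mb) from `free -m` output."""
    for line in text.splitlines():
        if line.startswith("Mem:"):
            cols = line.split()
            # Assumption: total is the first value, available the last one.
            return int(cols[1]), int(cols[-1])
    raise ValueError("no 'Mem:' row in free output")
```

Keeping the parsers free of SSH concerns lets them be tested against captured output without a live host.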
### SandboxInventory
Known sandbox artifacts on a host.
| Entry type | Source |
|------------|--------|
| `directory` | `{base_dir}/{sandbox_id}` |
| `compose_project` | `sbx-*` or legacy `e2e-*` compose labels |
Each entry has `id`, `path`, `age_hours`, and `profile_hint` (inferred from the project name).
### StaleCandidate
| Kind | Meaning | Suggested action |
|------|---------|------------------|
| `orphan_dir` | Dir on host, not in local store | `reap` |
| `orphan_compose` | Compose project on host, not in store | `reap` |
| `zombie_record` | Store record not `destroyed`, missing on host | `inspect` |
| `aged_dir` | Dir older than threshold | `reap` |
Actions: `reap`, `inspect`, `ignore`. Automatic reaping requires `--apply` on the CLI.
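
The classification in the table can be sketched as a pure function over the host inventory and the local store. This is an illustration under stated assumptions, not the project's implementation: the function name, argument shapes, and the rule that an orphaned directory is reported as `orphan_dir` rather than also as `aged_dir` are all hypothetical.

```python
from __future__ import annotations


def classify_stale(
    host_dirs: dict[str, float],  # sandbox id -> age of its directory in hours
    host_projects: set[str],      # sandbox ids with a compose project on the host
    store: dict[str, str],        # sandbox id -> status in the local store
    max_age_hours: float,
) -> list[dict]:
    candidates = []
    for sid, age in sorted(host_dirs.items()):
        if sid not in store:
            # Directory exists on the host but the store has no record of it.
            candidates.append({"id": sid, "kind": "orphan_dir", "action": "reap"})
        elif age > max_age_hours:
            # Known directory that has outlived the age threshold.
            candidates.append({"id": sid, "kind": "aged_dir", "action": "reap"})
    for sid in sorted(host_projects - set(store)):
        candidates.append({"id": sid, "kind": "orphan_compose", "action": "reap"})
    on_host = set(host_dirs) | host_projects
    for sid, status in sorted(store.items()):
        # Live record with no directory or compose project left on the host.
        if status != "destroyed" and sid not in on_host:
            candidates.append({"id": sid, "kind": "zombie_record", "action": "inspect"})
    return candidates
```

The function only suggests actions; consistent with the `--apply` requirement, nothing is removed here.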
### ProvisionDelta
A pair of `before` and `after` HostSnapshots with computed deltas:
- `load_1m_delta`, `mem_available_mb_delta`, `running_containers_delta`
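
As a rough sketch, the deltas could be computed as `after - before` for each tracked field. The sign convention, the rounding of the load delta, and the `provision_delta` helper itself are assumptions made for illustration; snapshots are modelled as plain dicts.

```python
def provision_delta(before: dict, after: dict) -> dict:
    """Combine two snapshot dicts into a ProvisionDelta-shaped dict."""
    delta = {"before": before, "after": after}
    for name in ("load_1m", "mem_available_mb", "running_containers"):
        # Assumption: positive values mean the metric rose during provisioning.
        delta[f"{name}_delta"] = round(after[name] - before[name], 2)
    return delta
```

With this convention a provision that starts two containers and consumes 512 MB of memory yields `running_containers_delta` of 2 and `mem_available_mb_delta` of -512.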
### IntrospectionReport
Bundled canary output, attached to `SandboxStatus.telemetry` once the sandbox is `ready`:
```json
{
  "schema_version": "0.1",
  "host": "92.205.130.254",
  "sandbox_id": "abc12345",
  "profile_id": "profile.sandbox-canary",
  "collected_at": "2026-06-23T...",
  "provision_delta": { "before": {}, "after": {}, "load_1m_delta": 0.1 },
  "inventory": { "entries": [], "host": "..." },
  "stale_candidates": []
}
```
---
## Privacy and retention
- Reports contain no secret paths, env files, or full `docker inspect` dumps
- Telemetry JSON retained locally under `~/.local/share/sandboxer/telemetry/`
- State Hub events include the report in `detail`; the same redaction rules apply
- Operators may set `SANDBOXER_NO_STATE_HUB=1` to skip remote emission
---
## Export sinks
| Sink | Status |
|------|--------|
| State Hub `progress/` | Implemented |
| Local JSON artifact | Implemented |
| `TelemetrySink` protocol | Stub for artifact-store / Prometheus / ClickHouse |
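
One way the `TelemetrySink` stub could look, together with a local JSON sink. This is a sketch only: the `emit` method name, the `LocalJsonSink` class, the `export` helper, and the one-file-per-sandbox naming are hypothetical and not confirmed by the source.

```python
import json
from pathlib import Path
from typing import Iterable, Protocol


class TelemetrySink(Protocol):
    """Anything that can accept an IntrospectionReport-shaped dict."""

    def emit(self, report: dict) -> None: ...


class LocalJsonSink:
    """Writes each report as a JSON file inside a target directory."""

    def __init__(self, directory: Path) -> None:
        self.directory = directory

    def emit(self, report: dict) -> None:
        self.directory.mkdir(parents=True, exist_ok=True)
        # Assumption: one file per sandbox, named after its id.
        path = self.directory / f"{report['sandbox_id']}.json"
        path.write_text(json.dumps(report, indent=2, sort_keys=True))


def export(report: dict, sinks: Iterable[TelemetrySink]) -> None:
    # Fan the same report out to every configured sink.
    for sink in sinks:
        sink.emit(report)
```

Because `Protocol` uses structural typing, a later Prometheus or ClickHouse sink would only need an `emit` method and no shared base class.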
---
## Profile trigger
Telemetry collection runs when:
- Profile id is `profile.sandbox-canary`, or
- `profile.metadata.observability` is `canary`
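
The trigger rule above amounts to a small predicate. The sketch below assumes profile metadata is available as a plain dict; the function name `should_collect_telemetry` is hypothetical.

```python
from typing import Optional

CANARY_PROFILE_ID = "profile.sandbox-canary"


def should_collect_telemetry(profile_id: str, metadata: Optional[dict] = None) -> bool:
    """Return True when either trigger condition holds."""
    if profile_id == CANARY_PROFILE_ID:
        return True
    # Any other profile opts in through its observability metadata.
    return (metadata or {}).get("observability") == "canary"
```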