diff --git a/INTENT.md b/INTENT.md
new file mode 100644
index 0000000..fb153ca
--- /dev/null
+++ b/INTENT.md
@@ -0,0 +1,328 @@
+---
+domain: infotech
+repo: wise-validator
+updated: "2026-06-22"
+---
+
+# INTENT
+
+> wise-validator is the Coulomb **meta-framework for validation** — e2e test
+> orchestration, health checks, and structured pass/fail reporting — while
+> **consuming** sand-boxer for isolated execution environments. This file is
+> preliminary; refine as the validation boundary is implemented.
+
+---
+
+## Why it exists
+
+Custodian repos need a consistent way to prove that real stacks work: compose
+services up, health endpoints ready, tests executed, results reported — without
+every domain reinventing remote SSH scripts, port polling, and State Hub events.
+
+Today that story is split awkwardly in `the-custodian/e2e-framework/`: provision
+and teardown live beside test orchestration and reporting in one package. sand-boxer
+is extracting the **establishment** half. wise-validator owns the **validation**
+half.
+
+The industry validates agent and service work through overlapping but incompatible
+paths: repo-local pytest, CI compose jobs, Blitzy-style environment validation,
+hosted sandbox test runners, and ad hoc health probes. Coulomb needs **one
+validation vocabulary** that any repo, agent harness, or automation can invoke —
+without owning sandboxes, agent gateways, or code generation.
+
+wise-validator exists to be that layer: **unified validation orchestration**
+with extensions for probe types, test runners, and reporters — environments
+requested from sand-boxer, not provisioned here.
+
+sand-boxer establishes the box. wise-validator **proves what runs inside it.**
+
+---
+
+## The governing principle
+
+wise-validator is the **validation orchestration service** — contracts, health
+semantics, test execution, pass/fail interpretation, and result emission.
+Nothing more.
+
+It answers:
+
+1. **What is being validated?** Repo `e2e/e2e.yml` (or successor contract),
+   health probe sets, validation profile selection.
+2. **When is the environment ready?** Health polling rules, timeouts, partial
+   failure semantics.
+3. **How are tests run?** Test command dispatch, output capture, exit code
+   interpretation.
+4. **Did it pass?** Aggregate pass/fail, duration, error surfaces.
+5. **Who needs to know?** State Hub progress, CI artifacts, activity-core hooks.
+6. **Where does it run?** By requesting sand-boxer — not by provisioning hosts.
+
+It must **not** become the sandbox provisioner, the agent harness, the code
+generator, the scheduler, work-state authority, tunnel/CA owner, or production
+deployer on Railiance01.
+
+---
+
+## Coulomb sibling boundaries
+
+### sand-boxer — sandbox establishment
+
+**sand-boxer owns:** Profiles, extensions, provision/teardown, placement,
+lifecycle registration.
+
+**wise-validator owns:** Requesting `profile.compose-e2e` (or successors),
+running validation **inside** the returned environment, releasing the sandbox
+when the validation workflow completes (via sand-boxer `destroy`).
+
+```text
+wise-validator                              sand-boxer
+──────────────                              ──────────
+resolve e2e.yml + validation profile   →    POST /v1/sandboxes
+health-wait + test_command in env      ←    sandbox_id + reachability
+POST result to State Hub / CI          →    destroy (per cleanup policy)
+```
+
+sand-boxer smoke tests may prove an environment exists; wise-validator owns
+whether that environment **passes validation**.
+
+### glas-harness — agent harness
+
+**glas-harness owns:** Agent sessions, tools, memory, channels, sandbox policy
+for agent tool execution.
+
+**wise-validator does not** run agent gateways. An agent may **trigger** a
+validation run (e.g. after a coding session); wise-validator executes the
+validation workflow as `atm` automation.
+
+### snuggle-inventor — code generation
+
+**snuggle-inventor owns:** Code generation, tech specs, PR output, human review.
+
+**wise-validator does not** judge generated code quality beyond configured test
+and health contracts. snuggle-inventor (or CI) may invoke wise-validator after
+generation; wise-validator runs the repo's declared tests, not semantic code review.
+
+### Boundary diagram
+
+```
+        activity-core / CI / glas-harness (trigger)
+                          │
+                          ▼
+                   wise-validator
+                     (validate)
+                          │
+              request ────┼──── report
+                          ▼
+        sand-boxer              state-hub / CI
+      (establish env)              (results)
+```
+
+### Existing Custodian repos
+
+| Concern | Owner |
+|---------|--------|
+| Workstream, task, progress state | `state-hub` |
+| Cron and orchestration | `activity-core` |
+| SSH reverse tunnels | `ops-bridge` |
+| SSH certificate issuance | `ops-warden` |
+| Canon and agent instruction canon | `the-custodian` |
+| Capability federation hub | `reuse-surface` |
+| Agent runtime | **glas-harness** |
+| Production on Railiance01 | `railiance-apps` / domain repos |
+
+wise-validator **consumes** sand-boxer and emits to State Hub; it does not
+subsume those authorities.
+
+---
+
+## What it is
+
+wise-validator is a **meta-framework** with four pillars (preliminary):
+
+### 1. Unified validation API
+
+One surface for validation runs:
+
+```bash
+# Conceptual CLI (v0)
+validate run --repo /path/to/repo
+validate run --repo /path/to/repo --workstream-id
+validate health --probe-set --target
+validate report
+```
+
+**HTTP** (parallel to CLI): `POST /v1/validations`, `GET /v1/validations/{id}`.
+
+Consumers: `activity-core` instructions, CI hooks, glas-harness tool triggers,
+human operators (`adm`).
+
+### 2. Validation contract catalog
+
+Per-repo contract — successor to `e2e/e2e.yml`:
+
+| Field | Owner |
+|-------|--------|
+| `name`, `compose_file`, `test_command` | Repo declares; wise-validator interprets |
+| `health_checks[]` (name, url, timeout) | wise-validator polling semantics |
+| `timeout`, `cleanup`, `env` | wise-validator orchestration rules |
+| Host / SSH / compose project naming | **sand-boxer** via profile inputs |
+
+**Validation profiles** (wise-validator catalog, distinct from sand-boxer sandbox
+profiles):
+
+| Validation profile | sand-boxer profile | Use |
+|--------------------|-------------------|-----|
+| `validation.compose-e2e` | `profile.compose-e2e` | Cross-repo stack e2e |
+| `validation.health-only` | `profile.health-probe` (future) | Liveness without full e2e |
+| `validation.smoke` | `profile.agent-dev` (future) | Post-deploy smoke |
+
+### 3. Extension platform
+
+Extensions delegate validation mechanics:
+
+| Extension class | Examples |
+|-----------------|----------|
+| **Health probe** | HTTP GET, TCP connect, exec-in-sandbox curl |
+| **Test runner** | pytest, shell, `uv run`, custom command wrapper |
+| **Reporter** | State Hub progress, GitHub check, JSON artifact |
+| **Contract parser** | `e2e.yml` v1, future schema versions |
+
+Extensions run **inside or against** a sand-boxer-established environment; they
+do not call `docker compose` on hosts directly (except via sandbox reachability).
+
+### 4. Result model and observability
+
+Structured run results (successor to `RunResult`):
+
+- `run_id`, `repo`, `sandbox_id`, `passed`, `exit_code`, `duration_s`
+- Captured stdout/stderr (bounded)
+- Per health check outcomes
+- `event_type: validation_result` (or migrated `e2e_result`) to State Hub
+- Actor attribution: `atm` for automations, `adm` for manual, `agt` when
+  agent-triggered
+
+---
+
+## What it is not
+
+| Concern | Owner |
+|---------|--------|
+| Sandbox provision, compose up/down on host | **sand-boxer** |
+| Agent gateway, tools, memory | **glas-harness** |
+| Code generation, tech specs | **snuggle-inventor** |
+| When validation is scheduled | `activity-core` |
+| Task/workstream state | `state-hub` (wise-validator emits events only) |
+| Tunnels | `ops-bridge` |
+| Certs | `ops-warden` |
+
+Embedding `sandbox.provision()` in wise-validator recreates the monolith
+sand-boxer is splitting apart.
+
+---
+
+## Lineage
+
+wise-validator replaces the **validation half** of `the-custodian/e2e-framework/`:
+
+| Module today | Future owner |
+|--------------|--------------|
+| `schema.py` — `e2e.yml` parse/validate | **wise-validator** |
+| `runner.py` — health-wait, test_command, pass/fail | **wise-validator** (against sand-boxer env) |
+| `reporter.py` — State Hub progress | **wise-validator** |
+| `sandbox.py` — SSH provision/teardown | **sand-boxer** `ext.compose-ssh` |
+| `cli.py` — monolithic entry | Split: `sandbox` CLI (sand-boxer), `validate` CLI (wise-validator) |
+| `make e2e REPO=…` | Shim → wise-validator (+ sand-boxer); deprecate direct framework |
+
+Reference runbook: `the-custodian/e2e-framework/RUNBOOK.md`
+sand-boxer research: `sand-boxer/research/` (sandbox patterns only)
+
+### Agent-first use case
+
+README positions wise-validator as **agent-first** validation: agents and
+automations need fast, declarative proof that a change works in a real stack —
+not laptop-only pytest.
+That does not make wise-validator an agent harness; glas-harness may
+trigger runs, wise-validator executes them deterministically as `atm`.
+
+---
+
+## Intended users
+
+- **Deterministic automations (`atm`)** — activity-core, CI, scheduled health jobs
+- **Human operators (`adm`)** — manual e2e runs, probe debugging
+- **LLM agents (`agt`)** — trigger validation via glas-harness; wise-validator
+  runs the contract, not open-ended agent reasoning
+- **Domain repos** — declare `e2e/e2e.yml`; do not fork validation infrastructure
+- **Extension authors** — probe, runner, reporter plugins
+
+---
+
+## Design principles
+
+- **Validation meta-framework, not monolith** — one API; extensions for probes and reporters
+- **sand-boxer for environments** — never embed provisioners or host SSH lifecycle
+- **Contract in repo, orchestration in platform** — `e2e/e2e.yml` stays opt-in per repo
+- **Health before tests** — explicit polling; fail fast with actionable errors
+- **Cleanup is policy** — honor `cleanup: always | on_success | never`; default teardown via sand-boxer
+- **Observable results** — every run emits structured pass/fail to State Hub when reachable
+- **Agent-first, automation-grade** — deterministic, idempotent, no LLM in the validation path
+- **Registry-first reuse** — register validation capabilities in `registry/`
+- **Backward compatible migration** — `e2e.yml` v1 compatible with CUST-WP-0028 convention
+
+---
+
+## sand-boxer consumption contract (preliminary)
+
+A standard `validation.compose-e2e` run:
+
+1. Load and validate `e2e/e2e.yml` from repo root
+2. Call sand-boxer `create` with:
+   - `profile: profile.compose-e2e`
+   - `inputs: { repo_ref, compose_file, env }`
+   - `consumer: { harness: wise-validator, actor: atm, run_id }`
+3. Wait for sand-boxer `ready`
+4. Ensure compose stack is up **inside sandbox** (coordination with sand-boxer
+   extension contract — may be extension-owned step or validator-owned via
+   reachability; document in integration spec)
+5. Poll `health_checks` until pass or timeout
+6. Run `test_command`; capture output and exit code
+7. Apply `cleanup` policy → sand-boxer `destroy` when required
+8. Emit `validation_result` to State Hub (optional `workstream_id`)
+
+Open questions (for first workplan):
+
+- Who runs `docker compose up` — sand-boxer extension at provision, or
+  wise-validator as first orchestration step inside ready sandbox?
+- Reporter event type: keep `e2e_result` or migrate to `validation_result`?
+
+Track in `docs/integrations/sand-boxer.md` (wise-validator or sand-boxer repo).
+
+---
+
+## Near-term outcomes (preliminary)
+
+1. **This charter** — `INTENT.md` aligned with sand-boxer sibling boundaries
+2. **Extract `schema.py`** — `e2e.yml` v1 as canonical contract
+3. **sand-boxer integration** — consume `profile.compose-e2e` when sand-boxer
+   SAND-WP-0002 delivers API v0
+4. **`validate run` CLI** — health-wait + test + report without embedded provision
+5. **Reporter** — State Hub progress (port `reporter.py` behavior)
+6. **Registry entry** — e.g. `capability.validation.compose-e2e`
+7. **`the-custodian` shim** — `make e2e` delegates to wise-validator + sand-boxer
+8. **Runbook** — operator docs successor to e2e-framework RUNBOOK
+
+Planned sand-boxer follow-on: **SAND-WP-0003** (wise-validator extraction).
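
Reviewer sketch (not part of the committed file): the standard `validation.compose-e2e` run from the sand-boxer consumption contract could be orchestrated roughly as below. This is a minimal illustration under stated assumptions, not the planned implementation. The `sandboxes` client and its `create` / `wait_ready` / `destroy` methods, the `probe`, `run_tests`, and `emit` callables, and the use of `exit_code = None` when tests never start are all hypothetical names and choices; only the field names, the cleanup values, and the step order come from the charter. Step 4 (compose up) is omitted because the charter leaves its owner open.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HealthCheck:
    name: str
    url: str
    timeout: float  # seconds this check may take to turn healthy


@dataclass
class Contract:
    """Subset of the e2e.yml fields listed in the contract table."""
    name: str
    compose_file: str
    test_command: str
    health_checks: list = field(default_factory=list)
    env: dict = field(default_factory=dict)
    cleanup: str = "always"  # always | on_success | never


@dataclass
class RunResult:
    run_id: str
    repo: str
    sandbox_id: str
    passed: bool
    exit_code: Optional[int]  # None when tests never started (assumption)
    duration_s: float
    health: dict


def should_destroy(cleanup: str, passed: bool) -> bool:
    """Map the cleanup policy and the run outcome to a destroy decision."""
    if cleanup not in ("always", "on_success", "never"):
        raise ValueError(f"unknown cleanup policy: {cleanup!r}")
    return cleanup == "always" or (cleanup == "on_success" and passed)


def wait_healthy(check, probe, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll probe(check) until it returns True or check.timeout elapses."""
    deadline = clock() + check.timeout
    while not probe(check):
        if clock() >= deadline:
            return False
        sleep(interval)
    return True


def run_validation(repo, run_id, contract, sandboxes, probe, run_tests, emit,
                   sleep=time.sleep, clock=time.monotonic):
    """Steps 2-8 of the contract; `sandboxes` is a hypothetical sand-boxer client."""
    started = clock()
    sandbox_id = sandboxes.create(  # step 2
        profile="profile.compose-e2e",
        inputs={"repo_ref": repo, "compose_file": contract.compose_file,
                "env": contract.env},
        consumer={"harness": "wise-validator", "actor": "atm", "run_id": run_id},
    )
    passed, exit_code, health = False, None, {}
    try:
        sandboxes.wait_ready(sandbox_id)  # step 3 (step 4 is an open question)
        for check in contract.health_checks:  # step 5, stop at the first failure
            health[check.name] = wait_healthy(check, probe, sleep=sleep, clock=clock)
            if not health[check.name]:
                break
        if all(health.values()):  # health before tests
            exit_code = run_tests(sandbox_id, contract.test_command)  # step 6
            passed = exit_code == 0
    finally:
        if should_destroy(contract.cleanup, passed):  # step 7
            sandboxes.destroy(sandbox_id)
    result = RunResult(run_id, repo, sandbox_id, passed, exit_code,
                       clock() - started, health)
    emit(result)  # step 8
    return result
```

Passing `probe`, `run_tests`, and `emit` as callables mirrors the health probe, test runner, and reporter extension classes, and keeps provisioning behind the sand-boxer client. If an exception escapes the `try` block, the sketch still applies the cleanup policy but emits no result; whether a failed run should also report is a decision the charter does not make.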
+
+---
+
+## Maturity target
+
+A mature wise-validator is Coulomb's **default proof layer**:
+
+- Any repo with `e2e/` can run cross-host validation without `the-custodian` checkout
+- activity-core fires validation after deployments or agent work
+- CI and glas-harness share the same validation API and result schema
+- Health probes reuse the same extension model as full e2e runs
+- sand-boxer teardown is reliable because wise-validator always releases environments
+
+sand-boxer establishes the box. glas-harness runs the agent. snuggle-inventor
+writes the code. **wise-validator proves it works.**
\ No newline at end of file