wise-validator/INTENT.md
tegwick 9be1c3028d Clarify INTENT.md: ecosystem validation and one-way sand-boxer dependency
Reframe wise-validator as cross-repo use-case stabilization, document
one-way consumption of sand-boxer, and align maturity target with dormant-path
rot detection across the Coulomb ecosystem.
2026-06-23 21:23:39 +02:00

| domain | repo | updated |
| --- | --- | --- |
| infotech | wise-validator | 2026-06-23 |

INTENT

wise-validator is the Coulomb meta-framework for validation — cross-repo, use-case-driven end-to-end proof that declared behaviors still work — while consuming sand-boxer for isolated execution environments. sand-boxer is self-sustained and does not depend on wise-validator; the dependency runs one way only.


Why it exists

Custodian repos need a consistent way to prove that real stacks work: compose services up, health endpoints ready, tests executed, results reported — without every domain reinventing remote SSH scripts, port polling, and State Hub events.

Today that story is split awkwardly in the-custodian/e2e-framework/: provision and teardown live beside test orchestration and reporting in one package. sand-boxer is extracting the establishment half. wise-validator owns the validation half.

The industry validates agent and service work through overlapping but incompatible paths: repo-local pytest, CI compose jobs, Blitzy-style environment validation, hosted sandbox test runners, and ad hoc health probes. Coulomb needs one validation vocabulary that any repo, agent harness, or automation can invoke — without owning sandboxes, agent gateways, or code generation.

wise-validator exists to be that layer: unified validation orchestration with extensions for probe types, test runners, and reporters — environments requested from sand-boxer, not provisioned here.

sand-boxer establishes the box. wise-validator proves what runs inside it.

Ecosystem use-case stabilization

Coulomb spans many repos and use cases, not all of which are exercised continuously. Without periodic proof, integrations degrade silently: APIs drift, compose stacks break, cross-repo assumptions fail, and nobody notices until a dormant path is needed again.

wise-validator exists so the ecosystem as a whole can keep use-case definitions honest:

  • Use-case contracts per repo (e.g. e2e/e2e.yml or successor definitions) declare what “still works” means
  • Cross-repo runs exercise real stacks on isolated hosts via sand-boxer — not laptop-only pytest in isolation
  • Scheduled or on-demand validation (activity-core, CI, operators) catches regression before production or agent work depends on a stale use case
  • Structured results (pass/fail, health outcomes, duration) feed State Hub and automation so degradation is visible, not anecdotal

This is infrastructure for Coulomb-wide confidence, not a feature sand-boxer needs to function. sand-boxer provisions venues; wise-validator audits whether declared use cases still hold across that venue catalog.


The governing principle

wise-validator is the validation orchestration service — contracts, health semantics, test execution, pass/fail interpretation, and result emission. Nothing more.

It answers:

  1. What is being validated? Repo e2e/e2e.yml (or successor contract), health probe sets, validation profile selection.
  2. When is the environment ready? Health polling rules, timeouts, partial failure semantics.
  3. How are tests run? Test command dispatch, output capture, exit code interpretation.
  4. Did it pass? Aggregate pass/fail, duration, error surfaces.
  5. Who needs to know? State Hub progress, CI artifacts, activity-core hooks.
  6. Where does it run? By requesting sand-boxer — not by provisioning hosts.

It must not become the sandbox provisioner, the agent harness, the code generator, the scheduler, work-state authority, tunnel/CA owner, or production deployer on Railiance01.


Coulomb sibling boundaries

sand-boxer: sandbox establishment (upstream; does not depend on wise-validator)

sand-boxer owns: Profiles, extensions, provision/teardown, placement, lifecycle registration, host telemetry. It is self-sustained — CLI, canary self-deploy, and lifecycle events work without wise-validator.

wise-validator owns: Requesting profile.compose-e2e (or successors), running validation inside the returned environment, releasing the sandbox when the validation workflow completes (via sand-boxer destroy).

Dependency direction: wise-validator → sand-boxer only. sand-boxer never calls, waits for, or requires wise-validator.

wise-validator                         sand-boxer (independent service)
──────────────                         ──────────────────────────────
resolve use-case contract            → create / destroy (optional client)
health-wait + test_command in env    ← sandbox_id + reachability
POST validation result to State Hub     (sand-boxer emits lifecycle only)

sand-boxer proves an environment exists and is reachable (ready). wise-validator proves declared use cases still pass inside it.

glas-harness — agent harness

glas-harness owns: Agent sessions, tools, memory, channels, sandbox policy for agent tool execution.

wise-validator does not run agent gateways. An agent may trigger a validation run (e.g. after a coding session); wise-validator executes the validation workflow as atm automation.

snuggle-inventor — code generation

snuggle-inventor owns: Code generation, tech specs, PR output, human review.

wise-validator does not judge generated code quality beyond configured test and health contracts. snuggle-inventor (or CI) may invoke wise-validator after generation; wise-validator runs the repo's declared tests, not semantic code review.

Boundary diagram

  activity-core / CI / glas-harness (trigger)
                    │
                    ▼
             wise-validator
             (validate)
                    │
        request ────┼──── report
                    ▼
              sand-boxer          state-hub / CI
           (establish env)        (results)

Existing Custodian repos

| Concern | Owner |
| --- | --- |
| Workstream, task, progress state | state-hub |
| Cron and orchestration | activity-core |
| SSH reverse tunnels | ops-bridge |
| SSH certificate issuance | ops-warden |
| Canon and agent instruction canon | the-custodian |
| Capability federation hub | reuse-surface |
| Agent runtime | glas-harness |
| Production on Railiance01 | railiance-apps / domain repos |

wise-validator consumes sand-boxer and emits to State Hub; it does not subsume those authorities.


What it is

wise-validator is a meta-framework with four pillars (preliminary):

1. Unified validation API

One surface for validation runs:

# Conceptual CLI (v0)
validate run --repo /path/to/repo
validate run --repo /path/to/repo --workstream-id <uuid>
validate health --probe-set <id> --target <url>
validate report <run-id>

HTTP (parallel to CLI): POST /v1/validations, GET /v1/validations/{id}.

Consumers: activity-core instructions, CI hooks, glas-harness tool triggers, human operators (adm).
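
As an illustration only, the conceptual v0 commands above could be mirrored with a small argparse layout. The subcommands and flags come from the listing; the help strings, the `build_parser` name, and the choice of argparse are assumptions, not the project's actual CLI implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the conceptual v0 commands (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="validate")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run a repo's validation contract")
    run.add_argument("--repo", required=True)
    run.add_argument("--workstream-id", default=None)  # optional, as in the listing

    health = sub.add_parser("health", help="poll a probe set against a target")
    health.add_argument("--probe-set", required=True)
    health.add_argument("--target", required=True)

    report = sub.add_parser("report", help="show the result of a past run")
    report.add_argument("run_id")
    return parser


args = build_parser().parse_args(["run", "--repo", "/path/to/repo"])
print(args.command, args.repo, args.workstream_id)  # run /path/to/repo None
```

Keeping each verb as its own subparser would let the HTTP surface map one-to-one onto the same argument names.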

2. Validation contract catalog

Per-repo contract — successor to e2e/e2e.yml:

| Field | Owner |
| --- | --- |
| name, compose_file, test_command | Repo declares; wise-validator interprets |
| health_checks[] (name, url, timeout) | wise-validator polling semantics |
| timeout, cleanup, env | wise-validator orchestration rules |
| Host / SSH / compose project naming | sand-boxer via profile inputs |
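
A minimal sketch of how the repo-declared fields might be modeled and validated once the YAML has been loaded into a mapping. The field names follow the table; which fields are required, the default values, the seconds unit, and the `parse_contract` helper are assumptions for illustration, not the existing schema.py behavior.

```python
from dataclasses import dataclass, field

CLEANUP_POLICIES = ("always", "on_success", "never")
REQUIRED_FIELDS = ("name", "compose_file", "test_command")  # assumed required


@dataclass(frozen=True)
class HealthCheck:
    name: str
    url: str
    timeout: float  # assumed to be seconds


@dataclass(frozen=True)
class Contract:
    name: str
    compose_file: str
    test_command: str
    health_checks: tuple = ()
    timeout: float = 600.0   # assumed default
    cleanup: str = "always"  # assumed default
    env: dict = field(default_factory=dict)


def parse_contract(raw: dict) -> Contract:
    """Validate an already-loaded contract mapping and build a Contract."""
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    cleanup = raw.get("cleanup", "always")
    if cleanup not in CLEANUP_POLICIES:
        raise ValueError(f"unknown cleanup policy: {cleanup!r}")
    checks = tuple(HealthCheck(**item) for item in raw.get("health_checks", []))
    return Contract(
        name=raw["name"],
        compose_file=raw["compose_file"],
        test_command=raw["test_command"],
        health_checks=checks,
        timeout=float(raw.get("timeout", 600.0)),
        cleanup=cleanup,
        env=dict(raw.get("env", {})),
    )


contract = parse_contract({
    "name": "example-stack",
    "compose_file": "docker-compose.yml",
    "test_command": "pytest -q",
    "health_checks": [{"name": "api", "url": "http://localhost:8000/health", "timeout": 30}],
    "cleanup": "on_success",
})
print(contract.health_checks[0].name, contract.cleanup)  # api on_success
```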

Validation profiles (wise-validator catalog, distinct from sand-boxer sandbox profiles):

| Validation profile | sand-boxer profile | Use |
| --- | --- | --- |
| validation.compose-e2e | profile.compose-e2e | Cross-repo stack e2e |
| validation.health-only | profile.health-probe (future) | Liveness without full e2e |
| validation.smoke | profile.agent-dev (future) | Post-deploy smoke |
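
The mapping between the two catalogs can be pictured as a simple lookup. This is a sketch: the profile identifiers are taken from the table, while `ProfileLink`, `sandbox_profile_for`, and the decision to reject profiles marked "(future)" are hypothetical.

```python
from typing import NamedTuple


class ProfileLink(NamedTuple):
    sandbox_profile: str
    future: bool  # True where the table marks the sand-boxer profile "(future)"


VALIDATION_PROFILES = {
    "validation.compose-e2e": ProfileLink("profile.compose-e2e", future=False),
    "validation.health-only": ProfileLink("profile.health-probe", future=True),
    "validation.smoke": ProfileLink("profile.agent-dev", future=True),
}


def sandbox_profile_for(validation_profile: str) -> str:
    """Resolve the sand-boxer profile to request for a validation profile."""
    link = VALIDATION_PROFILES.get(validation_profile)
    if link is None:
        raise KeyError(f"unknown validation profile: {validation_profile}")
    if link.future:
        raise NotImplementedError(f"{link.sandbox_profile} is not available yet")
    return link.sandbox_profile


print(sandbox_profile_for("validation.compose-e2e"))  # profile.compose-e2e
```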

3. Extension platform

Extensions delegate validation mechanics:

| Extension class | Examples |
| --- | --- |
| Health probe | HTTP GET, TCP connect, exec-in-sandbox curl |
| Test runner | pytest, shell, uv run, custom command wrapper |
| Reporter | State Hub progress, GitHub check, JSON artifact |
| Contract parser | e2e.yml v1, future schema versions |

Extensions run inside or against a sand-boxer-established environment; they do not call docker compose on hosts directly (except via sandbox reachability).
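
One way to picture the extension classes is as small structural interfaces plus a registry. The sketch below assumes Python `Protocol` types and shows a TCP connect probe as the single concrete extension; every class, method, and registry name here is hypothetical, since the document does not define the extension API.

```python
import socket
from typing import Protocol


class HealthProbe(Protocol):
    def check(self, target: str, timeout: float) -> bool: ...


class CommandRunner(Protocol):
    def run(self, command: str) -> int: ...


class Reporter(Protocol):
    def emit(self, result: dict) -> None: ...


class TcpConnectProbe:
    """Health probe that passes when a TCP connection to host:port succeeds."""

    def check(self, target: str, timeout: float) -> bool:
        host, _, port = target.rpartition(":")
        try:
            with socket.create_connection((host, int(port)), timeout=timeout):
                return True
        except OSError:
            # Refused, unreachable, or timed out: the target is not ready.
            return False


PROBES: dict[str, HealthProbe] = {}


def register_probe(kind: str, probe: HealthProbe) -> None:
    """Add a probe under a unique kind name; duplicates are rejected."""
    if kind in PROBES:
        raise ValueError(f"probe kind already registered: {kind}")
    PROBES[kind] = probe


register_probe("tcp", TcpConnectProbe())
```

Structural typing keeps extension authors free of a shared base class: any object with a matching `check` method can be registered.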

4. Result model and observability

Structured run results (successor to RunResult):

  • run_id, repo, sandbox_id, passed, exit_code, duration_s
  • Captured stdout/stderr (bounded)
  • Per health check outcomes
  • event_type: validation_result (or migrated e2e_result) to State Hub
  • Actor attribution: atm for automations, adm for manual, agt when agent-triggered
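
The result fields listed above could be carried in a single record that serializes to a State Hub event. This is a sketch under assumptions: the 4000-character bound, keeping the tail of the output, the `health` mapping shape, and the `to_event` method are illustrative choices, and the event type uses `validation_result` even though the final name is still an open question.

```python
from dataclasses import asdict, dataclass, field

MAX_OUTPUT_CHARS = 4000  # assumed bound; the document only says "bounded"


def bound(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Keep at most `limit` characters, preferring the end of the output."""
    return text if len(text) <= limit else text[-limit:]


@dataclass
class ValidationResult:
    run_id: str
    repo: str
    sandbox_id: str
    passed: bool
    exit_code: int
    duration_s: float
    stdout: str = ""
    stderr: str = ""
    health: dict = field(default_factory=dict)  # check name -> passed (assumed shape)
    actor: str = "atm"  # atm, adm, or agt

    def to_event(self) -> dict:
        """Serialize to an event payload with bounded output fields."""
        payload = asdict(self)
        payload["stdout"] = bound(self.stdout)
        payload["stderr"] = bound(self.stderr)
        return {"event_type": "validation_result", **payload}


result = ValidationResult(
    run_id="run-1", repo="example-repo", sandbox_id="sbx-1",
    passed=True, exit_code=0, duration_s=12.5, health={"api": True},
)
print(result.to_event()["event_type"])  # validation_result
```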

What it is not

| Concern | Owner |
| --- | --- |
| Sandbox provision, compose up/down on host | sand-boxer |
| Agent gateway, tools, memory | glas-harness |
| Code generation, tech specs | snuggle-inventor |
| When validation is scheduled | activity-core |
| Task/workstream state | state-hub (wise-validator emits events only) |
| Tunnels | ops-bridge |
| Certs | ops-warden |

Embedding sandbox.provision() in wise-validator recreates the monolith sand-boxer is splitting apart. Likewise, sand-boxer must not embed validation logic to “complete” e2e — that would couple establishment to a sibling that should remain optional.


Lineage

wise-validator replaces the validation half of the-custodian/e2e-framework/:

| Module today | Future owner |
| --- | --- |
| schema.py: e2e.yml parse/validate | wise-validator |
| runner.py: health-wait, test_command, pass/fail | wise-validator (against sand-boxer env) |
| reporter.py: State Hub progress | wise-validator |
| sandbox.py: SSH provision/teardown | sand-boxer ext.compose-ssh |
| cli.py: monolithic entry | Split: sandbox CLI (sand-boxer), validate CLI (wise-validator) |
| make e2e REPO=… | Shim → wise-validator (+ sand-boxer); deprecate direct framework |

Reference runbook: the-custodian/e2e-framework/RUNBOOK.md
sand-boxer research: sand-boxer/research/ (sandbox patterns only)

Agent-first use case

The README positions wise-validator as agent-first validation: agents and automations need fast, declarative proof that a change works in a real stack, not laptop-only pytest. That does not make wise-validator an agent harness; glas-harness may trigger runs, but wise-validator executes them deterministically as atm.


Intended users

  • Deterministic automations (atm) — activity-core, CI, scheduled health jobs
  • Human operators (adm) — manual e2e runs, probe debugging
  • LLM agents (agt) — trigger validation via glas-harness; wise-validator runs the contract, not open-ended agent reasoning
  • Domain repos — declare e2e/e2e.yml; do not fork validation infrastructure
  • Extension authors — probe, runner, reporter plugins

Design principles

  • Validation meta-framework, not monolith — one API; extensions for probes and reporters
  • sand-boxer for environments — never embed provisioners or host SSH lifecycle; sand-boxer remains self-sustained without this repo
  • Use-case contracts, ecosystem scope — validation targets declared cross-repo behaviors, not ad-hoc per-session agent checks
  • Detect dormant-path rot — runs matter even when a use case is not in daily use
  • Contract in repo, orchestration in platform: e2e/e2e.yml stays opt-in per repo
  • Health before tests — explicit polling; fail fast with actionable errors
  • Cleanup is policy — honor cleanup: always | on_success | never; default teardown via sand-boxer
  • Observable results — every run emits structured pass/fail to State Hub when reachable
  • Agent-first, automation-grade — deterministic, idempotent, no LLM in the validation path
  • Registry-first reuse — register validation capabilities in registry/
  • Backward compatible migration: e2e.yml v1 compatible with CUST-WP-0028 convention
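
The "health before tests" principle can be sketched as a polling loop with a deadline that names the checks still failing when it gives up. This is illustrative only: it assumes one overall timeout rather than per-check timeouts, stops re-polling a check once it has passed, and uses hypothetical names (`wait_healthy`, `HealthTimeout`).

```python
import time
from typing import Callable, Mapping


class HealthTimeout(RuntimeError):
    """Raised when some health checks never pass before the deadline."""


def wait_healthy(
    checks: Mapping[str, Callable[[], bool]],
    timeout: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll every check until all pass, or raise naming the ones that did not."""
    deadline = clock() + timeout
    pending = set(checks)
    while True:
        # Re-run only the checks that have not passed yet.
        pending = {name for name in pending if not checks[name]()}
        if not pending:
            return
        if clock() >= deadline:
            names = ", ".join(sorted(pending))
            raise HealthTimeout(f"health checks still failing after {timeout}s: {names}")
        sleep(interval)


attempts = {"api": 0}


def api_ready() -> bool:
    attempts["api"] += 1
    return attempts["api"] >= 3  # becomes healthy on the third poll


wait_healthy({"api": api_ready}, timeout=5, interval=0, sleep=lambda seconds: None)
print(attempts["api"])  # 3
```

Injecting `clock` and `sleep` keeps the loop deterministic under test, in line with the automation-grade principle above.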

sand-boxer consumption contract (preliminary)

A standard validation.compose-e2e run:

  1. Load and validate e2e/e2e.yml from repo root
  2. Call sand-boxer create with:
    • profile: profile.compose-e2e
    • inputs: { repo_ref, compose_file, env }
    • consumer: { harness: wise-validator, actor: atm, run_id }
  3. Wait for sand-boxer ready
  4. Ensure the compose stack is up inside the sandbox. Ownership of this step is still open: it may be an extension-owned step under the sand-boxer extension contract, or a validator-owned step via reachability. Document the decision in the integration spec.
  5. Poll health_checks until pass or timeout
  6. Run test_command; capture output and exit code
  7. Apply cleanup policy → sand-boxer destroy when required
  8. Emit validation_result to State Hub (optional workstream_id)
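
The steps above can be sketched as one orchestration function with the sand-boxer client, health wait, test runner, and reporter injected. Everything here is a hypothetical illustration: `FakeSandBoxer` stands in for a client API that this document does not specify, the method names `create`, `wait_ready`, and `destroy` are assumptions, the contract is passed as an already-loaded mapping, and step 4 is omitted because its owner is still an open question.

```python
import time
import uuid


class FakeSandBoxer:
    """In-memory stand-in; the real sand-boxer client API is not specified here."""

    def __init__(self):
        self.destroyed = []

    def create(self, profile, inputs, consumer):
        return f"sbx-{consumer['run_id']}"

    def wait_ready(self, sandbox_id):
        return None

    def destroy(self, sandbox_id):
        self.destroyed.append(sandbox_id)


def should_destroy(cleanup: str, passed: bool) -> bool:
    """Map cleanup: always | on_success | never to a teardown decision."""
    return {"always": True, "on_success": passed, "never": False}[cleanup]


def run_validation(contract, repo_ref, sandboxer, wait_healthy, run_tests, emit):
    run_id = uuid.uuid4().hex
    started = time.monotonic()
    sandbox_id = sandboxer.create(  # step 2
        profile="profile.compose-e2e",
        inputs={"repo_ref": repo_ref,
                "compose_file": contract["compose_file"],
                "env": contract.get("env", {})},
        consumer={"harness": "wise-validator", "actor": "atm", "run_id": run_id},
    )
    passed, exit_code, error = False, None, None
    try:
        sandboxer.wait_ready(sandbox_id)                 # step 3
        wait_healthy(contract.get("health_checks", []))  # step 5
        exit_code = run_tests(contract["test_command"])  # step 6
        passed = exit_code == 0
    except Exception as exc:  # record the failure so the run is still reported
        error = str(exc)
    finally:
        if should_destroy(contract.get("cleanup", "always"), passed):  # step 7
            sandboxer.destroy(sandbox_id)
    result = {"event_type": "validation_result", "run_id": run_id,
              "sandbox_id": sandbox_id, "passed": passed, "exit_code": exit_code,
              "error": error, "duration_s": time.monotonic() - started}
    emit(result)  # step 8
    return result


box = FakeSandBoxer()
events = []
outcome = run_validation(
    {"compose_file": "docker-compose.yml", "test_command": "pytest -q", "cleanup": "on_success"},
    repo_ref="main", sandboxer=box,
    wait_healthy=lambda checks: None, run_tests=lambda command: 0, emit=events.append,
)
print(outcome["passed"], len(box.destroyed))  # True 1
```

Because provisioning stays behind the injected client, the validator never touches host SSH or compose lifecycle itself, which matches the one-way dependency described earlier.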

Open questions (for first workplan):

  • Who runs docker compose up — sand-boxer extension at provision, or wise-validator as first orchestration step inside ready sandbox?
  • Reporter event type: keep e2e_result or migrate to validation_result?

Track in docs/integrations/sand-boxer.md (wise-validator or sand-boxer repo).


Near-term outcomes (preliminary)

  1. This charter: INTENT.md aligned with sand-boxer sibling boundaries
  2. Extract schema.py: e2e.yml v1 as canonical contract
  3. sand-boxer integration — consume profile.compose-e2e when sand-boxer SAND-WP-0002 delivers API v0
  4. validate run CLI — health-wait + test + report without embedded provision
  5. Reporter — State Hub progress (port reporter.py behavior)
  6. Registry entry — e.g. capability.validation.compose-e2e
  7. the-custodian shim: make e2e delegates to wise-validator + sand-boxer
  8. Runbook — operator docs successor to e2e-framework RUNBOOK

Implementation is tracked in SAND-WP-0003 (wise-validator extraction; the workplan in sand-boxer coordinates migration from e2e-framework). sand-boxer SAND-WP-0002 and SAND-WP-0008 are complete independently of that work.


Maturity target

A mature wise-validator is Coulomb's default proof layer for declared use cases:

  • Any repo with a validation contract (e2e/ or successor) can run cross-host proof without the-custodian checkout
  • activity-core schedules validation so infrequently used use cases do not rot undetected
  • CI and glas-harness share the same validation API and result schema
  • Health probes reuse the same extension model as full e2e runs
  • Environments are released via sand-boxer destroy when validation completes

sand-boxer establishes the box — on its own, without wise-validator. glas-harness runs the agent. snuggle-inventor writes the code. wise-validator proves declared use cases still work across the ecosystem.