
tegwick 9be1c3028d Clarify INTENT.md: ecosystem validation and one-way sand-boxer dependency

Reframe wise-validator as cross-repo use-case stabilization, document
one-way consumption of sand-boxer, and align maturity target with dormant-path
rot detection across the Coulomb ecosystem.

2026-06-23 21:23:39 +02:00

15 KiB



domain	repo	updated
infotech	wise-validator	2026-06-23

INTENT

wise-validator is the Coulomb meta-framework for validation — cross-repo, use-case-driven end-to-end proof that declared behaviors still work — while consuming sand-boxer for isolated execution environments. sand-boxer is self-sustained and does not depend on wise-validator; the dependency runs one way only.

Why it exists

Custodian repos need a consistent way to prove that real stacks work: compose services up, health endpoints ready, tests executed, results reported — without every domain reinventing remote SSH scripts, port polling, and State Hub events.

Today that story is bundled awkwardly in the-custodian/e2e-framework/: provision and teardown live beside test orchestration and reporting in one package. sand-boxer is extracting the establishment half. wise-validator owns the validation half.

The industry validates agent and service work through overlapping but incompatible paths: repo-local pytest, CI compose jobs, Blitzy-style environment validation, hosted sandbox test runners, and ad hoc health probes. Coulomb needs one validation vocabulary that any repo, agent harness, or automation can invoke — without owning sandboxes, agent gateways, or code generation.

wise-validator exists to be that layer: unified validation orchestration with extensions for probe types, test runners, and reporters — environments requested from sand-boxer, not provisioned here.

sand-boxer establishes the box. wise-validator proves what runs inside it.

Ecosystem use-case stabilization

Coulomb spans many repos and use cases, not all of which are exercised continuously. Without periodic proof, integrations degrade silently: APIs drift, compose stacks break, cross-repo assumptions fail, and nobody notices until a dormant path is needed again.

wise-validator exists so the ecosystem as a whole can keep use-case definitions honest:

Use-case contracts per repo (e.g. e2e/e2e.yml or successor definitions) declare what “still works” means
Cross-repo runs exercise real stacks on isolated hosts via sand-boxer — not laptop-only pytest in isolation
Scheduled or on-demand validation (activity-core, CI, operators) catches regression before production or agent work depends on a stale use case
Structured results (pass/fail, health outcomes, duration) feed State Hub and automation so degradation is visible, not anecdotal

This is infrastructure for Coulomb-wide confidence, not a feature sand-boxer needs to function. sand-boxer provisions venues; wise-validator audits whether declared use cases still hold across that venue catalog.

The governing principle

wise-validator is the validation orchestration service — contracts, health semantics, test execution, pass/fail interpretation, and result emission. Nothing more.

It answers:

What is being validated? Repo e2e/e2e.yml (or successor contract), health probe sets, validation profile selection.
When is the environment ready? Health polling rules, timeouts, partial failure semantics.
How are tests run? Test command dispatch, output capture, exit code interpretation.
Did it pass? Aggregate pass/fail, duration, error surfaces.
Who needs to know? State Hub progress, CI artifacts, activity-core hooks.
Where does it run? By requesting sand-boxer — not by provisioning hosts.

It must not become the sandbox provisioner, the agent harness, the code generator, the scheduler, work-state authority, tunnel/CA owner, or production deployer on Railiance01.
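
The readiness question above comes down to polling with a deadline. A minimal sketch, assuming a probe is any zero-argument callable that returns True when healthy; `wait_until_healthy`, its parameters, and the choice to treat a raising probe as "not ready yet" are illustrative and not the wise-validator API:

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or `timeout_s` has elapsed."""
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            # Assumption: a probe that raises counts as "not ready yet".
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `clock` and `sleep` keeps the polling rule testable without real waiting. Partial-failure semantics across several probes would sit one level above this function.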

Coulomb sibling boundaries

sand-boxer — sandbox establishment (upstream; sand-boxer does not depend on wise-validator)

sand-boxer owns: Profiles, extensions, provision/teardown, placement, lifecycle registration, host telemetry. It is self-sustained — CLI, canary self-deploy, and lifecycle events work without wise-validator.

wise-validator owns: Requesting profile.compose-e2e (or successors), running validation inside the returned environment, releasing the sandbox when the validation workflow completes (via sand-boxer destroy).

Dependency direction: wise-validator → sand-boxer only. sand-boxer never calls, waits for, or requires wise-validator.

wise-validator                         sand-boxer (independent service)
──────────────                         ──────────────────────────────
resolve use-case contract            → create / destroy (optional client)
health-wait + test_command in env    ← sandbox_id + reachability
POST validation result to State Hub     (sand-boxer emits lifecycle only)

sand-boxer proves an environment exists and is reachable (ready). wise-validator proves declared use cases still pass inside it.

glas-harness — agent harness

glas-harness owns: Agent sessions, tools, memory, channels, sandbox policy for agent tool execution.

wise-validator does not run agent gateways. An agent may trigger a validation run (e.g. after a coding session); wise-validator executes the validation workflow as atm automation.

snuggle-inventor — code generation

snuggle-inventor owns: Code generation, tech specs, PR output, human review.

wise-validator does not judge generated code quality beyond configured test and health contracts. snuggle-inventor (or CI) may invoke wise-validator after generation; wise-validator runs the repo's declared tests, not semantic code review.

Boundary diagram

  activity-core / CI / glas-harness (trigger)
                    │
                    ▼
             wise-validator
             (validate)
                    │
        request ────┼──── report
                    ▼
              sand-boxer          state-hub / CI
           (establish env)        (results)

Existing Custodian repos

Concern	Owner
Workstream, task, progress state	`state-hub`
Cron and orchestration	`activity-core`
SSH reverse tunnels	`ops-bridge`
SSH certificate issuance	`ops-warden`
Canon and agent instruction canon	`the-custodian`
Capability federation hub	`reuse-surface`
Agent runtime	glas-harness
Production on Railiance01	`railiance-apps` / domain repos

wise-validator consumes sand-boxer and emits to State Hub; it does not subsume those authorities.

What it is

wise-validator is a meta-framework with four pillars (preliminary):

1. Unified validation API

One surface for validation runs:

# Conceptual CLI (v0)
validate run --repo /path/to/repo
validate run --repo /path/to/repo --workstream-id <uuid>
validate health --probe-set <id> --target <url>
validate report <run-id>

HTTP (parallel to CLI): POST /v1/validations, GET /v1/validations/{id}.

Consumers: activity-core instructions, CI hooks, glas-harness tool triggers, human operators (adm).

2. Validation contract catalog

Per-repo contract — successor to e2e/e2e.yml:

Field	Owner
`name`, `compose_file`, `test_command`	Repo declares; wise-validator interprets
`health_checks[]` (name, url, timeout)	wise-validator polling semantics
`timeout`, `cleanup`, `env`	wise-validator orchestration rules
Host / SSH / compose project naming	sand-boxer via profile inputs
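
To make the field ownership above concrete, here is a hedged sketch of how a contract could be modeled once the YAML has been loaded into a plain dict. The field names come from the table; which fields are required, the default values, and the names `Contract`, `HealthCheck`, and `parse_contract` are assumptions for illustration, not the `schema.py` implementation:

```python
from dataclasses import dataclass, field
from typing import Any

# Policy values taken from the design principles; everything else is assumed.
CLEANUP_POLICIES = {"always", "on_success", "never"}


@dataclass(frozen=True)
class HealthCheck:
    name: str
    url: str
    timeout: int  # assumed to be seconds


@dataclass(frozen=True)
class Contract:
    name: str
    compose_file: str
    test_command: str
    health_checks: tuple[HealthCheck, ...] = ()
    timeout: int = 600       # assumed default
    cleanup: str = "always"  # assumed default
    env: dict[str, str] = field(default_factory=dict)


def parse_contract(raw: dict[str, Any]) -> Contract:
    """Validate an already-loaded contract mapping and build a Contract."""
    required = ("name", "compose_file", "test_command")
    missing = [key for key in required if key not in raw]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    cleanup = raw.get("cleanup", "always")
    if cleanup not in CLEANUP_POLICIES:
        raise ValueError(f"unknown cleanup policy: {cleanup!r}")
    checks = tuple(HealthCheck(**hc) for hc in raw.get("health_checks", []))
    return Contract(
        name=raw["name"],
        compose_file=raw["compose_file"],
        test_command=raw["test_command"],
        health_checks=checks,
        timeout=int(raw.get("timeout", 600)),
        cleanup=cleanup,
        env=dict(raw.get("env", {})),
    )
```

Host, SSH, and compose project naming are deliberately absent from the model, since the table assigns them to sand-boxer.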

Validation profiles (wise-validator catalog, distinct from sand-boxer sandbox profiles):

Validation profile	sand-boxer profile	Use
`validation.compose-e2e`	`profile.compose-e2e`	Cross-repo stack e2e
`validation.health-only`	`profile.health-probe` (future)	Liveness without full e2e
`validation.smoke`	`profile.agent-dev` (future)	Post-deploy smoke
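
The mapping above can be read as a small lookup from validation profile to sand-boxer profile. A sketch under stated assumptions: `ProfileBinding`, `VALIDATION_PROFILES`, and `resolve_sandbox_profile` are hypothetical names, and the `available` flag simply mirrors the "(future)" markers in the table:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileBinding:
    sandbox_profile: str
    available: bool  # False mirrors "(future)" in the catalog table


VALIDATION_PROFILES = {
    "validation.compose-e2e": ProfileBinding("profile.compose-e2e", True),
    "validation.health-only": ProfileBinding("profile.health-probe", False),
    "validation.smoke": ProfileBinding("profile.agent-dev", False),
}


def resolve_sandbox_profile(validation_profile: str) -> str:
    """Return the sand-boxer profile to request for a validation profile."""
    try:
        binding = VALIDATION_PROFILES[validation_profile]
    except KeyError:
        raise LookupError(f"unknown validation profile: {validation_profile!r}") from None
    if not binding.available:
        raise NotImplementedError(
            f"{validation_profile} needs {binding.sandbox_profile}, which is not available yet"
        )
    return binding.sandbox_profile
```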

3. Extension platform

Extensions delegate validation mechanics:

Extension class	Examples
Health probe	HTTP GET, TCP connect, exec-in-sandbox curl
Test runner	pytest, shell, `uv run`, custom command wrapper
Reporter	State Hub progress, GitHub check, JSON artifact
Contract parser	`e2e.yml` v1, future schema versions

Extensions run inside or against a sand-boxer-established environment; they do not call docker compose on hosts directly (except via sandbox reachability).
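
One way to picture the extension platform is a registry keyed by extension class and name. The sketch below is an assumption about shape only: `ExtensionRegistry`, the `"health_probe"` class label, and `tcp_probe` are hypothetical, and the probe uses the standard-library `socket.create_connection` as a stand-in for the "TCP connect" example in the table:

```python
import socket
from typing import Callable


class ExtensionRegistry:
    """Maps (extension class, name) to a callable implementation."""

    def __init__(self) -> None:
        self._by_class: dict[str, dict[str, Callable]] = {}

    def register(self, ext_class: str, name: str) -> Callable[[Callable], Callable]:
        def decorator(fn: Callable) -> Callable:
            bucket = self._by_class.setdefault(ext_class, {})
            if name in bucket:
                raise ValueError(f"{ext_class} extension {name!r} already registered")
            bucket[name] = fn
            return fn
        return decorator

    def get(self, ext_class: str, name: str) -> Callable:
        try:
            return self._by_class[ext_class][name]
        except KeyError:
            raise LookupError(f"no {ext_class} extension named {name!r}") from None


registry = ExtensionRegistry()


@registry.register("health_probe", "tcp")
def tcp_probe(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

A test runner or reporter would register under its own class label in the same way, which keeps the core orchestration unaware of individual mechanics.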

4. Result model and observability

Structured run results (successor to RunResult):

run_id, repo, sandbox_id, passed, exit_code, duration_s
Captured stdout/stderr (bounded)
Per-health-check outcomes
event_type: validation_result (or migrated e2e_result) to State Hub
Actor attribution: atm for automations, adm for manual, agt when agent-triggered
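
The fields listed above could be carried in a structure like the following. This is a sketch, not the successor to RunResult itself: the class names, the 4000-character bound, and the decision to keep the tail of the output are assumptions made for illustration:

```python
import json
from dataclasses import asdict, dataclass, field

MAX_OUTPUT_CHARS = 4000  # assumed bound; the charter only says "bounded"


def bound(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Keep the tail of the output, where test failures usually appear."""
    return text if len(text) <= limit else text[-limit:]


@dataclass
class HealthOutcome:
    name: str
    passed: bool


@dataclass
class ValidationResult:
    run_id: str
    repo: str
    sandbox_id: str
    passed: bool
    exit_code: int
    duration_s: float
    stdout: str = ""
    stderr: str = ""
    health: list[HealthOutcome] = field(default_factory=list)
    actor: str = "atm"  # atm | adm | agt, per the attribution rule above
    event_type: str = "validation_result"

    def to_event(self) -> dict:
        """Build a JSON-serializable payload with bounded output."""
        payload = asdict(self)
        payload["stdout"] = bound(self.stdout)
        payload["stderr"] = bound(self.stderr)
        return payload
```

Calling `json.dumps(result.to_event())` yields a body that a reporter extension could post or write as an artifact.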

What it is not

Concern	Owner
Sandbox provision, compose up/down on host	sand-boxer
Agent gateway, tools, memory	glas-harness
Code generation, tech specs	snuggle-inventor
When validation is scheduled	`activity-core`
Task/workstream state	`state-hub` (wise-validator emits events only)
Tunnels	`ops-bridge`
Certs	`ops-warden`

Embedding sandbox.provision() in wise-validator recreates the monolith sand-boxer is splitting apart. Likewise, sand-boxer must not embed validation logic to “complete” e2e — that would couple establishment to a sibling that should remain optional.

Lineage

wise-validator replaces the validation half of the-custodian/e2e-framework/:

Module today	Future owner
`schema.py` — `e2e.yml` parse/validate	wise-validator
`runner.py` — health-wait, test_command, pass/fail	wise-validator (against sand-boxer env)
`reporter.py` — State Hub progress	wise-validator
`sandbox.py` — SSH provision/teardown	sand-boxer `ext.compose-ssh`
`cli.py` — monolithic entry	Split: `sandbox` CLI (sand-boxer), `validate` CLI (wise-validator)
`make e2e REPO=…`	Shim → wise-validator (+ sand-boxer); deprecate direct framework
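
The runner responsibilities in this table (test_command dispatch, output capture, exit code interpretation) can be sketched with the standard library. This version runs the command locally for illustration; in the intended design the command would run against a sand-boxer environment. `run_test_command`, `CommandResult`, and the use of exit code 124 for timeouts (borrowed from the GNU `timeout` convention) are assumptions:

```python
import shlex
import subprocess
import time
from dataclasses import dataclass

TIMEOUT_EXIT_CODE = 124  # assumed convention, not taken from runner.py


@dataclass
class CommandResult:
    exit_code: int
    passed: bool
    stdout: str
    stderr: str
    duration_s: float


def run_test_command(command: str, timeout_s: float = 600.0) -> CommandResult:
    """Run a test command, capture its output, and treat exit code 0 as a pass."""
    started = time.monotonic()
    try:
        proc = subprocess.run(
            shlex.split(command), capture_output=True, text=True, timeout=timeout_s
        )
        exit_code, stdout, stderr = proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        # Output captured before the timeout is dropped in this sketch.
        exit_code, stdout, stderr = TIMEOUT_EXIT_CODE, "", f"timed out after {timeout_s}s"
    duration_s = time.monotonic() - started
    return CommandResult(exit_code, exit_code == 0, stdout, stderr, duration_s)
```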

Reference runbook: the-custodian/e2e-framework/RUNBOOK.md
sand-boxer research: sand-boxer/research/ (sandbox patterns only)

Agent-first use case

The README positions wise-validator as agent-first validation: agents and automations need fast, declarative proof that a change works in a real stack, not laptop-only pytest. That does not make wise-validator an agent harness: glas-harness may trigger runs, but wise-validator executes them deterministically as atm.

Intended users

Deterministic automations (atm) — activity-core, CI, scheduled health jobs
Human operators (adm) — manual e2e runs, probe debugging
LLM agents (agt) — trigger validation via glas-harness; wise-validator runs the contract, not open-ended agent reasoning
Domain repos — declare e2e/e2e.yml; do not fork validation infrastructure
Extension authors — probe, runner, reporter plugins

Design principles

Validation meta-framework, not monolith — one API; extensions for probes and reporters
sand-boxer for environments — never embed provisioners or host SSH lifecycle; sand-boxer remains self-sustained without this repo
Use-case contracts, ecosystem scope — validation targets declared cross-repo behaviors, not ad-hoc per-session agent checks
Detect dormant-path rot — runs matter even when a use case is not in daily use
Contract in repo, orchestration in platform — e2e/e2e.yml stays opt-in per repo
Health before tests — explicit polling; fail fast with actionable errors
Cleanup is policy — honor cleanup: always | on_success | never; default teardown via sand-boxer
Observable results — every run emits structured pass/fail to State Hub when reachable
Agent-first, automation-grade — deterministic, idempotent, no LLM in the validation path
Registry-first reuse — register validation capabilities in registry/
Backward compatible migration — e2e.yml v1 compatible with CUST-WP-0028 convention
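
The "cleanup is policy" principle reduces to a small decision function. A minimal sketch, assuming the three policy values named above are the complete set; `should_destroy` is a hypothetical helper name:

```python
def should_destroy(policy: str, passed: bool) -> bool:
    """Decide whether to release the sandbox after a validation run."""
    if policy == "always":
        return True
    if policy == "on_success":
        # Keep a failed environment around for debugging.
        return passed
    if policy == "never":
        return False
    raise ValueError(f"unknown cleanup policy: {policy!r}")
```

Raising on unknown values follows the fail-fast principle: a typo in a contract should surface as an error rather than silently leak or destroy an environment.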

sand-boxer consumption contract (preliminary)

A standard validation.compose-e2e run:

Load and validate e2e/e2e.yml from repo root
Call sand-boxer create with:
- profile: profile.compose-e2e
- inputs: { repo_ref, compose_file, env }
- consumer: { harness: wise-validator, actor: atm, run_id }
Wait for sand-boxer ready
Ensure the compose stack is up inside the sandbox. Whether this is an extension-owned step at provision or a validator-owned step via reachability depends on the sand-boxer extension contract; document the decision in the integration spec.
Poll health_checks until pass or timeout
Run test_command; capture output and exit code
Apply cleanup policy → sand-boxer destroy when required
Emit validation_result to State Hub (optional workstream_id)
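
The steps above can be sketched as one orchestration function with every collaborator injected. The sand-boxer API v0 is not defined in this charter, so `SandboxClient` and its `create`/`destroy` signatures are assumptions, as is the idea that `create` returns only once the sandbox is ready. The compose-up step is omitted because its owner is still an open question:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class SandboxClient(Protocol):
    """Hypothetical client surface; the real sand-boxer API may differ."""

    def create(self, profile: str, inputs: dict, consumer: dict) -> str: ...
    def destroy(self, sandbox_id: str) -> None: ...


@dataclass
class RunOutcome:
    sandbox_id: str
    passed: bool
    exit_code: Optional[int]
    destroyed: bool


def run_validation(
    contract: dict,
    repo_ref: str,
    client: SandboxClient,
    wait_healthy: Callable[[str], bool],
    run_tests: Callable[[str, str], int],
    emit: Callable[[dict], None],
    run_id: str,
) -> RunOutcome:
    # Request the environment; assumed to block until sand-boxer reports ready.
    sandbox_id = client.create(
        profile="profile.compose-e2e",
        inputs={
            "repo_ref": repo_ref,
            "compose_file": contract["compose_file"],
            "env": contract.get("env", {}),
        },
        consumer={"harness": "wise-validator", "actor": "atm", "run_id": run_id},
    )
    exit_code: Optional[int] = None
    passed = False
    try:
        # Health before tests: skip the test command if the stack never becomes healthy.
        if wait_healthy(sandbox_id):
            exit_code = run_tests(sandbox_id, contract["test_command"])
            passed = exit_code == 0
    finally:
        # Apply the cleanup policy even if a step above raised.
        policy = contract.get("cleanup", "always")
        destroyed = policy == "always" or (policy == "on_success" and passed)
        if destroyed:
            client.destroy(sandbox_id)
    emit({
        "event_type": "validation_result",
        "run_id": run_id,
        "sandbox_id": sandbox_id,
        "passed": passed,
        "exit_code": exit_code,
    })
    return RunOutcome(sandbox_id, passed, exit_code, destroyed)
```

Because the client is only a parameter, nothing here provisions hosts, which keeps the dependency pointing one way: wise-validator calls sand-boxer, never the reverse.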

Open questions (for first workplan):

Who runs docker compose up — sand-boxer extension at provision, or wise-validator as first orchestration step inside ready sandbox?
Reporter event type: keep e2e_result or migrate to validation_result?

Track both questions in docs/integrations/sand-boxer.md (in the wise-validator or sand-boxer repo).

Near-term outcomes (preliminary)

This charter — INTENT.md aligned with sand-boxer sibling boundaries
Extract schema.py — e2e.yml v1 as canonical contract
sand-boxer integration — consume profile.compose-e2e when sand-boxer SAND-WP-0002 delivers API v0
validate run CLI — health-wait + test + report without embedded provision
Reporter — State Hub progress (port reporter.py behavior)
Registry entry — e.g. capability.validation.compose-e2e
the-custodian shim — make e2e delegates to wise-validator + sand-boxer
Runbook — operator docs successor to e2e-framework RUNBOOK

Implementation tracked in SAND-WP-0003 (wise-validator extraction; workplan in sand-boxer coordinates migration from e2e-framework). sand-boxer SAND-WP-0002 and SAND-WP-0008 are complete independently of that work.

Maturity target

A mature wise-validator is Coulomb's default proof layer for declared use cases:

Any repo with a validation contract (e2e/e2e.yml or successor) can run cross-host proof without a the-custodian checkout
activity-core schedules validation so infrequently used use cases do not rot undetected
CI and glas-harness share the same validation API and result schema
Health probes reuse the same extension model as full e2e runs
Environments are released via sand-boxer destroy when validation completes

sand-boxer establishes the box — on its own, without wise-validator. glas-harness runs the agent. snuggle-inventor writes the code. wise-validator proves declared use cases still work across the ecosystem.
