| domain | repo | updated |
|---|---|---|
| infotech | wise-validator | 2026-06-22 |
INTENT
wise-validator is the Coulomb meta-framework for validation — e2e test orchestration, health checks, and structured pass/fail reporting — while consuming sand-boxer for isolated execution environments. This file is preliminary; refine as the validation boundary is implemented.
Why it exists
Custodian repos need a consistent way to prove that real stacks work: compose services up, health endpoints ready, tests executed, results reported — without every domain reinventing remote SSH scripts, port polling, and State Hub events.
Today that story is split awkwardly in the-custodian/e2e-framework/: provision
and teardown live beside test orchestration and reporting in one package. sand-boxer
is extracting the establishment half. wise-validator owns the validation
half.
The industry validates agent and service work through overlapping but incompatible paths: repo-local pytest, CI compose jobs, Blitzy-style environment validation, hosted sandbox test runners, and ad hoc health probes. Coulomb needs one validation vocabulary that any repo, agent harness, or automation can invoke — without owning sandboxes, agent gateways, or code generation.
wise-validator exists to be that layer: unified validation orchestration with extensions for probe types, test runners, and reporters — environments requested from sand-boxer, not provisioned here.
sand-boxer establishes the box. wise-validator proves what runs inside it.
The governing principle
wise-validator is the validation orchestration service — contracts, health semantics, test execution, pass/fail interpretation, and result emission. Nothing more.
It answers:
- What is being validated? Repo `e2e/e2e.yml` (or successor contract), health probe sets, validation profile selection.
- When is the environment ready? Health polling rules, timeouts, partial failure semantics.
- How are tests run? Test command dispatch, output capture, exit code interpretation.
- Did it pass? Aggregate pass/fail, duration, error surfaces.
- Who needs to know? State Hub progress, CI artifacts, activity-core hooks.
- Where does it run? By requesting sand-boxer — not by provisioning hosts.
It must not become the sandbox provisioner, the agent harness, the code generator, the scheduler, work-state authority, tunnel/CA owner, or production deployer on Railiance01.
Coulomb sibling boundaries
sand-boxer — sandbox establishment
sand-boxer owns: Profiles, extensions, provision/teardown, placement, lifecycle registration.
wise-validator owns: Requesting profile.compose-e2e (or successors),
running validation inside the returned environment, releasing the sandbox
when the validation workflow completes (via sand-boxer destroy).
wise-validator                               sand-boxer
──────────────                               ──────────
resolve e2e.yml + validation profile    →    POST /v1/sandboxes
health-wait + test_command in env       ←    sandbox_id + reachability
POST result to State Hub / CI           →    destroy (per cleanup policy)
sand-boxer smoke tests may prove an environment exists; wise-validator owns whether that environment passes validation.
glas-harness — agent harness
glas-harness owns: Agent sessions, tools, memory, channels, sandbox policy for agent tool execution.
wise-validator does not run agent gateways. An agent may trigger a
validation run (e.g. after a coding session); wise-validator executes the
validation workflow as an `atm` automation.
snuggle-inventor — code generation
snuggle-inventor owns: Code generation, tech specs, PR output, human review.
wise-validator does not judge generated code quality beyond configured test and health contracts. snuggle-inventor (or CI) may invoke wise-validator after generation; wise-validator runs the repo's declared tests, not semantic code review.
Boundary diagram
   activity-core / CI / glas-harness (trigger)
                      │
                      ▼
               wise-validator
                 (validate)
                      │
         request ─────┼───── report
            ▼                  ▼
        sand-boxer       state-hub / CI
     (establish env)        (results)
Existing Custodian repos
| Concern | Owner |
|---|---|
| Workstream, task, progress state | state-hub |
| Cron and orchestration | activity-core |
| SSH reverse tunnels | ops-bridge |
| SSH certificate issuance | ops-warden |
| Canon and agent instruction canon | the-custodian |
| Capability federation hub | reuse-surface |
| Agent runtime | glas-harness |
| Production on Railiance01 | railiance-apps / domain repos |
wise-validator consumes sand-boxer and emits to State Hub; it does not subsume those authorities.
What it is
wise-validator is a meta-framework with four pillars (preliminary):
1. Unified validation API
One surface for validation runs:
# Conceptual CLI (v0)
validate run --repo /path/to/repo
validate run --repo /path/to/repo --workstream-id <uuid>
validate health --probe-set <id> --target <url>
validate report <run-id>
HTTP (parallel to CLI): POST /v1/validations, GET /v1/validations/{id}.
Consumers: activity-core instructions, CI hooks, glas-harness tool triggers,
human operators (adm).
2. Validation contract catalog
Per-repo contract — successor to e2e/e2e.yml:
| Field | Owner |
|---|---|
| `name`, `compose_file`, `test_command` | Repo declares; wise-validator interprets |
| `health_checks[]` (name, url, timeout) | wise-validator polling semantics |
| `timeout`, `cleanup`, `env` | wise-validator orchestration rules |
| Host / SSH / compose project naming | sand-boxer via profile inputs |
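As a rough illustration of how the repo-declared fields might be interpreted, the sketch below validates an already-loaded `e2e.yml` mapping. The class names, the defaults (`timeout=600`, `cleanup="always"`, a per-check timeout of 60), and the choice of required fields are assumptions made for this example; only the field names and the `cleanup` values come from this document.

```python
from dataclasses import dataclass, field
from typing import Any

# Allowed values taken from the "Cleanup is policy" design principle.
CLEANUP_POLICIES = {"always", "on_success", "never"}


@dataclass(frozen=True)
class HealthCheck:
    name: str
    url: str
    timeout: int  # assumed to be seconds


@dataclass(frozen=True)
class Contract:
    name: str
    compose_file: str
    test_command: str
    health_checks: tuple[HealthCheck, ...] = ()
    timeout: int = 600        # hypothetical default
    cleanup: str = "always"   # hypothetical default
    env: dict[str, str] = field(default_factory=dict)


def parse_contract(raw: dict[str, Any]) -> Contract:
    """Validate a mapping loaded from e2e.yml and return a typed Contract."""
    required = ("name", "compose_file", "test_command")
    missing = [key for key in required if not raw.get(key)]
    if missing:
        raise ValueError(f"e2e.yml missing required fields: {', '.join(missing)}")

    cleanup = raw.get("cleanup", "always")
    if cleanup not in CLEANUP_POLICIES:
        raise ValueError(f"cleanup must be one of {sorted(CLEANUP_POLICIES)}, got {cleanup!r}")

    checks = tuple(
        HealthCheck(name=c["name"], url=c["url"], timeout=int(c.get("timeout", 60)))
        for c in raw.get("health_checks", [])
    )
    return Contract(
        name=raw["name"],
        compose_file=raw["compose_file"],
        test_command=raw["test_command"],
        health_checks=checks,
        timeout=int(raw.get("timeout", 600)),
        cleanup=cleanup,
        env=dict(raw.get("env", {})),
    )
```

Loading the YAML file itself is left out so the sketch has no third-party dependency; any YAML loader that yields a plain mapping would feed `parse_contract`.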
Validation profiles (wise-validator catalog, distinct from sand-boxer sandbox profiles):
| Validation profile | sand-boxer profile | Use |
|---|---|---|
| `validation.compose-e2e` | `profile.compose-e2e` | Cross-repo stack e2e |
| `validation.health-only` | `profile.health-probe` (future) | Liveness without full e2e |
| `validation.smoke` | `profile.agent-dev` (future) | Post-deploy smoke |
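The profile mapping could be held as a small lookup, sketched below. The function name, the `allow_future` flag, and the decision to reject profiles marked "(future)" by default are assumptions for illustration; the profile identifiers are the ones listed in the table.

```python
# Validation profile -> (sand-boxer profile, marked "future" in the catalog).
PROFILE_MAP = {
    "validation.compose-e2e": ("profile.compose-e2e", False),
    "validation.health-only": ("profile.health-probe", True),
    "validation.smoke": ("profile.agent-dev", True),
}


def resolve_sandbox_profile(validation_profile: str, allow_future: bool = False) -> str:
    """Return the sand-boxer profile to request for a validation profile."""
    try:
        sandbox_profile, is_future = PROFILE_MAP[validation_profile]
    except KeyError:
        raise ValueError(f"unknown validation profile: {validation_profile!r}") from None
    if is_future and not allow_future:
        # Hypothetical guard: refuse profiles the catalog marks as not yet available.
        raise ValueError(f"{sandbox_profile} is marked future and is not available yet")
    return sandbox_profile
```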
3. Extension platform
Extensions delegate validation mechanics:
| Extension class | Examples |
|---|---|
| Health probe | HTTP GET, TCP connect, exec-in-sandbox curl |
| Test runner | pytest, shell, uv run, custom command wrapper |
| Reporter | State Hub progress, GitHub check, JSON artifact |
| Contract parser | e2e.yml v1, future schema versions |
Extensions run inside or against a sand-boxer-established environment; they
do not call docker compose on hosts directly (except via sandbox reachability).
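One way an extension class might be expressed is a small interface plus a registry, sketched here for the "TCP connect" health probe. The `HealthProbe` protocol, the `PROBES` registry, and `run_probe` are hypothetical names, not an existing wise-validator API; the probe uses only the standard library.

```python
import socket
from typing import Protocol
from urllib.parse import urlsplit


class HealthProbe(Protocol):
    """Hypothetical extension interface: one method, boolean outcome."""

    def check(self, target: str, timeout: float) -> bool: ...


class TcpConnectProbe:
    """Passes when a TCP connection to host:port succeeds within the timeout."""

    def check(self, target: str, timeout: float) -> bool:
        # Accept both "host:port" and "tcp://host:port".
        parts = urlsplit(target if "//" in target else f"//{target}")
        if parts.hostname is None or parts.port is None:
            raise ValueError(f"target needs host and port: {target!r}")
        try:
            with socket.create_connection((parts.hostname, parts.port), timeout=timeout):
                return True
        except OSError:
            return False


# Registry keyed by probe kind; other probe types would register here.
PROBES: dict[str, HealthProbe] = {"tcp": TcpConnectProbe()}


def run_probe(kind: str, target: str, timeout: float = 2.0) -> bool:
    """Dispatch a health check to the registered probe extension."""
    try:
        probe = PROBES[kind]
    except KeyError:
        raise ValueError(f"no probe extension registered for {kind!r}") from None
    return probe.check(target, timeout)
```

The probe only needs network reachability to the target, which matches the rule that extensions work against a sand-boxer-established environment rather than managing hosts.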
4. Result model and observability
Structured run results (successor to RunResult):
- `run_id`, `repo`, `sandbox_id`, `passed`, `exit_code`, `duration_s`
- Captured stdout/stderr (bounded)
- Per health check outcomes
- `event_type: validation_result` (or migrated `e2e_result`) to State Hub
- Actor attribution: `atm` for automations, `adm` for manual, `agt` when agent-triggered
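A possible shape for the structured result is sketched below. The field names follow the list above; the 4000-character bound, the keep-the-tail truncation strategy, and the event envelope (`event_type` plus `payload`) are assumptions, since the State Hub event schema is not defined in this file.

```python
import json
from dataclasses import asdict, dataclass, field

MAX_OUTPUT_CHARS = 4000  # hypothetical bound on captured output


def bound(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Keep the tail of the output, where test failures usually appear."""
    if len(text) <= limit:
        return text
    return "...[truncated]...\n" + text[-limit:]


@dataclass
class ValidationResult:
    run_id: str
    repo: str
    sandbox_id: str
    passed: bool
    exit_code: int
    duration_s: float
    stdout: str = ""
    stderr: str = ""
    health: dict[str, bool] = field(default_factory=dict)  # per-check outcomes
    actor: str = "atm"  # atm | adm | agt

    def to_event(self) -> dict:
        """Build a JSON-serializable event with bounded output (assumed envelope)."""
        payload = asdict(self)
        payload["stdout"] = bound(self.stdout)
        payload["stderr"] = bound(self.stderr)
        return {"event_type": "validation_result", "payload": payload}


if __name__ == "__main__":
    result = ValidationResult("run-1", "demo", "sb-1", True, 0, 12.3, health={"api": True})
    print(json.dumps(result.to_event(), indent=2))
```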
What it is not
| Concern | Owner |
|---|---|
| Sandbox provision, compose up/down on host | sand-boxer |
| Agent gateway, tools, memory | glas-harness |
| Code generation, tech specs | snuggle-inventor |
| When validation is scheduled | activity-core |
| Task/workstream state | state-hub (wise-validator emits events only) |
| Tunnels | ops-bridge |
| Certs | ops-warden |
Embedding sandbox.provision() in wise-validator recreates the monolith
sand-boxer is splitting apart.
Lineage
wise-validator replaces the validation half of the-custodian/e2e-framework/:
| Module today | Future owner |
|---|---|
| `schema.py`: `e2e.yml` parse/validate | wise-validator |
| `runner.py`: health-wait, test_command, pass/fail | wise-validator (against sand-boxer env) |
| `reporter.py`: State Hub progress | wise-validator |
| `sandbox.py`: SSH provision/teardown | sand-boxer `ext.compose-ssh` |
| `cli.py`: monolithic entry | Split: `sandbox` CLI (sand-boxer), `validate` CLI (wise-validator) |
| `make e2e REPO=…` | Shim → wise-validator (+ sand-boxer); deprecate direct framework |
Reference runbook: the-custodian/e2e-framework/RUNBOOK.md
sand-boxer research: sand-boxer/research/ (sandbox patterns only)
Agent-first use case
The README positions wise-validator as agent-first validation: agents and
automations need fast, declarative proof that a change works in a real stack —
not laptop-only pytest. That does not make wise-validator an agent harness;
glas-harness may trigger runs, wise-validator executes them deterministically as
atm.
Intended users
- Deterministic automations (`atm`): activity-core, CI, scheduled health jobs
- Human operators (`adm`): manual e2e runs, probe debugging
- LLM agents (`agt`): trigger validation via glas-harness; wise-validator runs the contract, not open-ended agent reasoning
- Domain repos: declare `e2e/e2e.yml`; do not fork validation infrastructure
- Extension authors: probe, runner, reporter plugins
Design principles
- Validation meta-framework, not monolith: one API; extensions for probes and reporters
- sand-boxer for environments: never embed provisioners or host SSH lifecycle
- Contract in repo, orchestration in platform: `e2e/e2e.yml` stays opt-in per repo
- Health before tests: explicit polling; fail fast with actionable errors
- Cleanup is policy: honor `cleanup: always | on_success | never`; default teardown via sand-boxer
- Observable results: every run emits structured pass/fail to State Hub when reachable
- Agent-first, automation-grade: deterministic, idempotent, no LLM in the validation path
- Registry-first reuse: register validation capabilities in `registry/`
- Backward compatible migration: `e2e.yml` v1 compatible with CUST-WP-0028 convention
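The "Cleanup is policy" principle reduces to a small decision function. The sketch below assumes that `on_success` means "destroy only when validation passed", which leaves a failed environment in place for debugging; that reading is an interpretation, not something this charter states. The function name is hypothetical.

```python
def should_destroy(cleanup: str, passed: bool) -> bool:
    """Decide whether to ask sand-boxer to destroy the sandbox after a run."""
    if cleanup == "always":
        return True
    if cleanup == "on_success":
        # Assumed semantics: keep a failed environment around for inspection.
        return passed
    if cleanup == "never":
        return False
    raise ValueError(f"unknown cleanup policy: {cleanup!r}")
```

Keeping the decision in one pure function makes the policy easy to test independently of any sandbox.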
sand-boxer consumption contract (preliminary)
A standard validation.compose-e2e run:
- Load and validate `e2e/e2e.yml` from repo root
- Call sand-boxer `create` with:
  - `profile: profile.compose-e2e`
  - `inputs: { repo_ref, compose_file, env }`
  - `consumer: { harness: wise-validator, actor: atm, run_id }`
- Wait for sand-boxer `ready`
- Ensure compose stack is up inside sandbox (coordination with sand-boxer extension contract: may be an extension-owned step or validator-owned via reachability; document in integration spec)
- Poll `health_checks` until pass or timeout
- Run `test_command`; capture output and exit code
- Apply `cleanup` policy → sand-boxer `destroy` when required
- Emit `validation_result` to State Hub (optional `workstream_id`)
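The run sequence above can be sketched as a single orchestration function with every collaborator injected. Everything here is illustrative: the `SandboxClient` protocol and its method names, the `probe`, `run_tests`, and `emit` callables, and the result dictionary are assumptions, not the sand-boxer or State Hub API. The compose-up step is omitted because its ownership is still an open question.

```python
import time
from typing import Callable, Protocol


class SandboxClient(Protocol):
    """Hypothetical view of the sand-boxer API used by a validation run."""

    def create(self, profile: str, inputs: dict, consumer: dict) -> str: ...
    def wait_ready(self, sandbox_id: str) -> None: ...
    def destroy(self, sandbox_id: str) -> None: ...


def wait_healthy(checks: list[dict], probe: Callable[[str], bool],
                 interval: float = 1.0) -> dict[str, bool]:
    """Poll each health check until it passes or its own timeout elapses."""
    outcomes: dict[str, bool] = {}
    for check in checks:
        deadline = time.monotonic() + check["timeout"]
        ok = probe(check["url"])
        while not ok and time.monotonic() < deadline:
            time.sleep(interval)
            ok = probe(check["url"])
        outcomes[check["name"]] = ok
    return outcomes


def run_validation(contract: dict, repo_ref: str, run_id: str,
                   sandboxes: SandboxClient,
                   probe: Callable[[str], bool],
                   run_tests: Callable[[str, str], int],
                   emit: Callable[[dict], None]) -> dict:
    sandbox_id = sandboxes.create(
        profile="profile.compose-e2e",
        inputs={"repo_ref": repo_ref,
                "compose_file": contract["compose_file"],
                "env": contract.get("env", {})},
        consumer={"harness": "wise-validator", "actor": "atm", "run_id": run_id},
    )
    started = time.monotonic()
    passed, exit_code, health = False, -1, {}
    try:
        sandboxes.wait_ready(sandbox_id)
        health = wait_healthy(contract.get("health_checks", []), probe)
        if all(health.values()):  # health before tests: skip tests if any check failed
            exit_code = run_tests(sandbox_id, contract["test_command"])
            passed = exit_code == 0
    finally:
        cleanup = contract.get("cleanup", "always")
        if cleanup == "always" or (cleanup == "on_success" and passed):
            sandboxes.destroy(sandbox_id)
    result = {"run_id": run_id, "sandbox_id": sandbox_id, "passed": passed,
              "exit_code": exit_code, "health": health,
              "duration_s": round(time.monotonic() - started, 3)}
    emit(result)
    return result
```

Putting the teardown in `finally` reflects the maturity goal that wise-validator always releases environments, even when readiness or test execution raises.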
Open questions (for first workplan):
- Who runs `docker compose up`: sand-boxer extension at provision, or wise-validator as first orchestration step inside ready sandbox?
- Reporter event type: keep `e2e_result` or migrate to `validation_result`?
Track in docs/integrations/sand-boxer.md (wise-validator or sand-boxer repo).
Near-term outcomes (preliminary)
- This charter: `INTENT.md` aligned with sand-boxer sibling boundaries
- Extract `schema.py`: `e2e.yml` v1 as canonical contract
- sand-boxer integration: consume `profile.compose-e2e` when sand-boxer SAND-WP-0002 delivers API v0
- `validate run` CLI: health-wait + test + report without embedded provision
- Reporter: State Hub progress (port `reporter.py` behavior)
- Registry entry: e.g. `capability.validation.compose-e2e`
- `the-custodian` shim: `make e2e` delegates to wise-validator + sand-boxer
- Runbook: operator docs successor to e2e-framework RUNBOOK
Planned sand-boxer follow-on: SAND-WP-0003 (wise-validator extraction).
Maturity target
A mature wise-validator is Coulomb's default proof layer:
- Any repo with `e2e/` can run cross-host validation without `the-custodian` checkout
- activity-core fires validation after deployments or agent work
- CI and glas-harness share the same validation API and result schema
- Health probes reuse the same extension model as full e2e runs
- sand-boxer teardown is reliable because wise-validator always releases environments
sand-boxer establishes the box. glas-harness runs the agent. snuggle-inventor writes the code. wise-validator proves it works.