2026-06-22 21:42:09 +02:00


domain	repo	updated
infotech	wise-validator	2026-06-22

INTENT

wise-validator is the Coulomb meta-framework for validation — e2e test orchestration, health checks, and structured pass/fail reporting — while consuming sand-boxer for isolated execution environments. This file is preliminary; refine as the validation boundary is implemented.

Why it exists

Custodian repos need a consistent way to prove that real stacks work: compose services up, health endpoints ready, tests executed, results reported — without every domain reinventing remote SSH scripts, port polling, and State Hub events.

Today that story is bundled awkwardly in the-custodian/e2e-framework/: provision and teardown live beside test orchestration and reporting in one package. sand-boxer is extracting the establishment half. wise-validator owns the validation half.

The industry validates agent and service work through overlapping but incompatible paths: repo-local pytest, CI compose jobs, Blitzy-style environment validation, hosted sandbox test runners, and ad hoc health probes. Coulomb needs one validation vocabulary that any repo, agent harness, or automation can invoke — without owning sandboxes, agent gateways, or code generation.

wise-validator exists to be that layer: unified validation orchestration with extensions for probe types, test runners, and reporters — environments requested from sand-boxer, not provisioned here.

sand-boxer establishes the box. wise-validator proves what runs inside it.

The governing principle

wise-validator is the validation orchestration service — contracts, health semantics, test execution, pass/fail interpretation, and result emission. Nothing more.

It answers:

What is being validated? Repo e2e/e2e.yml (or successor contract), health probe sets, validation profile selection.
When is the environment ready? Health polling rules, timeouts, partial failure semantics.
How are tests run? Test command dispatch, output capture, exit code interpretation.
Did it pass? Aggregate pass/fail, duration, error surfaces.
Who needs to know? State Hub progress, CI artifacts, activity-core hooks.
Where does it run? By requesting sand-boxer — not by provisioning hosts.
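To make the "when is the environment ready?" question concrete, here is a minimal polling sketch. It assumes each health check carries its own timeout and that the caller wants per-check outcomes so partial failure can be reported. The names `HealthCheck`, `wait_until_healthy`, and the injected `probe` callable are hypothetical, not part of any existing wise-validator API.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HealthCheck:
    name: str
    url: str
    timeout_s: float  # assumed: per-check budget in seconds


def wait_until_healthy(
    checks: list[HealthCheck],
    probe: Callable[[str], bool],
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, bool]:
    """Poll each check until it passes or its own timeout expires."""
    started = clock()
    outcome = {check.name: False for check in checks}
    pending = list(checks)
    while pending:
        still_pending = []
        for check in pending:
            if probe(check.url):
                outcome[check.name] = True
            elif clock() - started < check.timeout_s:
                # Not healthy yet, but budget remains: retry next round.
                still_pending.append(check)
        pending = still_pending
        if pending:
            sleep(interval_s)
    return outcome
```

Returning a per-check mapping instead of a single boolean leaves the partial failure decision to the caller. The clock and sleep functions are injected only so the loop can be exercised without real waiting.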

It must not become the sandbox provisioner, the agent harness, the code generator, the scheduler, work-state authority, tunnel/CA owner, or production deployer on Railiance01.

Coulomb sibling boundaries

sand-boxer — sandbox establishment

sand-boxer owns: Profiles, extensions, provision/teardown, placement, lifecycle registration.

wise-validator owns: Requesting profile.compose-e2e (or successors), running validation inside the returned environment, releasing the sandbox when the validation workflow completes (via sand-boxer destroy).

wise-validator                         sand-boxer
──────────────                         ──────────
resolve e2e.yml + validation profile → POST /v1/sandboxes
health-wait + test_command in env    ← sandbox_id + reachability
POST result to State Hub / CI        → destroy (per cleanup policy)

sand-boxer smoke tests may prove an environment exists; wise-validator owns whether that environment passes validation.

glas-harness — agent harness

glas-harness owns: Agent sessions, tools, memory, channels, sandbox policy for agent tool execution.

wise-validator does not run agent gateways. An agent may trigger a validation run (e.g. after a coding session), but wise-validator executes the validation workflow as a deterministic automation (actor atm).

snuggle-inventor — code generation

snuggle-inventor owns: Code generation, tech specs, PR output, human review.

wise-validator does not judge generated code quality beyond configured test and health contracts. snuggle-inventor (or CI) may invoke wise-validator after generation; wise-validator runs the repo's declared tests, not semantic code review.

Boundary diagram

  activity-core / CI / glas-harness (trigger)
                    │
                    ▼
             wise-validator
             (validate)
                    │
        request ────┼──── report
                    ▼
              sand-boxer          state-hub / CI
           (establish env)        (results)

Existing Custodian repos

Concern	Owner
Workstream, task, progress state	`state-hub`
Cron and orchestration	`activity-core`
SSH reverse tunnels	`ops-bridge`
SSH certificate issuance	`ops-warden`
Canon and agent instruction canon	`the-custodian`
Capability federation hub	`reuse-surface`
Agent runtime	glas-harness
Production on Railiance01	`railiance-apps` / domain repos

wise-validator consumes sand-boxer and emits to State Hub; it does not subsume those authorities.

What it is

wise-validator is a meta-framework with four pillars (preliminary):

1. Unified validation API

One surface for validation runs:

# Conceptual CLI (v0)
validate run --repo /path/to/repo
validate run --repo /path/to/repo --workstream-id <uuid>
validate health --probe-set <id> --target <url>
validate report <run-id>

HTTP (parallel to CLI): POST /v1/validations, GET /v1/validations/{id}.
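As an illustration of how a consumer might call the HTTP surface, the sketch below builds (but does not send) a `POST /v1/validations` request with the standard library. The body fields `repo` and `workstream_id` mirror the CLI flags above; the actual request schema is not defined in this document, so treat the payload shape and the helper name `build_validation_request` as assumptions.

```python
import json
import urllib.request
from typing import Optional


def build_validation_request(
    base_url: str, repo: str, workstream_id: Optional[str] = None
) -> urllib.request.Request:
    """Build a POST /v1/validations request; the body schema is assumed."""
    body = {"repo": repo}
    if workstream_id is not None:
        body["workstream_id"] = workstream_id
    return urllib.request.Request(
        url=f"{base_url}/v1/validations",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A caller would pass the returned object to `urllib.request.urlopen` and then read the run with `GET /v1/validations/{id}`.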

Consumers: activity-core instructions, CI hooks, glas-harness tool triggers, human operators (adm).

2. Validation contract catalog

Per-repo contract — successor to e2e/e2e.yml:

Field	Owner
`name`, `compose_file`, `test_command`	Repo declares; wise-validator interprets
`health_checks[]` (name, url, timeout)	wise-validator polling semantics
`timeout`, `cleanup`, `env`	wise-validator orchestration rules
Host / SSH / compose project naming	sand-boxer via profile inputs
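The contract fields in the table can be pictured as a small typed model. This is a sketch only: it takes an already-loaded mapping (for example, the result of parsing `e2e/e2e.yml` with a YAML library), and it assumes that `name`, `compose_file`, and `test_command` are required, that timeouts are in seconds, and that the defaults shown are reasonable. None of these choices are confirmed by the current `schema.py`.

```python
from dataclasses import dataclass, field

CLEANUP_POLICIES = {"always", "on_success", "never"}
REQUIRED_FIELDS = ("name", "compose_file", "test_command")  # assumed required


@dataclass(frozen=True)
class HealthCheckSpec:
    name: str
    url: str
    timeout: int  # assumed: seconds


@dataclass(frozen=True)
class Contract:
    name: str
    compose_file: str
    test_command: str
    health_checks: tuple[HealthCheckSpec, ...] = ()
    timeout: int = 600       # assumed default
    cleanup: str = "always"  # assumed default
    env: dict[str, str] = field(default_factory=dict)


def parse_contract(raw: dict) -> Contract:
    """Validate a loaded e2e.yml mapping and return a typed contract."""
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise ValueError(f"e2e.yml is missing required fields: {missing}")
    cleanup = raw.get("cleanup", "always")
    if cleanup not in CLEANUP_POLICIES:
        raise ValueError(f"unknown cleanup policy: {cleanup!r}")
    checks = tuple(HealthCheckSpec(**hc) for hc in raw.get("health_checks", []))
    return Contract(
        name=raw["name"],
        compose_file=raw["compose_file"],
        test_command=raw["test_command"],
        health_checks=checks,
        timeout=int(raw.get("timeout", 600)),
        cleanup=cleanup,
        env=dict(raw.get("env", {})),
    )
```

Host, SSH, and compose project naming are deliberately absent from the model, since the table assigns them to sand-boxer via profile inputs.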

Validation profiles (wise-validator catalog, distinct from sand-boxer sandbox profiles):

Validation profile	sand-boxer profile	Use
`validation.compose-e2e`	`profile.compose-e2e`	Cross-repo stack e2e
`validation.health-only`	`profile.health-probe` (future)	Liveness without full e2e
`validation.smoke`	`profile.agent-dev` (future)	Post-deploy smoke
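The mapping between the two catalogs could be held as plain data. The sketch below copies the table and rejects profiles whose sand-boxer counterpart is marked as future; the names `SANDBOX_PROFILE_FOR`, `AVAILABLE_TODAY`, and `resolve_sandbox_profile` are hypothetical.

```python
# Validation profile -> sand-boxer profile, copied from the table above.
SANDBOX_PROFILE_FOR = {
    "validation.compose-e2e": "profile.compose-e2e",
    "validation.health-only": "profile.health-probe",  # future
    "validation.smoke": "profile.agent-dev",           # future
}

# Assumption: only the non-future row is usable today.
AVAILABLE_TODAY = {"validation.compose-e2e"}


def resolve_sandbox_profile(validation_profile: str) -> str:
    """Return the sand-boxer profile to request for a validation profile."""
    if validation_profile not in SANDBOX_PROFILE_FOR:
        raise KeyError(f"unknown validation profile: {validation_profile}")
    if validation_profile not in AVAILABLE_TODAY:
        raise NotImplementedError(f"{validation_profile} is planned, not available")
    return SANDBOX_PROFILE_FOR[validation_profile]
```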

3. Extension platform

Extensions delegate validation mechanics:

Extension class	Examples
Health probe	HTTP GET, TCP connect, exec-in-sandbox curl
Test runner	pytest, shell, `uv run`, custom command wrapper
Reporter	State Hub progress, GitHub check, JSON artifact
Contract parser	`e2e.yml` v1, future schema versions

Extensions run inside or against a sand-boxer-established environment; they do not call docker compose on hosts directly (except via sandbox reachability).
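One way to picture the extension classes is as small structural interfaces with a registry keyed by kind. The sketch below shows a TCP connect probe from the table; the `HealthProbe` protocol, the `PROBES` registry, and `run_probe` are illustrative assumptions rather than a defined plugin API.

```python
import socket
from typing import Protocol


class HealthProbe(Protocol):
    def check(self, target: str, timeout_s: float) -> bool: ...


class TcpConnectProbe:
    """Passes when a TCP connection to 'host:port' can be opened."""

    def check(self, target: str, timeout_s: float) -> bool:
        host, _, port = target.rpartition(":")
        try:
            with socket.create_connection((host, int(port)), timeout=timeout_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: report unhealthy.
            return False


# Hypothetical registry: extension kind -> probe instance.
PROBES: dict[str, HealthProbe] = {"tcp": TcpConnectProbe()}


def run_probe(kind: str, target: str, timeout_s: float = 2.0) -> bool:
    """Dispatch a health check to the probe registered under `kind`."""
    try:
        probe = PROBES[kind]
    except KeyError:
        raise ValueError(f"no probe extension registered for {kind!r}") from None
    return probe.check(target, timeout_s)
```

Test runners, reporters, and contract parsers could follow the same shape: a narrow protocol plus a registry, so the orchestration core never imports a concrete mechanism.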

4. Result model and observability

Structured run results (successor to RunResult):

run_id, repo, sandbox_id, passed, exit_code, duration_s
Captured stdout/stderr (bounded)
Per health check outcomes
event_type: validation_result (or migrated e2e_result) to State Hub
Actor attribution: atm for automations, adm for manual, agt when agent-triggered
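The result fields listed above might be modeled as follows. This is a sketch under assumptions: the output bound of 4000 characters, keeping the tail rather than the head of the output, and the `{"event_type": ..., "payload": ...}` envelope are all illustrative, since the State Hub event schema is not specified here.

```python
from dataclasses import asdict, dataclass, field

MAX_OUTPUT_CHARS = 4000  # assumed bound for captured output


def _tail(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Keep only the last `limit` characters, where failures usually show."""
    return text if len(text) <= limit else text[-limit:]


@dataclass
class ValidationResult:
    run_id: str
    repo: str
    sandbox_id: str
    passed: bool
    exit_code: int
    duration_s: float
    stdout: str = ""
    stderr: str = ""
    health: dict[str, bool] = field(default_factory=dict)  # per-check outcomes
    actor: str = "atm"  # atm | adm | agt

    def to_event(self) -> dict:
        """Build a JSON-serializable event with bounded output."""
        payload = asdict(self)
        payload["stdout"] = _tail(self.stdout)
        payload["stderr"] = _tail(self.stderr)
        return {"event_type": "validation_result", "payload": payload}
```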

What it is not

Concern	Owner
Sandbox provision, compose up/down on host	sand-boxer
Agent gateway, tools, memory	glas-harness
Code generation, tech specs	snuggle-inventor
When validation is scheduled	`activity-core`
Task/workstream state	`state-hub` (wise-validator emits events only)
Tunnels	`ops-bridge`
Certs	`ops-warden`

Embedding sandbox.provision() in wise-validator would recreate the monolith that sand-boxer is splitting apart.

Lineage

wise-validator replaces the validation half of the-custodian/e2e-framework/:

Module today	Future owner
`schema.py` — `e2e.yml` parse/validate	wise-validator
`runner.py` — health-wait, test_command, pass/fail	wise-validator (against sand-boxer env)
`reporter.py` — State Hub progress	wise-validator
`sandbox.py` — SSH provision/teardown	sand-boxer `ext.compose-ssh`
`cli.py` — monolithic entry	Split: `sandbox` CLI (sand-boxer), `validate` CLI (wise-validator)
`make e2e REPO=…`	Shim → wise-validator (+ sand-boxer); deprecate direct framework

Reference runbook: the-custodian/e2e-framework/RUNBOOK.md
sand-boxer research: sand-boxer/research/ (sandbox patterns only)

Agent-first use case

The README positions wise-validator as agent-first validation: agents and automations need fast, declarative proof that a change works in a real stack, not just in laptop-only pytest. That does not make wise-validator an agent harness: glas-harness may trigger runs, but wise-validator executes them deterministically as atm.

Intended users

Deterministic automations (atm) — activity-core, CI, scheduled health jobs
Human operators (adm) — manual e2e runs, probe debugging
LLM agents (agt) — trigger validation via glas-harness; wise-validator runs the contract, not open-ended agent reasoning
Domain repos — declare e2e/e2e.yml; do not fork validation infrastructure
Extension authors — probe, runner, reporter plugins

Design principles

Validation meta-framework, not monolith — one API; extensions for probes and reporters
sand-boxer for environments — never embed provisioners or host SSH lifecycle
Contract in repo, orchestration in platform — e2e/e2e.yml stays opt-in per repo
Health before tests — explicit polling; fail fast with actionable errors
Cleanup is policy — honor cleanup: always | on_success | never; default teardown via sand-boxer
Observable results — every run emits structured pass/fail to State Hub when reachable
Agent-first, automation-grade — deterministic, idempotent, no LLM in the validation path
Registry-first reuse — register validation capabilities in registry/
Backward compatible migration — e2e.yml v1 compatible with CUST-WP-0028 convention
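The "cleanup is policy" principle reduces to a small decision function. The sketch assumes that `on_success` means "destroy only when the run passed", so that a failed environment stays available for debugging; that reading is an interpretation, and `should_destroy` is a hypothetical name.

```python
def should_destroy(cleanup: str, passed: bool) -> bool:
    """Decide whether to release the sandbox after a run."""
    if cleanup == "always":
        return True
    if cleanup == "on_success":
        # Assumed meaning: keep failed environments around for inspection.
        return passed
    if cleanup == "never":
        return False
    raise ValueError(f"unknown cleanup policy: {cleanup!r}")
```

Keeping the decision separate from the sand-boxer call makes the policy easy to test without a sandbox.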

sand-boxer consumption contract (preliminary)

A standard validation.compose-e2e run:

Load and validate e2e/e2e.yml from repo root
Call sand-boxer create with:
- profile: profile.compose-e2e
- inputs: { repo_ref, compose_file, env }
- consumer: { harness: wise-validator, actor: atm, run_id }
Wait for sand-boxer ready
Ensure the compose stack is up inside the sandbox. This may be an extension-owned step at provision or a validator-owned step via reachability; the choice depends on the sand-boxer extension contract and should be documented in the integration spec.
Poll health_checks until pass or timeout
Run test_command; capture output and exit code
Apply cleanup policy → sand-boxer destroy when required
Emit validation_result to State Hub (optional workstream_id)
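The steps above can be sketched as one orchestration function with every collaborator injected. The `SandboxClient` protocol is a stand-in for the sand-boxer API, whose real shape is not defined here, and `run_compose_e2e`, `wait_healthy`, `run_tests`, and `emit` are hypothetical names. The compose-up step is omitted because its ownership is still an open question, and `on_success` is assumed to mean "destroy only when the run passed".

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class SandboxClient(Protocol):
    def create(self, profile: str, inputs: dict, consumer: dict) -> str: ...
    def wait_ready(self, sandbox_id: str) -> None: ...
    def destroy(self, sandbox_id: str) -> None: ...


@dataclass
class Outcome:
    sandbox_id: str
    passed: bool
    exit_code: Optional[int]
    destroyed: bool


def run_compose_e2e(
    contract: dict,
    repo_ref: str,
    run_id: str,
    sandboxes: SandboxClient,
    wait_healthy: Callable[[str, list], bool],
    run_tests: Callable[[str, str], int],
    emit: Callable[[dict], None],
) -> Outcome:
    """Request a sandbox, validate inside it, clean up, then report."""
    sandbox_id = sandboxes.create(
        profile="profile.compose-e2e",
        inputs={
            "repo_ref": repo_ref,
            "compose_file": contract["compose_file"],
            "env": contract.get("env", {}),
        },
        consumer={"harness": "wise-validator", "actor": "atm", "run_id": run_id},
    )
    passed, exit_code, destroyed = False, None, False
    try:
        sandboxes.wait_ready(sandbox_id)
        # Health before tests: skip the test command if health never passes.
        if wait_healthy(sandbox_id, contract.get("health_checks", [])):
            exit_code = run_tests(sandbox_id, contract["test_command"])
            passed = exit_code == 0
    finally:
        # Apply the cleanup policy even if a step above raised.
        cleanup = contract.get("cleanup", "always")
        if cleanup == "always" or (cleanup == "on_success" and passed):
            sandboxes.destroy(sandbox_id)
            destroyed = True
    emit({
        "event_type": "validation_result",
        "run_id": run_id,
        "sandbox_id": sandbox_id,
        "passed": passed,
        "exit_code": exit_code,
    })
    return Outcome(sandbox_id, passed, exit_code, destroyed)
```

Putting cleanup in a `finally` block reflects the goal that wise-validator always releases environments when policy requires it, including when health polling or the test command raises.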

Open questions (for first workplan):

Who runs docker compose up — sand-boxer extension at provision, or wise-validator as first orchestration step inside ready sandbox?
Reporter event type: keep e2e_result or migrate to validation_result?

Track in docs/integrations/sand-boxer.md (wise-validator or sand-boxer repo).

Near-term outcomes (preliminary)

This charter — INTENT.md aligned with sand-boxer sibling boundaries
Extract schema.py — e2e.yml v1 as canonical contract
sand-boxer integration — consume profile.compose-e2e when sand-boxer SAND-WP-0002 delivers API v0
validate run CLI — health-wait + test + report without embedded provision
Reporter — State Hub progress (port reporter.py behavior)
Registry entry — e.g. capability.validation.compose-e2e
the-custodian shim — make e2e delegates to wise-validator + sand-boxer
Runbook — operator docs successor to e2e-framework RUNBOOK

Planned sand-boxer follow-on: SAND-WP-0003 (wise-validator extraction).

Maturity target

A mature wise-validator is Coulomb's default proof layer:

Any repo with e2e/ can run cross-host validation without the-custodian checkout
activity-core fires validation after deployments or agent work
CI and glas-harness share the same validation API and result schema
Health probes reuse the same extension model as full e2e runs
sand-boxer teardown is reliable because wise-validator always releases environments

sand-boxer establishes the box. glas-harness runs the agent. snuggle-inventor writes the code. wise-validator proves it works.
