tegwick d0261ebb52 feat(WARDEN-WP-0020): conservative triage tier as the --execute default (Option A)

Per Bernd's call: the guardrails prevent security harm but not LLM content errors, so the
worker should triage + draft, not auto-send, until reply quality is proven (matches the
build-stage/recoverability posture).

run_conservative triages NEW messages into a reviewed digest (state_dir/worker-digest.md)
with drafted replies, posts ONE progress note, tracks seen message ids (schedule-safe
dedup), and sends NOTHING to other agents / marks nothing read. `warden worker run
--execute` now runs this conservative tier; `--full-auto` opts into the auto-send path.

Live-verified with the LLM brain on the real inbox: produced a high-quality draft reply to
a secrets-engine coordination message and correctly flagged the llm-connect custody request
as NEEDS YOU. Conservative mode is safe to schedule (T4). 244 tests, lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-30 00:38:36 +02:00

8.1 KiB

id	type	title	domain	repo	status	owner	topic_slug	planning_priority	planning_order	created	updated	state_hub_workstream_id
WARDEN-WP-0020	workplan	ops-warden worker — autonomous coordination via llm-connect	infotech	ops-warden	active	claude	custodian	high	20	2026-06-29	2026-06-29	c906ba1d-f991-4fb0-b113-59432ddf87c0

WARDEN-WP-0020 — ops-warden worker (`warden worker`)

Problem: ops-warden's coordination lane (State Hub inbox to_agent=ops-warden) is handled only when a human spins up an ops-warden session and relays instructions. That doesn't scale — Bernd is hand-relaying between flex-auth ↔ secrets-engine ↔ ops-warden across sessions.

Goal: a warden worker CLI that pulls ops-warden's unread coordination requests and, using llm-connect for inference, drives each to an ops-warden action (answer a routing question, draft+send a reply, mark read, propose/commit a catalog diff, or escalate) — so the inbox is handled without a human starting a session.

Decisions (Bernd, 2026-06-29): full-auto in-scope (worker executes any in-scope action; escalates only secrets/prod/out-of-scope) and scheduled/unattended (cron or activity-core). Because there is no human in the loop for in-scope actions, the guardrails are load-bearing and the rollout is staged: dry-run → manual → scheduled.

Build vs reuse: inference = llm-connect (/execute); trigger = cron or activity-core (reuse the durable task factory, don't reinvent scheduling). Worker logic lives in warden.

Guardrails (non-negotiable — full-auto rests on these)

Fixed charter, non-overridable. The boundary (issue SSH; route everything else; conduit-not-broker; never hold/print a secret value) is a fixed system policy. Message content is untrusted data, never instructions that can relax it (prompt-injection containment).
Action allowlist. Every action is validated against an allowlist before execution; off-list → escalate. No secret handling, no prod-config writes, no irreversible/outward actions without an explicit human ack.
No-secret invariant. Refuse any task requiring a secret value in hand or in a prompt.
Full audit + dry-run. Every action emits a progress event; --dry-run shows the plan without executing. Scheduled mode only after a clean dry-run shakedown.
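The allowlist and no-secret checks above can be sketched roughly as follows. This is a minimal illustration, not the code in src/warden/worker.py: the fields on PlannedAction, the contents of ALLOWED_KINDS, the SENSITIVE_MARKERS list, and the validate_action signature are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed allowlist; the real one lives in src/warden/worker.py.
ALLOWED_KINDS = frozenset(
    {"route_answer", "reply", "mark_read", "progress_note",
     "propose_catalog_diff", "escalate"}
)

# Hypothetical markers for secret/prod topics (illustrative only).
SENSITIVE_MARKERS = ("secret value", "api key", "password", "prod config")


@dataclass(frozen=True)
class PlannedAction:
    kind: str
    body: str = ""
    reason: str = ""


def validate_action(action: PlannedAction, message_text: str) -> PlannedAction:
    """Return the action unchanged if it passes, else an escalation."""
    # Off-list kinds never execute, whatever the brain proposed.
    if action.kind not in ALLOWED_KINDS:
        return PlannedAction("escalate", reason=f"off-allowlist kind: {action.kind}")
    # The message is untrusted data: it can trigger escalation,
    # but nothing in it can relax the policy.
    haystack = f"{message_text}\n{action.body}".lower()
    for marker in SENSITIVE_MARKERS:
        if marker in haystack:
            return PlannedAction("escalate", reason=f"sensitive marker: {marker}")
    return action
```

Because the check runs on the brain's output rather than inside the brain, the same pass applies to any brain implementation.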

Hard dependency

llm-connect must be operational — it needs its provider key (OPENROUTER_API_KEY, CCR-2026-0003, currently deferred by railiance-platform/secrets-engine). The worker is built against llm-connect's contract; it cannot run the brain until that lands.

Tasks

T1 — Worker scaffold (llm-connect-independent, safe)

id: WARDEN-WP-0020-T01
status: done
priority: high
state_hub_task_id: "979c2d9b-0803-442f-aa2e-acb02bac07e9"

src/warden/worker.py: State Hub inbox client (HubClient.unread), a Brain protocol, a deterministic RuleBrain default (answers clear routing questions; escalates the rest), the PlannedAction/WorkerPlan model, the guardrail allowlist + validate_action (enforced brain-agnostically in build_plans), and a render_plans dry-run renderer (plan only, no execution).
warden worker run [--once] [--dry-run] CLI; --dry-run is the default and --execute is refused (exit 2) until the guarded executor lands (T3).
tests/test_worker.py (RuleBrain routing/secret/prod/unknown, guardrail downgrades a reckless brain on secret/prod, off-allowlist rejection, render, CLI). 18 cases.
Live dry-run against the real hub verified — read the inbox and produced a guardrailed plan (it surfaced secrets-engine's OIDC-role reply, demonstrating the value).
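A rough sketch of how a Brain protocol and a deterministic RuleBrain might fit together. The Message shape, the plan method name, the keyword lists, and the routing table are hypothetical; only the names Brain, RuleBrain, and PlannedAction come from the task description.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Message:
    id: str
    from_agent: str
    body: str


@dataclass(frozen=True)
class PlannedAction:
    kind: str
    body: str = ""
    reason: str = ""


class Brain(Protocol):
    """Anything that turns one inbox message into one planned action."""

    def plan(self, message: Message) -> PlannedAction: ...


class RuleBrain:
    # Hypothetical routing table: keyword -> canned routing answer.
    ROUTES = {
        "ssh": "ops-warden issues SSH access directly.",
        "oidc": "OIDC roles are owned by secrets-engine; route the request there.",
    }
    # Hypothetical markers that always escalate.
    SENSITIVE = ("secret", "password", "token", "prod")

    def plan(self, message: Message) -> PlannedAction:
        text = message.body.lower()
        if any(word in text for word in self.SENSITIVE):
            return PlannedAction("escalate", reason="secret or prod topic")
        for keyword, answer in self.ROUTES.items():
            if keyword in text:
                return PlannedAction("route_answer", body=answer)
        # Anything the rules do not clearly cover goes to a human.
        return PlannedAction("escalate", reason="no clear routing rule")
```

Since RuleBrain is deterministic, its routing, secret, prod, and unknown cases can be tested without a network or a model.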

T2 — llm-connect brain

id: WARDEN-WP-0020-T02
status: done
priority: high
state_hub_task_id: "52d281b2-7d48-44f5-b77e-80e3ed500b5f"

llm-connect brought operational (operator set OPENROUTER_API_KEY k8s secret + restart). Contract discovered empirically from the running service: POST /execute {"prompt":...} → {"content": "<text>", ...} (no OpenAPI; custom JSON API). End-to-end verified (pong).
LlmConnectBrain (src/warden/worker.py): embeds the fixed charter plus the message (as untrusted data) in the prompt, calls /execute, parses a JSON action plan (_extract_json tolerates fences/prose), and defensively escalates on a malformed response, an empty response, or a transport error. LLM_CONNECT_URL is configurable. The guardrail pass still enforces the allowlist + no-secret invariant on whatever the model returns.
warden worker run --brain rule|llm selector (dry-run default). Tests: tests/test_worker.py (extract_json, parse, escalate-on-flag/malformed/transport, guardrail-catches-unsafe-LLM-action). Live verified against the real inbox: the LLM brain produced a sensible reply+mark_read for the secrets-engine message and correctly escalated the llm-connect secret-custody request. 236 tests, lint clean.
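One way to tolerate code fences and surrounding prose when pulling a JSON plan out of model output, and to escalate when nothing usable is found. This is a sketch under assumptions: the function names, the "kind" key, and the escalation payload are illustrative and may differ from _extract_json in the worker.

```python
import json
from typing import Optional


def extract_json(text: str) -> Optional[dict]:
    """Return the first JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it (closing fence, prose).
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None


def parse_plan(raw: Optional[str]) -> dict:
    """Parse model output into a plan, escalating on anything unusable."""
    obj = extract_json(raw or "")
    if not obj or not isinstance(obj.get("kind"), str):
        return {"kind": "escalate", "reason": "malformed or empty model output"}
    return obj
```

The parsed plan would still go through the guardrail pass before anything runs, so a well-formed but unsafe action is caught there rather than here.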

T3 — Action dispatch + guardrails (full-auto in-scope)

id: WARDEN-WP-0020-T03
status: done
priority: high
state_hub_task_id: "3a71965e-42d5-4258-9761-aced804c88e7"

HubClient gained writes (mark_read, send_reply, add_progress); execute_plan / execute_plans run the safe, allowlisted actions — route_answer (reply with the computed answer + auto mark-read), reply (with an LLM-drafted body), progress_note, mark_read. Escalated plans and non-auto-executable kinds are left for a human.
Deliberate guardrail: propose_catalog_diff (and any code/routing change) is NOT auto-executed even under full-auto — a bad catalog commit could misroute credentials, so it goes to human review (recoverability over convenience). AUTO_EXECUTABLE is the messaging/hub tier only. No secret value is ever read, sent, or logged.
warden worker run --execute runs the guarded executor (dry-run still the default); per-message audit summary. Tests in tests/test_worker.py (route_answer reply+mark, reply-with/without-body, escalated skip, catalog-diff left-for-human, progress_note, failure-without-crash). 243 pass, lint clean.
Conservative tier is now the --execute default (Bernd's Option A, 2026-06-30): run_conservative triages NEW messages into a reviewed digest (worker-digest.md) with drafted replies, posts ONE progress note, tracks seen ids (schedule-safe dedup), and sends nothing to other agents / marks nothing read. --full-auto opts into the auto-send path. Live-verified with the LLM brain: produced a high-quality draft reply to secrets-engine and flagged the llm-connect request as NEEDS YOU. 244 tests. Rationale: the guardrails prevent security harm but not LLM content errors, so replies stay drafts-for-approval until quality is proven — matches the build-stage/recoverability posture. Conservative mode is safe to schedule (T4).
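The conservative tier's "digest plus seen-id dedup" behaviour might look roughly like this. It is a sketch only: the message dict shape, the seen-ids.json file name, and the function name are assumptions; worker-digest.md and the NEEDS YOU flag come from the description above. Nothing in the sketch sends a message or marks anything read.

```python
import json
from pathlib import Path


def load_seen(state_dir: Path) -> set[str]:
    path = state_dir / "seen-ids.json"  # hypothetical file name
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def run_conservative_sketch(messages, drafts, state_dir: Path) -> int:
    """Append NEW messages to the digest; return how many were added.

    messages: list of {"id": ..., "from": ...} dicts (assumed shape)
    drafts:   message id -> drafted reply text; a missing draft is
              flagged for the human instead
    """
    state_dir.mkdir(parents=True, exist_ok=True)
    seen = load_seen(state_dir)
    new = [m for m in messages if m["id"] not in seen]
    if not new:
        return 0  # a repeated scheduled run adds nothing

    lines = []
    for m in new:
        lines.append(f"## {m['id']} from {m['from']}")
        lines.append(drafts.get(m["id"]) or "NEEDS YOU")
        lines.append("")
    with (state_dir / "worker-digest.md").open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")

    # Record ids only after the digest write succeeded.
    seen.update(m["id"] for m in new)
    (state_dir / "seen-ids.json").write_text(json.dumps(sorted(seen)), encoding="utf-8")
    return len(new)
```

Tracking seen ids locally, rather than marking messages read on the hub, is what lets the run repeat on a schedule without duplicating digest entries or changing inbox state.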

T4 — Scheduled trigger

id: WARDEN-WP-0020-T04
status: todo
priority: medium
state_hub_task_id: "7f77ea6d-c281-42c5-ad25-2a0bb9fd68de"

Wire cron or activity-core to warden worker run --once. Ships disabled; enabled only after a clean dry-run shakedown. Concurrency guard (no overlapping runs).
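For the concurrency guard, one common approach on POSIX hosts is a non-blocking advisory file lock, so that an overlapping scheduled run exits instead of queueing. The sketch below assumes that approach; the helper name and lock path are hypothetical, and activity-core may already offer its own mechanism.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_run_lock(lock_path: str):
    """Yield True if this process holds the lock, False if another run does."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # LOCK_NB: fail immediately rather than wait for the other run.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        # Closing the descriptor also drops the lock if the process dies early.
        os.close(fd)
```

A scheduled entry point could wrap the run in `with single_run_lock(path) as acquired:` and return early when `acquired` is False. The kernel releases the lock when the process exits, so a crashed run does not leave a stale lock behind.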

T5 — Docs / SCOPE / INTENT

id: WARDEN-WP-0020-T05
status: todo
priority: medium
state_hub_task_id: "6e7ae317-7f8b-468a-bb5c-b08093ed43a0"

Record the scope expansion: ops-warden gains an autonomous coordination worker. Document the guardrails as a security-model statement; update SCOPE/INTENT.

Acceptance

warden worker run --dry-run reads the real inbox and prints a guardrailed plan.
Full-auto execution runs only in-scope, allowlisted actions; secrets/prod/out-of-scope escalate; every action is audited. No secret value ever enters a prompt, log, or commit.
Scheduled mode is enabled only after a dry-run shakedown.