ops-warden

Author	SHA1	Message	Date
tegwick	d0261ebb52	feat(WARDEN-WP-0020): conservative triage tier as the --execute default (Option A) Per Bernd's call: the guardrails prevent security harm but not LLM content errors, so the worker should triage + draft, not auto-send, until reply quality is proven (matches the build-stage/recoverability posture). run_conservative triages NEW messages into a reviewed digest (state_dir/worker-digest.md) with drafted replies, posts ONE progress note, tracks seen message ids (schedule-safe dedup), and sends NOTHING to other agents / marks nothing read. `warden worker run --execute` now runs this conservative tier; `--full-auto` opts into the auto-send path. Live-verified with the LLM brain on the real inbox: produced a high-quality draft reply to a secrets-engine coordination message and correctly flagged the llm-connect custody request as NEEDS YOU. Conservative mode is safe to schedule (T4). 244 tests, lint clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 00:38:36 +02:00
tegwick	f8ac55367c	feat(WARDEN-WP-0020): T3 — guarded executor (worker now acts, not just plans) HubClient gains writes (mark_read, send_reply, add_progress). execute_plan/execute_plans run the safe, allowlisted actions autonomously: route_answer (reply with the computed answer + auto mark-read), reply (LLM-drafted body), progress_note, mark_read. Escalated plans and non-auto-executable kinds are left for a human; every action is metadata-only (no secret value read/sent/logged). Deliberate guardrail: propose_catalog_diff and any code/routing change is NOT auto-executed even under full-auto — a bad catalog commit could misroute credentials, so it goes to human review (recoverability over convenience). AUTO_EXECUTABLE is the messaging/hub tier only. `warden worker run --execute` runs the executor (dry-run still default). 7 executor tests (reply+mark, with/without body, escalated skip, catalog-diff-left-for-human, progress, failure-without-crash); 243 pass, lint clean. First live --execute shakedown is the operator's (staged rollout); T4 schedules it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 23:19:13 +02:00
tegwick	859beed07f	feat(WARDEN-WP-0020): T2 — llm-connect brain (autonomous worker now thinks) llm-connect is operational (operator set OPENROUTER_API_KEY). Contract discovered from the running service: POST /execute {"prompt":...} -> {"content":...}. LlmConnectBrain embeds the fixed charter + the inbox message as untrusted data, calls /execute, and parses a JSON action plan (_extract_json tolerates fences/prose), escalating defensively on malformed/empty/transport errors. The build_plans guardrail still enforces the allowlist + no-secret invariant on whatever the model returns — the LLM cannot widen ops-warden's authority. `warden worker run --brain rule|llm` selects the planner. Live-verified on the real inbox: the LLM brain planned a sensible reply+mark_read for a secrets-engine coordination message and correctly escalated a secret-custody request as out-of-lane — better classification than the deterministic RuleBrain. 6 new tests, 236 pass, lint clean. T3 (guarded executor) and T4 (scheduling) remain. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 23:10:28 +02:00
tegwick	4287eccc80	feat(WARDEN-WP-0020): worker drafts real route answers in dry-run (T3 groundwork) build_plans now computes the concrete routing answer for each route_answer action in-process (reuses the catalog; read-only, no subprocess/network) and render_plans shows it as a `draft:` line. The dry-run demonstrates the actual answer the executor (T3) will send, not just an intent. RuleBrain stays the default; the llm-connect brain (T2) is gated on llm-connect being operational + its /execute contract. 230 tests, lint clean. Live dry-run verified against the real inbox. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 22:42:54 +02:00
tegwick	211994ddbb	feat(WARDEN-WP-0020): ops-warden coordination worker — T1 dry-run scaffold Foundation for an autonomous worker that handles ops-warden's State Hub coordination lane via llm-connect (Bernd's call: full-auto in-scope + scheduled, staged dry-run -> manual -> scheduled). T1 is the llm-connect-independent, safe slice: src/warden/worker.py — HubClient (read unread to_agent=ops-warden), Brain protocol, deterministic RuleBrain (answers clear routing questions, escalates the rest), PlannedAction/WorkerPlan model, guardrail allowlist + validate_action enforced brain-agnostically (no-secret invariant + prod-config + off-allowlist all escalate), render_plans dry-run output. `warden worker run --dry-run` (default); --execute refused (exit 2) until the guarded executor (T3) lands. Guardrails are load-bearing because full-auto has no human in the loop: message content is untrusted data, the allowlist is enforced regardless of what the brain proposes. Hard dependency flagged in the workplan: the brain is llm-connect, which needs its provider key (OPENROUTER_API_KEY, deferred CCR-2026-0003) before it can run. 18 worker tests; 229 pass, lint clean. Live dry-run against the real hub verified. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 19:07:06 +02:00

5 Commits
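
The T1 to T3 messages above describe a recurring pattern: model output is treated as untrusted data, a JSON plan is pulled out of it even when wrapped in fences or prose, and anything malformed or off the allowlist is escalated to a human. The following is a minimal sketch of that pattern, not the code in `src/warden/worker.py`. The names `extract_json_sketch`, `plan_from_model_output`, and `AUTO_EXECUTABLE_SKETCH`, and the shape of the plan dict, are assumptions made for illustration. Only the action kinds are taken from the commit messages. The no-secret invariant mentioned in the log is not modelled here.

```python
import json
import re

# Hypothetical allowlist, modelled on the action kinds named in the T3 message.
AUTO_EXECUTABLE_SKETCH = frozenset(
    {"route_answer", "reply", "progress_note", "mark_read"}
)

# Matches a fenced block, with or without a "json" language tag.
_FENCE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def extract_json_sketch(text):
    """Return the first JSON object found in model output, or None."""
    fenced = _FENCE.search(text)
    candidate = fenced.group(1) if fenced else text
    decoder = json.JSONDecoder()
    # Try every "{" as a possible start so surrounding prose is tolerated.
    for match in re.finditer(r"\{", candidate):
        try:
            value, _end = decoder.raw_decode(candidate, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(value, dict):
            return value
    return None


def plan_from_model_output(text):
    """Parse a plan and escalate unless its kind is on the allowlist."""
    plan = extract_json_sketch(text)
    if plan is None:
        return {"kind": "escalate", "reason": "malformed or empty model output"}
    kind = plan.get("kind")
    if not isinstance(kind, str) or kind not in AUTO_EXECUTABLE_SKETCH:
        return {"kind": "escalate", "reason": f"kind {kind!r} is not allowlisted"}
    return plan
```

In this sketch the allowlist check runs after parsing and does not depend on which planner produced the text, which mirrors the "enforced brain-agnostically" wording in the T1 message. A plan of kind `propose_catalog_diff` would be escalated here, consistent with the T3 note that catalog changes go to human review.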