feat(WARDEN-WP-0020): T3 — guarded executor (worker now acts, not just plans)

HubClient gains writes (mark_read, send_reply, add_progress). execute_plan/execute_plans run the safe, allowlisted actions autonomously: route_answer (reply with the computed answer + auto mark-read), reply (LLM-drafted body), progress_note, mark_read. Escalated plans and non-auto-executable kinds are left for a human; every action is metadata-only (no secret value read/sent/logged). Deliberate guardrail: propose_catalog_diff and any code/routing change is NOT auto-executed even under full-auto — a bad catalog commit could misroute credentials, so it goes to human review (recoverability over convenience). AUTO_EXECUTABLE is the messaging/hub tier only. `warden worker run --execute` runs the executor (dry-run still default). 7 executor tests (reply+mark, with/without body, escalated skip, catalog-diff-left-for-human, progress, failure-without-crash); 243 pass, lint clean. First live --execute shakedown is the operator's (staged rollout); T4 schedules it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 23:19:13 +02:00
parent d36867f381
commit f8ac55367c
4 changed files with 226 additions and 25 deletions
--- a/workplans/WARDEN-WP-0020-ops-warden-worker.md
+++ b/workplans/WARDEN-WP-0020-ops-warden-worker.md
@@ -103,14 +103,25 @@ state_hub_task_id: "52d281b2-7d48-44f5-b77e-80e3ed500b5f"

 ```task
 id: WARDEN-WP-0020-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "3a71965e-42d5-4258-9761-aced804c88e7"
 ```

- [ ] Execute in-scope actions: `warden route/access` answers, drafted replies, mark-read,
-      catalog/playbook diffs (commit + sync). Enforce the allowlist + no-secret invariant in
-      code; per-action progress-event audit; escalation path to a human queue.
+- [x] `HubClient` gained writes (`mark_read`, `send_reply`, `add_progress`); `execute_plan`
+      / `execute_plans` run the **safe, allowlisted** actions — route_answer (reply with the
+      computed answer + auto mark-read), reply (with an LLM-drafted body), progress_note,
+      mark_read. Escalated plans and non-auto-executable kinds are left for a human.
+- [x] **Deliberate guardrail:** `propose_catalog_diff` (and any code/routing change) is NOT
+      auto-executed even under full-auto — a bad catalog commit could misroute credentials,
+      so it goes to human review (recoverability over convenience). AUTO_EXECUTABLE is the
+      messaging/hub tier only. No secret value is ever read, sent, or logged.
+- [x] `warden worker run --execute` runs the guarded executor (dry-run still the default);
+      per-message audit summary. Tests in `tests/test_worker.py` (route_answer reply+mark,
+      reply-with/without-body, escalated skip, catalog-diff left-for-human, progress_note,
+      failure-without-crash). 243 pass, lint clean.
+- Note: first **live** `--execute` shakedown is the operator's (staged rollout: dry-run →
+  manual → scheduled); T4 wraps it on a schedule.

 ### T4 — Scheduled trigger