generated from coulomb/repo-seed
feat(WARDEN-WP-0020): conservative triage tier as the --execute default (Option A)
Per Bernd's call: the guardrails prevent security harm but not LLM content errors, so the worker should triage + draft, not auto-send, until reply quality is proven (matches the build-stage/recoverability posture).

run_conservative triages NEW messages into a reviewed digest (state_dir/worker-digest.md) with drafted replies, posts ONE progress note, tracks seen message ids (schedule-safe dedup), and sends NOTHING to other agents / marks nothing read. `warden worker run --execute` now runs this conservative tier; `--full-auto` opts into the auto-send path.

Live-verified with the LLM brain on the real inbox: produced a high-quality draft reply to a secrets-engine coordination message and correctly flagged the llm-connect custody request as NEEDS YOU. Conservative mode is safe to schedule (T4). 244 tests, lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
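The draft-only, dedup-on-seen-ids behaviour described in the message can be sketched roughly as follows. This is a minimal illustration, not the code from warden.worker: `RecordingHub`, `conservative_pass`, the `(message_id, draft_text)` input shape and the `seen-ids.json` file name are all hypothetical. Only the `worker-digest.md` file name, the single progress note, and the "send nothing, mark nothing read" rule come from the commit message.

```python
import json
from pathlib import Path


class RecordingHub:
    """Hypothetical stand-in for the hub client: records calls instead of sending."""

    def __init__(self):
        self.calls = []

    def progress(self, topic_id, note):
        self.calls.append(("progress", topic_id, note))


def conservative_pass(messages, hub, topic_id, state_dir):
    """Draft replies for unseen messages into a digest; send nothing, mark nothing read.

    `messages` is a list of (message_id, draft_text) pairs (an assumed shape).
    Returns the ids that were new on this pass.
    """
    state_dir = Path(state_dir)
    seen_path = state_dir / "seen-ids.json"  # assumed file name
    seen = set(json.loads(seen_path.read_text())) if seen_path.exists() else set()

    new = [(mid, draft) for mid, draft in messages if mid not in seen]
    if not new:
        return []  # nothing new: no digest rewrite, no progress note

    # Write the drafts to the digest for a human to review.
    lines = ["# Worker digest", ""]
    for mid, draft in new:
        lines += [f"## {mid}: DRAFT READY", "", draft, ""]
    (state_dir / "worker-digest.md").write_text("\n".join(lines))

    # Exactly one progress note per pass, then persist the seen ids.
    hub.progress(topic_id, f"{len(new)} new message(s) triaged; see digest")
    seen_path.write_text(json.dumps(sorted(seen | {mid for mid, _ in new})))
    return [mid for mid, _ in new]
```

Because the seen ids are persisted under the state directory, a scheduler can invoke the pass repeatedly on the same inbox and a second run with no new messages posts nothing.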
@@ -10,8 +10,10 @@ from warden.worker import (
     RuleBrain,
     WorkerPlan,
     _extract_json,
+    build_digest,
     build_plans,
     render_plans,
+    run_conservative,
     validate_action,
 )
 
@@ -171,13 +173,39 @@ def test_cli_worker_dry_run(monkeypatch):
     assert "nothing executed" in r.stdout
 
 
-def test_cli_worker_execute_runs(monkeypatch):
-    # --execute now runs the guarded executor; empty inbox → clean exit.
+def test_cli_worker_execute_runs(monkeypatch, tmp_path):
+    # --execute runs the conservative tier; empty inbox → clean exit.
+    monkeypatch.setenv("WARDEN_STATE_DIR", str(tmp_path))
     monkeypatch.setattr("warden.worker.HubClient.unread", lambda self, to_agent="ops-warden": [])
     r = runner.invoke(app, ["worker", "run", "--execute"])
     assert r.exit_code == 0
 
 
+# --- conservative tier (Option A) --------------------------------------------
+
+def test_build_digest_shows_drafts_and_escalations():
+    p1 = _plan([PlannedAction(kind="reply", summary="ack", payload={"body": "hello there"})])
+    p2 = _plan([PlannedAction(kind="reply", summary="x", risk="escalate", reason="secret")],
+               message_id="m2")
+    out = build_digest([p1, p2])
+    assert "DRAFT READY" in out and "NEEDS YOU" in out and "hello there" in out
+
+
+def test_run_conservative_drafts_no_sends_and_dedups(tmp_path):
+    hub = _FakeHub()
+    p = _plan([PlannedAction(kind="route_answer", summary="a", payload={"answer": "the answer"})])
+    run_conservative([p], hub, topic_id="t", state_dir=tmp_path)
+    # never sends to other agents or marks read — only a single progress note
+    assert not any(c[0] in ("reply", "mark_read") for c in hub.calls)
+    assert any(c[0] == "progress" for c in hub.calls)
+    digest = (tmp_path / "worker-digest.md").read_text()
+    assert "the answer" in digest
+    # second run: message already seen → no new progress note (schedule-safe dedup)
+    hub2 = _FakeHub()
+    run_conservative([p], hub2, topic_id="t", state_dir=tmp_path)
+    assert not any(c[0] == "progress" for c in hub2.calls)
+
+
 # --- executor (T3) -----------------------------------------------------------
 
 class _FakeHub: