From 9d1a35e61b0a4042a0efc9aa26ef05b73994be1c Mon Sep 17 00:00:00 2001 From: tegwick Date: Tue, 2 Jun 2026 02:16:50 +0200 Subject: [PATCH] Note 2026-06-01 cutover-runbook session on CUST-WP-0045 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit T06 remains in_progress — no canary was rerun. Capture the runbook deliverable (workplans/CUST-WP-0045-cutover-runbook.md @ 8ef5399), the still-unchanged upstream fixes that should let the patched canary succeed, and the two operational gotchas the runbook now documents (host-mode env overrides vs. Docker-network .env; Claude CLI quota collision when triggering from inside an active Claude Code session). Bump updated: to 2026-06-01. Co-Authored-By: Claude Opus 4.7 --- ...-0045-activity-core-daily-triage-runner.md | 50 ++++++++++++++++++- 1 file changed, 49 insertions(+), 1 deletion(-) diff --git a/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md b/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md index b933a2e..ca0db9a 100644 --- a/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md +++ b/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md @@ -10,7 +10,7 @@ topic_slug: custodian planning_priority: high planning_order: 45 created: "2026-05-19" -updated: "2026-05-23" +updated: "2026-06-01" state_hub_workstream_id: "d9d9a3ec-f736-4041-beac-bb92c7ad314e" --- @@ -508,6 +508,54 @@ The State Hub decision recorded under `CUST-WP-0046` is to keep the Codex automation active until this workplan completes its own daily WSJF canary and explicit pause/delete cutover step. +## Implementation Notes - 2026-06-01 + +T06 remains `in_progress`. No canary was rerun in this session. The blocker +is the same as on 2026-05-21: the patched real-LLM probe was never executed +after the Claude CLI session limit cleared. Eleven days passed without a +retry; the activity-core dev stack is currently down (only `infra-postgres-1` +from another project is running on the workstation). + +The fixes that should make the canary succeed are merged and unchanged: + +- activity-core `cf92f0d` — forward instruction `output_schema` to llm-connect + as `model_params.json_schema` +- activity-core `5c4f96e` — pass `temperature`, `max_tokens`, `max_depth`, + `model_params` through to llm-connect `RunConfig` +- llm-connect `b12d1af` — Claude Code adapter maps `json_schema` to native + `--json-schema` CLI option +- llm-connect `82e3c07` — server mode preserves the full `RunConfig` + +Session deliverable: a separate cutover runbook capturing the exact host-mode +command sequence to bring the dev stack up, sync the ActivityDefinition from +`ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian`, run the smoke probe, +trigger the canary, verify all three evidence surfaces, and only then flip +`enabled: true` and sync schedules: + +- file: `workplans/CUST-WP-0045-cutover-runbook.md` +- commit: `8ef5399 Add CUST-WP-0045 T06 cutover runbook` + +The runbook also documents two operational gotchas the 2026-05-21 attempts hit: + +- The repo `.env` uses Docker network hostnames (`temporal:7233`, + `app-db:5432`, `nats:4222`); host-mode worker/API processes must override + these on the command line because `make` auto-loads the file. +- Triggering the canary from inside an active Claude Code session shares the + CLI quota with llm-connect's Claude Code adapter and can reproduce the + `Execution error` / HTTP 500 failure. Run from a fresh terminal account. + +The current `daily_triage_digest` (built with the live State Hub and the +ActivityDefinition's params) was inspected: 10,175 bytes, totals across 13 +active topics / 13 active workstreams / 149 todo tasks, 12 open workstreams +in the digest with `hf-wp-0001`, `cust-wp-0044`, `cust-wp-0045`, `cust-wp-0046` +at the priority head. This is substantive context — a working real-LLM canary +should produce non-trivial recommendations, not the `"summary":"ok"` stub from +the 2026-05-19 mocked run. + +T07 and T08 remain `todo`. The cutover runbook overlaps with T07's evidence +queries but does not cover T07's steady-state "did it run today?" framing or +the missed-run policy documentation, so T07 is not satisfied by this session. + ## Acceptance Criteria - The daily State Hub WSJF triage runs from activity-core, not Codex app cron.