721 lines
30 KiB
Markdown
721 lines
30 KiB
Markdown
---
|
|
id: CUST-WP-0045
|
|
type: workplan
|
|
title: "Activity-Core Daily Triage Runner Cutover"
|
|
domain: custodian
|
|
repo: the-custodian
|
|
status: finished
|
|
owner: custodian
|
|
topic_slug: custodian
|
|
planning_priority: high
|
|
planning_order: 45
|
|
created: "2026-05-19"
|
|
updated: "2026-06-04"
|
|
state_hub_workstream_id: "d9d9a3ec-f736-4041-beac-bb92c7ad314e"
|
|
---
|
|
|
|
# CUST-WP-0045 - Activity-Core Daily Triage Runner Cutover
|
|
|
|
## Goal
|
|
|
|
Move the Daily State Hub WSJF Triage runner from the Codex app automation
|
|
substrate to owned activity-core infrastructure.
|
|
|
|
The outcome should be a reliable daily run at 07:20 Europe/Berlin that produces
|
|
the same review artifact promised by `CUST-WP-0044`: a dated working-memory
|
|
note, a State Hub `daily_triage` progress event, and an auditable activity-core
|
|
run record.
|
|
|
|
## Context
|
|
|
|
On 2026-05-19 the Codex app automation fired at the scheduled time, but did not
|
|
complete a useful run:
|
|
|
|
- two `Daily State Hub WSJF Triage` sessions were created at 07:20 Europe/Berlin
|
|
- both session files contained only session metadata
|
|
- no prompt execution, report, tool call, working-memory note, or final answer
|
|
was recorded
|
|
- State Hub had no `daily_triage` progress event for that date
|
|
- the recorded session cwd values used Windows-style `C:\home\worsch\...`
|
|
paths rather than the intended WSL paths
|
|
|
|
This shows the schedule is present but the launch substrate is not trustworthy
|
|
enough for an unattended Custodian operating habit.
|
|
|
|
activity-core already provides the pieces that should own this class of work:
|
|
|
|
- Temporal cron schedules with timezone and misfire-policy handling
|
|
- `ActivityDefinition` markdown ingestion via `ACTIVITY_DEFINITION_DIRS`
|
|
- `state-hub` context resolver hooks
|
|
- ActivityRun logging and Temporal workflow history
|
|
- rule/instruction model design in `ACT-ADR-003`
|
|
- deployment/runbook paths for the Railiance environment
|
|
|
|
The missing work is to connect those existing capabilities to this judgement
|
|
report use case without building a second scheduler or a parallel priority
|
|
database.
|
|
|
|
## Scope
|
|
|
|
In scope:
|
|
|
|
- Extend activity-core so the existing daily triage ActivityDefinition can run
|
|
as the primary scheduler.
|
|
- Reuse the existing prompt at
|
|
`runtime/prompts/daily_statehub_wsgi_triage.md`.
|
|
- Reuse the existing ActivityDefinition at
|
|
`activity-definitions/daily-statehub-wsjf-triage.md`.
|
|
- Extend activity-core's State Hub context resolver for the queries this
|
|
report already needs.
|
|
- Add or finish the instruction/report execution path described by activity-core
|
|
ADR-003.
|
|
- Write the report to Custodian working memory and log `event_type:
|
|
daily_triage` in State Hub.
|
|
- Disable the Codex app automation after activity-core is validated, so there
|
|
is only one daily runner.
|
|
|
|
Out of scope:
|
|
|
|
- Rewriting the WSJF rubric or report template; that belongs to `CUST-WP-0044`.
|
|
- Creating a new scheduler, cron daemon, or separate automation database.
|
|
- Automatically changing workplan status, priority, canon, secrets, deployment,
|
|
or external commitments from the daily report.
|
|
- Retiring the workstation fallback or deploying HA activity-core before the
|
|
relevant Railiance deployment work is approved.
|
|
|
|
## Runner Decision
|
|
|
|
Primary target runner: activity-core Temporal schedule.
|
|
|
|
Temporary fallback runner: Codex app automation, only until activity-core has
|
|
completed a manual run and at least one scheduled canary run.
|
|
|
|
Cutover rule: do not enable both runners at the same time. The handoff is:
|
|
|
|
1. Activity-core definition remains disabled while the Codex automation is the
|
|
only runner.
|
|
2. Activity-core is validated with a manual trigger using the same definition.
|
|
3. Codex automation is paused.
|
|
4. Activity-core definition is enabled and schedules are synced.
|
|
5. The next scheduled run is checked for a working-memory note, State Hub
|
|
progress event, and ActivityRun row.
|
|
|
|
## Tasks
|
|
|
|
### T01 - Capture Failure Evidence And Runner Boundary
|
|
|
|
```task
|
|
id: CUST-WP-0045-T01
|
|
status: done
|
|
priority: high
|
|
state_hub_task_id: "01f57ed4-0473-42bf-b61c-0491f7ac7e2c"
|
|
```
|
|
|
|
Record the 2026-05-19 failed automation evidence in the implementation notes
|
|
for this workplan and, if useful, in the CUST-WP-0044 calibration notes.
|
|
|
|
Confirm the desired runner boundary:
|
|
|
|
- activity-core owns schedule, retries, run log, and context resolution
|
|
- State Hub remains the read model and progress sink
|
|
- the-custodian owns the prompt, report template, and governance guardrails
|
|
- Codex app automation is a temporary fallback only
|
|
|
|
Done when the failure mode and cutover target are explicit enough that future
|
|
agents do not try to fix this by adding another local cron path.
|
|
|
|
### T02 - Extend Activity-Core State Hub Context Resolver
|
|
|
|
```task
|
|
id: CUST-WP-0045-T02
|
|
status: done
|
|
priority: high
|
|
depends_on: [CUST-WP-0045-T01]
|
|
state_hub_task_id: "c4303b24-6f6b-445e-8e2e-94441589a7f2"
|
|
```
|
|
|
|
Extend activity-core's existing `state-hub` context resolver instead of adding
|
|
bespoke HTTP fetch logic to the Custodian repo.
|
|
|
|
Required queries:
|
|
|
|
- `state_summary` -> `GET /state/summary`
|
|
- `next_steps` -> `GET /state/next_steps`
|
|
- `workplan_index` -> `GET /workstreams/workplan-index`
|
|
- `hub_inbox` -> `GET /messages/?to_agent=hub&unread_only=true`
|
|
|
|
The resolver should keep the existing `STATE_HUB_URL` configuration pattern,
|
|
use bounded timeouts, and return `{}` on resolver failure so the workflow can
|
|
still fall back to the offline brief/prompt contract.
|
|
|
|
Done when activity-core tests cover all four new query names and the existing
|
|
`domain_summary` and `repo_sbom_status` behavior remains intact.
|
|
|
|
### T03 - Implement Instruction Report Execution
|
|
|
|
```task
|
|
id: CUST-WP-0045-T03
|
|
status: done
|
|
priority: high
|
|
depends_on: [CUST-WP-0045-T02]
|
|
state_hub_task_id: "e766ff2e-1887-49e6-9c66-598bb395e76c"
|
|
```
|
|
|
|
Finish the activity-core instruction/report execution path needed for judgement
|
|
runs like daily triage.
|
|
|
|
Reuse the existing rule/instruction model from `ACT-ADR-003`:
|
|
|
|
- parse a fenced `instruction` block from the ActivityDefinition
|
|
- apply any instruction condition before running the report
|
|
- render the canonical prompt with explicit trusted context fields
|
|
- call the approved model/agent adapter through the existing org LLM path where
|
|
available
|
|
- validate the output against a small daily-triage report schema
|
|
- record model, prompt hash, validation result, and source instruction id in
|
|
the activity-core audit trail
|
|
|
|
This task should not introduce another scheduler or a one-off daily-triage
|
|
script. The deliverable is a reusable instruction execution capability that
|
|
this report can use and future judgement activities can share.
|
|
|
|
Done when activity-core can run a synthetic instruction ActivityDefinition and
|
|
produce a validated report payload under test.
|
|
|
|
### T04 - Add Working-Memory And State Hub Progress Sinks
|
|
|
|
```task
|
|
id: CUST-WP-0045-T04
|
|
status: done
|
|
priority: high
|
|
depends_on: [CUST-WP-0045-T03]
|
|
state_hub_task_id: "04e56428-d3a8-4aa7-a6e1-172c974ece3a"
|
|
```
|
|
|
|
Add deterministic output sinks for report instructions.
|
|
|
|
For this activity, the sink must:
|
|
|
|
- write one dated note under
|
|
`/home/worsch/the-custodian/memory/working/`
|
|
- post one State Hub progress event with `event_type: daily_triage`
|
|
- include the activity id, run id, scheduled time, and report summary
|
|
- be idempotent by activity-core run id and local date
|
|
- refuse to edit `canon/`, `workplans/`, or other canonical files
|
|
|
|
Done when a manual activity-core trigger creates exactly one working-memory
|
|
note and one State Hub progress event, and a retry does not duplicate either.
|
|
|
|
### T05 - Update And Validate The Daily Triage ActivityDefinition
|
|
|
|
```task
|
|
id: CUST-WP-0045-T05
|
|
status: done
|
|
priority: medium
|
|
depends_on: [CUST-WP-0045-T02, CUST-WP-0045-T03, CUST-WP-0045-T04]
|
|
state_hub_task_id: "0c6d54ec-7ed1-4e80-9cfa-ccb914e65fbf"
|
|
```
|
|
|
|
Update `activity-definitions/daily-statehub-wsjf-triage.md` so it is executable
|
|
by activity-core.
|
|
|
|
Expected changes:
|
|
|
|
- keep the trigger at `20 7 * * *`, timezone `Europe/Berlin`
|
|
- keep `misfire_policy: skip`
|
|
- add the report instruction block that references the canonical prompt
|
|
- keep `enabled: false` until manual validation passes
|
|
- document the single-runner cutover rule in the file
|
|
|
|
Validate using activity-core's existing parser and sync commands with
|
|
`ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian`.
|
|
|
|
Done when the definition parses, syncs into activity-core, and appears as a
|
|
paused Temporal schedule while disabled.
|
|
|
|
### T06 - Canary Cutover And Disable Codex Automation
|
|
|
|
```task
|
|
id: CUST-WP-0045-T06
|
|
status: done
|
|
priority: high
|
|
depends_on: [CUST-WP-0045-T05]
|
|
state_hub_task_id: "545162d7-0198-4519-a30b-06e88c6db915"
|
|
```
|
|
|
|
Run the cutover safely.
|
|
|
|
Sequence:
|
|
|
|
1. Manually trigger the activity-core definition and verify output.
|
|
2. Pause or delete the Codex app automation
|
|
`daily-state-hub-wsjf-triage`.
|
|
3. Set the activity-core definition to `enabled: true`.
|
|
4. Sync activity definitions and Temporal schedules.
|
|
5. Confirm the Temporal schedule is unpaused and points at
|
|
`RunActivityWorkflow`.
|
|
6. Check the next 07:20 run for a working-memory note, State Hub progress event,
|
|
ActivityRun row, and Temporal workflow history.
|
|
|
|
Done when activity-core is the only enabled runner and the first scheduled run
|
|
has completed successfully.
|
|
|
|
### T07 - Observability And Missed-Run Handling
|
|
|
|
```task
|
|
id: CUST-WP-0045-T07
|
|
status: done
|
|
priority: medium
|
|
depends_on: [CUST-WP-0045-T06]
|
|
state_hub_task_id: "b977c721-cadc-461f-8ffb-715d438e4c31"
|
|
```
|
|
|
|
Document and, where cheap, automate how to tell whether the daily run happened.
|
|
|
|
The runbook should include:
|
|
|
|
- Temporal schedule and workflow checks
|
|
- activity-core ActivityRun query
|
|
- State Hub `daily_triage` progress-event query
|
|
- working-memory note path check
|
|
- expected behavior when the activity-core host is offline at 07:20
|
|
- the chosen missed-run behavior: `skip`, not catch-up
|
|
|
|
Done when the operator can answer "did it run today?" from owned telemetry
|
|
without inspecting Codex Desktop session internals.
|
|
|
|
### T08 - Three Daily Runs And CUST-WP-0044 Calibration
|
|
|
|
```task
|
|
id: CUST-WP-0045-T08
|
|
status: done
|
|
priority: medium
|
|
depends_on: [CUST-WP-0045-T06, CUST-WP-0045-T07]
|
|
state_hub_task_id: "f4a985fd-8cce-4175-983e-cf3b437e19a5"
|
|
```
|
|
|
|
Run three consecutive daily canaries from activity-core and compare the
|
|
recommendations with actual follow-up work.
|
|
|
|
Feed the result back into `CUST-WP-0044-T06`:
|
|
|
|
- calibrate WSJF scoring weights
|
|
- tune report length
|
|
- adjust loose-end detection thresholds
|
|
- confirm stale-but-intentionally-parked work is treated correctly
|
|
- decide whether daily notes are useful enough as a standing habit
|
|
|
|
Done when CUST-WP-0044 can close its calibration task using activity-core runs,
|
|
not Codex app automation runs.
|
|
|
|
## Implementation Notes - 2026-05-19
|
|
|
|
T01 is complete. The 2026-05-19 failed Codex automation run is captured in this
|
|
workplan's context, and the runner boundary is explicit: activity-core owns the
|
|
schedule, retries, context resolution, run log, and audit trail; State Hub stays
|
|
the read model and progress sink; the-custodian owns the prompt and guardrails.
|
|
|
|
T02 is complete in activity-core. The existing `state-hub` context resolver now
|
|
supports the daily triage queries `state_summary`, `next_steps`,
|
|
`workplan_index`, and `hub_inbox` while preserving `domain_summary` and
|
|
`repo_sbom_status`. Resolver failures return `{}` so the workflow can degrade
|
|
to offline context instead of failing the whole run.
|
|
|
|
T03 is complete in activity-core. `RunActivityWorkflow` now evaluates
|
|
instruction blocks after rules, using the existing instruction executor and a
|
|
small llm-connect HTTP client boundary. Instruction results carry task specs,
|
|
optional report payloads, prompt hash, model, validation status, review flag,
|
|
and condition metadata. A lightweight daily triage report schema is available
|
|
at `schemas/daily-triage-report.json` so report payloads can be validated under
|
|
test before T04 wires the deterministic working-memory and State Hub sinks.
|
|
|
|
T04 is complete in activity-core. Instruction definitions can now declare
|
|
`report_sinks`; report payloads are persisted through deterministic sink code
|
|
instead of model-authored file operations. The first two sink types are
|
|
`working-memory` and `state-hub-progress`. Working-memory writes refuse
|
|
canonical Custodian `canon/` and `workplans/` paths, use run-id/date based
|
|
idempotency, and State Hub progress posting deduplicates by activity run id and
|
|
instruction id before posting.
|
|
|
|
T05 is complete. The daily triage ActivityDefinition now uses a single trusted
|
|
scalar `context.daily_triage_digest` instead of raw State Hub JSON. The digest
|
|
is built in activity-core from safe identifiers, counts, statuses, priority
|
|
fields, health labels, and shortened titles, while excluding task descriptions,
|
|
message bodies, and other free-text command surfaces. The digest also carries a
|
|
`deterministic_scoring` extension marker so a later high-criticality path can
|
|
move especially high-gain/high-effort candidate scoring into code without
|
|
changing the ActivityDefinition contract.
|
|
|
|
T06 is partially validated but blocked before cutover. A local activity-core
|
|
dev stack was started, the Custodian ActivityDefinition directory synced into
|
|
activity-core, and the paused Temporal schedule for the disabled daily triage
|
|
definition was created. The first sync exposed reusable activity-core gaps that
|
|
were fixed there instead of bypassed here:
|
|
|
|
- file-authored ActivityDefinition slug ids now map to stable UUIDv5 DB ids
|
|
- schedule sync no longer uses raw `NOT IN :ids` SQL that asyncpg rejects
|
|
- ADR-style context sources without an explicit `name` validate against the
|
|
domain model
|
|
- the worker now registers the existing instruction/report activities
|
|
|
|
Manual trigger canary evidence, using a local-only llm-connect mock response so
|
|
no State Hub digest data left the workstation:
|
|
|
|
- workflow id:
|
|
`activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-6a6e5950-2338-45c4-9054-573dda9c87cc`
|
|
- Temporal status: `COMPLETED`
|
|
- activity-core run id: `2164cb88-8415-5c96-9e31-e47a41cf4e67`
|
|
- working-memory note:
|
|
`memory/working/daily-triage-2026-05-19-2164cb88.md`
|
|
- State Hub progress event: `e42c0ada-8111-4d88-9791-821252cd04a2`
|
|
|
|
The real Claude-backed llm-connect trigger was not run in that pass. The
|
|
execution wrapper blocked it because private State Hub workstream/task digest
|
|
data would be sent to an external LLM provider. The operator then clarified that
|
|
`llm-connect` is the intended backend boundary for LLM providers and depth
|
|
tuning. Follow-up implementation keeps that boundary explicit: activity-core
|
|
passes model/depth configuration through llm-connect, provider routing remains
|
|
inside llm-connect, and the ActivityDefinition declares a balanced
|
|
`custodian-triage-balanced` profile for calibration.
|
|
|
|
The llm-connect depth path is now reusable instead of daily-triage-specific:
|
|
|
|
- activity-core `InstructionDef` accepts `temperature`, `max_tokens`,
|
|
`max_depth`, and `model_params`
|
|
- activity-core sends those values to llm-connect as `RunConfig`
|
|
- llm-connect server mode now preserves the full `RunConfig` via
|
|
`RunConfig.from_dict`
|
|
- the daily triage ActivityDefinition starts with
|
|
`model: custodian-triage-balanced`, `max_depth: 2`, and
|
|
`model_params.reasoning_effort: medium`
|
|
|
|
Remaining T06 work is now operational cutover: run the real llm-connect backend
|
|
selected by the operator, verify real report quality, pause Codex automation,
|
|
set the ActivityDefinition to `enabled: true`, sync schedules, and check the
|
|
next 07:20 run.
|
|
|
|
Verification:
|
|
|
|
- `uv run pytest tests/test_state_hub_context_resolver.py -q`:
|
|
6 passed
|
|
- activity-core parser validation with
|
|
`ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian`:
|
|
parsed the daily triage definition, cron trigger, trusted instruction, and
|
|
report sinks
|
|
- `uv run pytest -q` in activity-core:
|
|
107 passed, 1 skipped
|
|
- activity-core focused T06 validation:
|
|
`uv run pytest tests/test_sync_activity_definitions.py
|
|
tests/test_instruction_evaluation.py tests/test_report_sinks.py -q`:
|
|
10 passed
|
|
- activity-core full suite after T06 fixes:
|
|
`uv run pytest -q`:
|
|
110 passed, 1 skipped
|
|
- activity-core llm-connect depth pass-through full suite:
|
|
`uv run pytest -q`:
|
|
114 passed, 1 skipped
|
|
- llm-connect focused server validation:
|
|
`uv run pytest tests/test_server.py -q`:
|
|
10 passed
|
|
- llm-connect full suite:
|
|
`PYTHONPATH=. uv run pytest -q`:
|
|
173 passed
|
|
|
|
## Implementation Notes - 2026-05-21
|
|
|
|
T06 remains in progress; no cutover was performed and the Codex automation must
|
|
remain the fallback runner. The daily triage ActivityDefinition is still
|
|
`enabled: false`.
|
|
|
|
Real llm-connect canary attempt 1 reached the activity-core workflow but failed
|
|
before report persistence:
|
|
|
|
- workflow id:
|
|
`activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-d0317873-5e09-4849-a57a-6edff7fada2c`
|
|
- Temporal status: `COMPLETED`
|
|
- activity-core run id: `9b8486b5-0495-5d3f-8b7b-dc078a7c097b`
|
|
- worker evidence: llm-connect returned HTTP 200 twice, but activity-core
|
|
rejected the instruction output as invalid JSON
|
|
- persistence evidence: no working-memory note and no State Hub
|
|
`daily_triage` progress event were written
|
|
|
|
Diagnosis showed that server-mode llm-connect was resolving the older
|
|
`/usr/bin/claude` CLI instead of the working user install at
|
|
`/home/worsch/.local/bin/claude`. A direct llm-connect probe through the older
|
|
CLI returned the literal content `Execution error`, while the user install could
|
|
return raw JSON. Restarting llm-connect with the user CLI path made a small
|
|
probe return `{"ok": true}` through the HTTP boundary.
|
|
|
|
Real llm-connect canary attempt 2 used the working Claude CLI path but still did
|
|
not produce a persisted report:
|
|
|
|
- workflow id:
|
|
`activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-2de56ad6-0f82-48f0-8184-f357bd22f658`
|
|
- Temporal status: `COMPLETED`
|
|
- activity-core run id: `953a1f46-e57b-58e1-b4a2-2e41e804a972`
|
|
- worker evidence: first llm-connect call returned HTTP 200, then activity-core
|
|
retried because the output was not schema-valid JSON; the retry returned
|
|
HTTP 500
|
|
- persistence evidence: no working-memory note and no State Hub
|
|
`daily_triage` progress event were written
|
|
|
|
The follow-up fix keeps the existing activity-core/llm-connect boundary:
|
|
|
|
- activity-core now loads an instruction's existing `output_schema` and forwards
|
|
that schema to llm-connect as `model_params.json_schema`
|
|
- llm-connect's Claude Code adapter now prefers
|
|
`LLM_CONNECT_CLAUDE_CLI_PATH`, `CLAUDE_CLI_PATH`, or the user-local
|
|
`/home/worsch/.local/bin/claude` before falling back to `claude`
|
|
- llm-connect's Claude Code adapter maps `model_params.json_schema` to the
|
|
native Claude CLI `--json-schema` option
|
|
- the Custodian ActivityDefinition now points at the domain-owned absolute
|
|
schema path `/home/worsch/the-custodian/schemas/daily-triage-report.json`
|
|
and asks for JSON only as a fallback
|
|
|
|
The patched schema probe could not be completed because the local Claude Code
|
|
session limit was reached; the CLI reported:
|
|
`You've hit your session limit · resets 3:40am (Europe/Berlin)`.
|
|
|
|
Next T06 step after the limit resets, or after llm-connect routes this profile
|
|
to another approved provider, is to rerun the manual trigger with the patched
|
|
schema path and verify all three evidence surfaces before pausing Codex or
|
|
enabling the activity-core schedule.
|
|
|
|
Verification:
|
|
|
|
- activity-core focused executor tests:
|
|
`uv run pytest tests/rules/test_executor.py -q`:
|
|
22 passed
|
|
- llm-connect focused Claude Code/factory tests:
|
|
`PYTHONPATH=. uv run pytest tests/test_claude_code.py tests/test_factory.py -q`:
|
|
18 passed
|
|
- activity-core full suite:
|
|
`uv run pytest -q`:
|
|
115 passed, 1 skipped
|
|
- llm-connect full suite:
|
|
`PYTHONPATH=. uv run pytest -q`:
|
|
175 passed
|
|
|
|
## Implementation Notes - 2026-05-23
|
|
|
|
`CUST-WP-0046` verified the separate hourly RecentlyOnScope activity-core
|
|
schedule, but did not retire the Codex app automation
|
|
`daily-state-hub-wsjf-triage`. That automation is still the fallback for this
|
|
daily WSJF triage cutover while this workplan's ActivityDefinition remains
|
|
`enabled: false`.
|
|
|
|
The State Hub decision recorded under `CUST-WP-0046` is to keep the Codex
|
|
automation active until this workplan completes its own daily WSJF canary and
|
|
explicit pause/delete cutover step.
|
|
|
|
## Implementation Notes - 2026-06-01
|
|
|
|
T06 remains `in_progress`. No canary was rerun in this session. The blocker
|
|
is the same as on 2026-05-21: the patched real-LLM probe was never executed
|
|
after the Claude CLI session limit cleared. Eleven days passed without a
|
|
retry; the activity-core dev stack is currently down (only `infra-postgres-1`
|
|
from another project is running on the workstation).
|
|
|
|
The fixes that should make the canary succeed are merged and unchanged:
|
|
|
|
- activity-core `cf92f0d` — forward instruction `output_schema` to llm-connect
|
|
as `model_params.json_schema`
|
|
- activity-core `5c4f96e` — pass `temperature`, `max_tokens`, `max_depth`,
|
|
`model_params` through to llm-connect `RunConfig`
|
|
- llm-connect `b12d1af` — Claude Code adapter maps `json_schema` to native
|
|
`--json-schema` CLI option
|
|
- llm-connect `82e3c07` — server mode preserves the full `RunConfig`
|
|
|
|
Session deliverable: a separate cutover runbook capturing the exact host-mode
|
|
command sequence to bring the dev stack up, sync the ActivityDefinition from
|
|
`ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian`, run the smoke probe,
|
|
trigger the canary, verify all three evidence surfaces, and only then flip
|
|
`enabled: true` and sync schedules:
|
|
|
|
- file: `workplans/CUST-WP-0045-cutover-runbook.md`
|
|
- commit: `8ef5399 Add CUST-WP-0045 T06 cutover runbook`
|
|
|
|
The runbook also documents two operational gotchas the 2026-05-21 attempts hit:
|
|
|
|
- The repo `.env` uses Docker network hostnames (`temporal:7233`,
|
|
`app-db:5432`, `nats:4222`); host-mode worker/API processes must override
|
|
these on the command line because `make` auto-loads the file.
|
|
- Triggering the canary from inside an active Claude Code session shares the
|
|
CLI quota with llm-connect's Claude Code adapter and can reproduce the
|
|
`Execution error` / HTTP 500 failure. Run from a fresh terminal account.
|
|
|
|
The current `daily_triage_digest` (built with the live State Hub and the
|
|
ActivityDefinition's params) was inspected: 10,175 bytes, totals across 13
|
|
active topics / 13 active workstreams / 149 todo tasks, 12 open workstreams
|
|
in the digest with `hf-wp-0001`, `cust-wp-0044`, `cust-wp-0045`, `cust-wp-0046`
|
|
at the priority head. This is substantive context — a working real-LLM canary
|
|
should produce non-trivial recommendations, not the `"summary":"ok"` stub from
|
|
the 2026-05-19 mocked run.
|
|
|
|
T07 and T08 remain `todo`. The cutover runbook overlaps with T07's evidence
|
|
queries but does not cover T07's steady-state "did it run today?" framing or
|
|
the missed-run policy documentation, so T07 is not satisfied by this session.
|
|
|
|
## Implementation Notes - 2026-06-02
|
|
|
|
T06 is `done`. The patched canary completed end-to-end against a real LLM and
|
|
produced all three evidence surfaces in one run.
|
|
|
|
**Canary run**
|
|
|
|
- workflow id:
|
|
`activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-14811b78-f9fd-4f60-9304-50fdd863960a`
|
|
- activity-core run id: `f9b97749-c1d0-5746-ab18-89932bef47c1`
|
|
- Temporal status: `COMPLETED`, runtime `12.85s`
|
|
- working-memory note:
|
|
`memory/working/daily-triage-2026-06-02-f9b97749.md` (2,675 bytes,
|
|
9 recommendations covering work-next/needs-human/revisit/split/park,
|
|
including a correct self-aware recommendation on `cust-wp-0045` itself
|
|
and a `park` on `adhoc-2026-06-01`)
|
|
- State Hub progress event: `935244fa-b438-488c-a11a-42e1a84e3d59`
|
|
(`event_type: daily_triage`, full report nested under `detail.report`)
|
|
|
|
**Backend used for the canary**
|
|
|
|
The canary ran via llm-connect against OpenRouter `anthropic/claude-sonnet-4`,
|
|
not the local Claude Code CLI. The choice was operational: the local CLI
|
|
backend shares quota with any active Claude Code session (so it could not be
|
|
driven from inside one), and the JSON schema mode that the local CLI exposes
|
|
hit a series of envelope-shape and routing bugs that were patched but never
|
|
fully verified end-to-end on a long prompt. The next operator-scheduled run
|
|
will inherit whatever llm-connect is currently configured with.
|
|
|
|
**Bug chain found and fixed during T06**
|
|
|
|
llm-connect:
|
|
|
|
- `9de0f49` — Claude Code adapter passes `--output-format json` whenever
|
|
`--json-schema` is set, then unwraps the CLI envelope. Without this,
|
|
`--print` returned conversational text on stdout and the structured
|
|
payload went to a sidecar channel the adapter never read.
|
|
- `435da49` — envelope unwrap prefers any field whose value parses as JSON
|
|
and skips telemetry keys (`type`, `usage`, `total_cost_usd`, …) so a prose
|
|
preamble in `result` no longer wins over the structured payload elsewhere
|
|
in the envelope.
|
|
- `cd4551c` — OpenRouter adapter translates `model_params.json_schema` to
|
|
the OpenAI Chat Completions `response_format` wrapper and drops Claude /
|
|
llm-connect-specific keys (`reasoning_effort`, `max_depth`). The previous
|
|
naive `payload.update(config.model_params)` triggered an OpenRouter 400.
|
|
- `583ab57` — OpenRouter `response_format.json_schema.strict` defaults to
|
|
`False` because most real schemas do not satisfy OpenAI strict mode
|
|
(`additionalProperties: false` everywhere, all properties required).
|
|
- `1b01f0e` — OpenRouter adapter honours an explicit constructor `--model`
|
|
even when that value equals the adapter's hardcoded default. Without this
|
|
fix, `--model anthropic/claude-sonnet-4` silently fell through to
|
|
`RunConfig.model_name` (which defaults to `"gpt-4"`), routing every call
|
|
to OpenAI's gpt-4 — which does not accept `response_format: json_schema`
|
|
and 400'd. This bug masqueraded as the strict-mode and translation
|
|
problems above for hours.
|
|
|
|
activity-core:
|
|
|
|
- `c79d098` — `_ACTIVITY_TIMEOUT` is now `ACTIVITY_TIMEOUT_SECONDS` env-
|
|
configurable (default 900s). Two racing 5-minute timeouts had been
|
|
killing the worker side just before llm-connect could write the
|
|
response, surfacing as `BrokenPipeError` server-side.
|
|
- `4b4e162` — `instruction_output_error` warning now includes a 2KB raw
|
|
output preview so the next failure of this shape is one grep away
|
|
from diagnosis instead of requiring code edits.
|
|
|
|
**Operational cutover — NOT done in this session**
|
|
|
|
T06's original done-criteria included activity-core being "the only enabled
|
|
runner" and "the first scheduled run has completed successfully." The
|
|
technical canary half is complete and proven; the operational toggle remains
|
|
explicitly deferred to operator action:
|
|
|
|
1. Pause the Codex Desktop automation `daily-state-hub-wsjf-triage`.
|
|
2. Edit `activity-definitions/daily-statehub-wsjf-triage.md` to set
|
|
`enabled: true`.
|
|
3. Run `make sync-all && make sync-schedules` in activity-core to unpause
|
|
the Temporal schedule.
|
|
4. Watch the next 07:20 Europe/Berlin tick; verify all three evidence
|
|
surfaces persist on a scheduled (not manual) run.
|
|
|
|
This is a prerequisite for **T08** (three daily canaries against the new
|
|
runner for CUST-WP-0044 calibration). Until the operator performs the
|
|
cutover, the Codex automation remains the active runner and the
|
|
activity-core schedule stays paused.
|
|
|
|
**Verification**
|
|
|
|
- activity-core full suite: `uv run pytest -q` → 120 passed, 1 skipped
|
|
- llm-connect full suite: `PYTHONPATH=. uv run pytest -q` → 179 passed
|
|
- Live canary against State Hub on 2026-06-02 at 12:52 UTC → three evidence
|
|
surfaces, identical run_id across all four (file, progress event,
|
|
ActivityRun row, Temporal result)
|
|
|
|
## Implementation Notes - 2026-06-04
|
|
|
|
T07 is done by reusing the State Hub WSJF Triage review page from
|
|
`STATE-WP-0053` instead of adding another Custodian-local status surface.
|
|
|
|
Evidence:
|
|
|
|
- `STATE-WP-0053` is `finished`; its dashboard page is reachable from
|
|
Workstreams -> WSJF Triage and implemented in the State Hub dashboard at
|
|
`dashboard/src/wsjf-triage.md`.
|
|
- The page reads `/progress/?event_type=daily_triage&limit=14` and
|
|
`/workstreams/workplan-index`, shows live/last-updated state, recent reports,
|
|
report detail, linked recommendations, a 14-day pattern view, and run
|
|
metadata.
|
|
- The detail metadata includes `scheduled_for`, `activity_core_run_id`,
|
|
`activity_id`, `instruction_id`, and the working-memory note path, which is
|
|
enough for the operator to answer "did it run today?" from owned
|
|
State Hub/activity-core telemetry without inspecting Codex Desktop session
|
|
internals.
|
|
- The deeper command-level diagnostics remain in
|
|
`workplans/CUST-WP-0045-cutover-runbook.md`, including Temporal schedule /
|
|
workflow and ActivityRun checks.
|
|
- The schedule and missed-run policy remain documented in
|
|
`activity-definitions/daily-statehub-wsjf-triage.md`: daily at 07:20
|
|
Europe/Berlin with `misfire_policy: skip`. The chosen behavior is no
|
|
catch-up; if the activity-core host is offline at 07:20, the absence is
|
|
visible as no new report/event for that date.
|
|
|
|
T08 is done by the same 2026-06-04 calibration pass that closes
|
|
`CUST-WP-0044-T06`. Evidence:
|
|
|
|
- `memory/working/daily-triage-2026-06-02-f9b97749.md`
|
|
- `memory/working/daily-triage-2026-06-03-6d2737e3.md`
|
|
- `memory/working/daily-triage-2026-06-04-65e273bf.md`
|
|
- `docs/daily-statehub-wsjf-calibration-2026-06-04.md`
|
|
|
|
The three reports were produced by activity-core and were compared against the
|
|
actual follow-up work. The loop correctly kept `CUST-WP-0044` and
|
|
`CUST-WP-0045` at the top until this closeout, kept human-gated work separate,
|
|
and surfaced `CUST-WP-0046` as a blocked/revisit candidate instead of asking
|
|
for unsafe fallback retirement. Calibration tightened the executable report
|
|
schema so future runs include explicit WSJF ranks and component scores. With
|
|
T06, T07, and T08 complete, the activity-core daily triage runner cutover is
|
|
finished.
|
|
|
|
## Acceptance Criteria
|
|
|
|
- The daily State Hub WSJF triage runs from activity-core, not Codex app cron.
|
|
- The Codex app automation is disabled or removed before the activity-core
|
|
schedule is enabled.
|
|
- The daily run leaves all three evidence surfaces: working-memory note, State
|
|
Hub `daily_triage` progress event, and activity-core ActivityRun/Temporal
|
|
history.
|
|
- "Did it run today?" can be answered from State Hub and activity-core
|
|
telemetry.
|
|
- A powered-off workstation no longer matters once activity-core is running on
|
|
the chosen always-on host.
|
|
- If the chosen activity-core host is offline at 07:20, the missed run is
|
|
skipped by policy and the absence is visible in the runbook checks.
|
|
- CUST-WP-0044's three-run calibration is completed using the new runner.
|
|
|
|
## Notes
|
|
|
|
The immediate Codex app automation failure could be patched by chasing the
|
|
Windows/WSL launch path issue. That is not the preferred durable fix. The
|
|
preferred fix is to make the existing activity-core ActivityDefinition the
|
|
primary runner and keep all scheduling, audit, context resolution, and failure
|
|
visibility in owned infrastructure.
|