| id | type | title | domain | repo | status | owner | topic_slug | planning_priority | planning_order | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CUST-WP-0045 | workplan | Activity-Core Daily Triage Runner Cutover | custodian | the-custodian | blocked | custodian | custodian | high | 45 | 2026-05-19 | 2026-05-19 | d9d9a3ec-f736-4041-beac-bb92c7ad314e |
CUST-WP-0045 - Activity-Core Daily Triage Runner Cutover
Goal
Move the Daily State Hub WSJF Triage runner from the Codex app automation substrate to owned activity-core infrastructure.
The outcome should be a reliable daily run at 07:20 Europe/Berlin that produces
the same review artifacts promised by CUST-WP-0044: a dated working-memory
note, a State Hub daily_triage progress event, and an auditable activity-core
run record.
Context
On 2026-05-19 the Codex app automation fired at the scheduled time, but did not complete a useful run:
- two `Daily State Hub WSJF Triage` sessions were created at 07:20 Europe/Berlin
- both session files contained only session metadata
- no prompt execution, report, tool call, working-memory note, or final answer was recorded
- State Hub had no `daily_triage` progress event for that date
- the recorded session cwd values used Windows-style `C:\home\worsch\...` paths rather than the intended WSL paths
This shows the schedule is present but the launch substrate is not trustworthy enough for an unattended Custodian operating habit.
activity-core already provides the pieces that should own this class of work:
- Temporal cron schedules with timezone and misfire-policy handling
- `ActivityDefinition` markdown ingestion via `ACTIVITY_DEFINITION_DIRS`
- `state-hub` context resolver hooks
- ActivityRun logging and Temporal workflow history
- rule/instruction model design in `ACT-ADR-003`
- deployment/runbook paths for the Railiance environment
The missing work is to connect those existing capabilities to this judgement report use case without building a second scheduler or a parallel priority database.
Scope
In scope:
- Extend activity-core so it can run the existing daily triage ActivityDefinition as the primary runner.
- Reuse the existing prompt at `runtime/prompts/daily_statehub_wsgi_triage.md`.
- Reuse the existing ActivityDefinition at `activity-definitions/daily-statehub-wsjf-triage.md`.
- Extend activity-core's State Hub context resolver for the queries this report already needs.
- Add or finish the instruction/report execution path described by activity-core ADR-003.
- Write the report to Custodian working memory and log `event_type: daily_triage` in State Hub.
- Disable the Codex app automation after activity-core is validated, so there is only one daily runner.
Out of scope:
- Rewriting the WSJF rubric or report template; that belongs to `CUST-WP-0044`.
- Creating a new scheduler, cron daemon, or separate automation database.
- Automatically changing workplan status, priority, canon, secrets, deployment, or external commitments from the daily report.
- Retiring the workstation fallback or deploying HA activity-core before the relevant Railiance deployment work is approved.
Runner Decision
Primary target runner: activity-core Temporal schedule.
Temporary fallback runner: Codex app automation, only until activity-core has completed a manual run and at least one scheduled canary run.
Cutover rule: do not enable both runners at the same time. The handoff is:
- Activity-core definition remains disabled while the Codex automation is the only runner.
- Activity-core is validated with a manual trigger using the same definition.
- Codex automation is paused.
- Activity-core definition is enabled and schedules are synced.
- The next scheduled run is checked for a working-memory note, State Hub progress event, and ActivityRun row.
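The single-runner rule can be expressed as a small guard. This is a minimal sketch, not activity-core code: `RunnerState` and `assert_single_runner` are hypothetical names, and the two booleans stand in for however the Codex automation and the ActivityDefinition actually expose their enabled state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerState:
    """Hypothetical snapshot of which daily-triage runners are enabled."""

    codex_automation_enabled: bool
    activity_core_enabled: bool


def assert_single_runner(state: RunnerState) -> str:
    """Return the active runner name, or raise if both runners are enabled."""
    if state.codex_automation_enabled and state.activity_core_enabled:
        raise RuntimeError("cutover rule violated: both runners are enabled")
    if state.activity_core_enabled:
        return "activity-core"
    if state.codex_automation_enabled:
        return "codex-app-automation"
    # Both off is a legal transient state: Codex has been paused and the
    # activity-core definition has not been enabled yet.
    return "none"
```

A check like this could run before each schedule sync so that an accidental double-enable fails loudly instead of producing two reports for the same day.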
Tasks
T01 - Capture Failure Evidence And Runner Boundary
id: CUST-WP-0045-T01
status: done
priority: high
state_hub_task_id: "01f57ed4-0473-42bf-b61c-0491f7ac7e2c"
Record the 2026-05-19 failed automation evidence in the implementation notes for this workplan and, if useful, in the CUST-WP-0044 calibration notes.
Confirm the desired runner boundary:
- activity-core owns schedule, retries, run log, and context resolution
- State Hub remains the read model and progress sink
- the-custodian owns the prompt, report template, and governance guardrails
- Codex app automation is a temporary fallback only
Done when the failure mode and cutover target are explicit enough that future agents do not try to fix this by adding another local cron path.
T02 - Extend Activity-Core State Hub Context Resolver
id: CUST-WP-0045-T02
status: done
priority: high
depends_on: [CUST-WP-0045-T01]
state_hub_task_id: "c4303b24-6f6b-445e-8e2e-94441589a7f2"
Extend activity-core's existing state-hub context resolver instead of adding
bespoke HTTP fetch logic to the Custodian repo.
Required queries:
- `state_summary` -> `GET /state/summary`
- `next_steps` -> `GET /state/next_steps`
- `workplan_index` -> `GET /workstreams/workplan-index`
- `hub_inbox` -> `GET /messages/?to_agent=hub&unread_only=true`
The resolver should keep the existing STATE_HUB_URL configuration pattern,
use bounded timeouts, and return {} on resolver failure so the workflow can
still fall back to the offline brief/prompt contract.
Done when activity-core tests cover all four new query names and the existing
domain_summary and repo_sbom_status behavior remains intact.
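The resolver contract above (named query, bounded timeout, `{}` on failure) could look roughly like the sketch below. It assumes only the four query paths listed in this task and the `STATE_HUB_URL` variable; the function names, the default URL, the 5-second timeout, and the `items` wrapper for list responses are illustrative assumptions, not the activity-core implementation.

```python
import json
import os
import urllib.request

# Query name -> State Hub path, as listed in this task.
QUERY_PATHS = {
    "state_summary": "/state/summary",
    "next_steps": "/state/next_steps",
    "workplan_index": "/workstreams/workplan-index",
    "hub_inbox": "/messages/?to_agent=hub&unread_only=true",
}


def http_get_json(url, timeout):
    """Fetch a URL and decode the JSON body; raises on any network error."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def resolve_state_hub_context(query, fetch=http_get_json, timeout=5.0):
    """Resolve one named query; return {} on unknown names or any failure."""
    # The default URL is an assumption for local development only.
    base_url = os.environ.get("STATE_HUB_URL", "http://localhost:8000").rstrip("/")
    path = QUERY_PATHS.get(query)
    if path is None:
        return {}
    try:
        result = fetch(base_url + path, timeout)
    except Exception:
        # Degrade to offline context instead of failing the workflow.
        return {}
    # Assumption: list responses are wrapped so callers always get a dict.
    return result if isinstance(result, dict) else {"items": result}
```

Passing `fetch` as a parameter keeps the sketch testable without a running State Hub: a test can inject a stub that returns a fixture or raises a timeout.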
T03 - Implement Instruction Report Execution
id: CUST-WP-0045-T03
status: done
priority: high
depends_on: [CUST-WP-0045-T02]
state_hub_task_id: "e766ff2e-1887-49e6-9c66-598bb395e76c"
Finish the activity-core instruction/report execution path needed for judgement runs like daily triage.
Reuse the existing rule/instruction model from ACT-ADR-003:
- parse a fenced `instruction` block from the ActivityDefinition
- apply any instruction condition before running the report
- render the canonical prompt with explicit trusted context fields
- call the approved model/agent adapter through the existing org LLM path where available
- validate the output against a small daily-triage report schema
- record model, prompt hash, validation result, and source instruction id in the activity-core audit trail
This task should not introduce another scheduler or a one-off daily-triage script. The deliverable is a reusable instruction execution capability that this report can use and future judgement activities can share.
Done when activity-core can run a synthetic instruction ActivityDefinition and produce a validated report payload under test.
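The render, hash, validate, and record steps can be sketched as below. This is an illustration under stated assumptions: the required report fields (`summary`, `recommendations`), the `$name` placeholder syntax, and all function names are hypothetical and do not describe the schema or API that activity-core actually uses.

```python
import hashlib
from string import Template

# Hypothetical minimal report schema: field name -> expected Python type.
REQUIRED_FIELDS = {"summary": str, "recommendations": list}


def render_prompt(template_text, trusted_context):
    """Fill $placeholders from explicit trusted fields; missing keys raise KeyError."""
    return Template(template_text).substitute(trusted_context)


def prompt_hash(prompt):
    """Stable SHA-256 hex digest of the rendered prompt for the audit trail."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def validate_report(payload):
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    return errors


def build_audit_record(instruction_id, model, prompt, payload):
    """Combine the audit fields named in this task into one record."""
    errors = validate_report(payload)
    return {
        "source_instruction_id": instruction_id,
        "model": model,
        "prompt_hash": prompt_hash(prompt),
        "validation": "passed" if not errors else "failed",
        "errors": errors,
    }
```

Hashing the rendered prompt rather than the template means two runs with different context produce different hashes, which makes it easier to tell from the audit trail whether the model saw the same input.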
T04 - Add Working-Memory And State Hub Progress Sinks
id: CUST-WP-0045-T04
status: done
priority: high
depends_on: [CUST-WP-0045-T03]
state_hub_task_id: "04e56428-d3a8-4aa7-a6e1-172c974ece3a"
Add deterministic output sinks for report instructions.
For this activity, the sink must:
- write one dated note under `/home/worsch/the-custodian/memory/working/`
- post one State Hub progress event with `event_type: daily_triage`
- include the activity id, run id, scheduled time, and report summary
- be idempotent by activity-core run id and local date
- refuse to edit `canon/`, `workplans/`, or other canonical files
Done when a manual activity-core trigger creates exactly one working-memory note and one State Hub progress event, and a retry does not duplicate either.
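A working-memory sink with these properties might be sketched as follows. The function name, the `note_dir` parameter, and the use of the first eight characters of the run id in the filename are assumptions made for illustration (the filename shape mirrors the canary note recorded in the implementation notes); the real sink lives in activity-core.

```python
from pathlib import Path

# Top-level directories the sink must never write into.
FORBIDDEN_TOP_LEVEL = {"canon", "workplans"}


def write_working_memory_note(repo_root, run_id, local_date, body,
                              note_dir="memory/working"):
    """Write one dated note; return (path, created). Retries do not overwrite."""
    root = Path(repo_root).resolve()
    filename = f"daily-triage-{local_date}-{run_id[:8]}.md"
    target = (root / note_dir / filename).resolve()
    try:
        relative = target.relative_to(root)
    except ValueError:
        raise PermissionError(f"refusing to write outside the repo: {target}")
    if relative.parts[0] in FORBIDDEN_TOP_LEVEL:
        raise PermissionError(f"refusing to write canonical path: {relative}")
    target.parent.mkdir(parents=True, exist_ok=True)
    try:
        # Mode "x" fails if the file exists, so a retry cannot duplicate
        # or overwrite the note for the same run id and date.
        with open(target, "x", encoding="utf-8") as handle:
            handle.write(body)
    except FileExistsError:
        return target, False
    return target, True
```

Exclusive-create mode makes the idempotency check and the write a single filesystem operation, so two concurrent retries cannot both believe they created the note.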
T05 - Update And Validate The Daily Triage ActivityDefinition
id: CUST-WP-0045-T05
status: done
priority: medium
depends_on: [CUST-WP-0045-T02, CUST-WP-0045-T03, CUST-WP-0045-T04]
state_hub_task_id: "0c6d54ec-7ed1-4e80-9cfa-ccb914e65fbf"
Update activity-definitions/daily-statehub-wsjf-triage.md so it is executable
by activity-core.
Expected changes:
- keep the trigger at `20 7 * * *`, timezone `Europe/Berlin`
- keep `misfire_policy: skip`
- add the report instruction block that references the canonical prompt
- keep `enabled: false` until manual validation passes
- document the single-runner cutover rule in the file
Validate using activity-core's existing parser and sync commands with
ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian.
Done when the definition parses, syncs into activity-core, and appears as a paused Temporal schedule while disabled.
T06 - Canary Cutover And Disable Codex Automation
id: CUST-WP-0045-T06
status: blocked
priority: high
depends_on: [CUST-WP-0045-T05]
state_hub_task_id: "545162d7-0198-4519-a30b-06e88c6db915"
blocking_reason: "Needs an approved non-external LLM path for private State Hub digest data, or explicit operator approval for the external llm-connect backend."
needs_human: true
intervention_note: "Real cutover needs an approved non-external LLM path for private State Hub digest data, or explicit human approval for the external llm-connect backend after review."
Run the cutover safely.
Sequence:
- Manually trigger the activity-core definition and verify output.
- Pause or delete the Codex app automation `daily-state-hub-wsjf-triage`.
- Set the activity-core definition to `enabled: true`.
- Sync activity definitions and Temporal schedules.
- Confirm the Temporal schedule is unpaused and points at `RunActivityWorkflow`.
- Check the next 07:20 run for a working-memory note, State Hub progress event, ActivityRun row, and Temporal workflow history.
Done when activity-core is the only enabled runner and the first scheduled run has completed successfully.
T07 - Observability And Missed-Run Handling
id: CUST-WP-0045-T07
status: todo
priority: medium
depends_on: [CUST-WP-0045-T06]
state_hub_task_id: "b977c721-cadc-461f-8ffb-715d438e4c31"
Document and, where cheap, automate how to tell whether the daily run happened.
The runbook should include:
- Temporal schedule and workflow checks
- activity-core ActivityRun query
- State Hub `daily_triage` progress-event query
- working-memory note path check
- expected behavior when the activity-core host is offline at 07:20
- the chosen missed-run behavior: `skip`, not catch-up
Done when the operator can answer "did it run today?" from owned telemetry without inspecting Codex Desktop session internals.
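One way to automate the "did it run today?" check is to combine the three evidence surfaces into a single summary. The sketch below is hypothetical: it assumes the caller has already fetched State Hub progress events and ActivityRun rows as dictionaries, and the `created_at` and `scheduled_for` field names, like the assumption that they hold local-time ISO strings, are placeholders rather than confirmed API fields.

```python
from pathlib import Path


def did_it_run_today(working_dir, local_date, progress_events, activity_runs):
    """Summarise the three evidence surfaces for one local date (YYYY-MM-DD)."""
    # Surface 1: a dated working-memory note exists for this date.
    notes = sorted(Path(working_dir).glob(f"daily-triage-{local_date}-*.md"))
    # Surface 2: a daily_triage progress event was logged on this date.
    events = [
        event for event in progress_events
        if event.get("event_type") == "daily_triage"
        and str(event.get("created_at", "")).startswith(local_date)
    ]
    # Surface 3: an ActivityRun row was scheduled for this date.
    runs = [
        run for run in activity_runs
        if str(run.get("scheduled_for", "")).startswith(local_date)
    ]
    evidence = {
        "working_memory_note": bool(notes),
        "state_hub_event": bool(events),
        "activity_run": bool(runs),
    }
    # The run only counts as complete when all three surfaces agree.
    evidence["ran"] = all(evidence.values())
    return evidence
```

Reporting each surface separately, rather than a single boolean, distinguishes a skipped run (all three missing) from a partial failure such as a note without a progress event.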
T08 - Three Daily Runs And CUST-WP-0044 Calibration
id: CUST-WP-0045-T08
status: todo
priority: medium
depends_on: [CUST-WP-0045-T06, CUST-WP-0045-T07]
state_hub_task_id: "f4a985fd-8cce-4175-983e-cf3b437e19a5"
Run three consecutive daily canaries from activity-core and compare the recommendations with actual follow-up work.
Feed the result back into CUST-WP-0044-T06:
- calibrate WSJF scoring weights
- tune report length
- adjust loose-end detection thresholds
- confirm stale-but-intentionally-parked work is treated correctly
- decide whether daily notes are useful enough as a standing habit
Done when CUST-WP-0044 can close its calibration task using activity-core runs, not Codex app automation runs.
Implementation Notes - 2026-05-19
T01 is complete. The 2026-05-19 failed Codex automation run is captured in this workplan's context, and the runner boundary is explicit: activity-core owns the schedule, retries, context resolution, run log, and audit trail; State Hub stays the read model and progress sink; the-custodian owns the prompt and guardrails.
T02 is complete in activity-core. The existing state-hub context resolver now
supports the daily triage queries state_summary, next_steps,
workplan_index, and hub_inbox while preserving domain_summary and
repo_sbom_status. Resolver failures return {} so the workflow can degrade
to offline context instead of failing the whole run.
T03 is complete in activity-core. RunActivityWorkflow now evaluates
instruction blocks after rules, using the existing instruction executor and a
small llm-connect HTTP client boundary. Instruction results carry task specs,
optional report payloads, prompt hash, model, validation status, review flag,
and condition metadata. A lightweight daily triage report schema is available
at schemas/daily-triage-report.json so report payloads can be validated under
test before T04 wires the deterministic working-memory and State Hub sinks.
T04 is complete in activity-core. Instruction definitions can now declare
report_sinks; report payloads are persisted through deterministic sink code
instead of model-authored file operations. The first two sink types are
working-memory and state-hub-progress. Working-memory writes refuse
canonical Custodian canon/ and workplans/ paths and use run-id/date based
idempotency. State Hub progress posting deduplicates by activity run id and
instruction id before posting.
T05 is complete. The daily triage ActivityDefinition now uses a single trusted
scalar context.daily_triage_digest instead of raw State Hub JSON. The digest
is built in activity-core from safe identifiers, counts, statuses, priority
fields, health labels, and shortened titles, while excluding task descriptions,
message bodies, and other free-text command surfaces. The digest also carries a
deterministic_scoring extension marker so a later high-criticality path can
move especially high-gain/high-effort candidate scoring into code without
changing the ActivityDefinition contract.
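The digest idea (keep safe structured fields, shorten titles, drop free text) can be illustrated with a short sketch. The field whitelist, the 60-character title limit, and the function names are assumptions chosen for the example; the actual digest builder in activity-core may select different fields.

```python
from collections import Counter

# Hypothetical whitelist of structured fields that are safe to pass on.
SAFE_FIELDS = ("id", "status", "priority", "health")
MAX_TITLE_CHARS = 60  # hypothetical limit for shortened titles


def shorten(title, limit=MAX_TITLE_CHARS):
    """Collapse whitespace and truncate so titles cannot carry long free text."""
    title = " ".join(str(title).split())
    return title if len(title) <= limit else title[: limit - 3] + "..."


def build_digest(workstreams):
    """Build a digest from whitelisted fields only; everything else is dropped."""
    items = []
    for workstream in workstreams:
        item = {key: workstream[key] for key in SAFE_FIELDS if key in workstream}
        item["title"] = shorten(workstream.get("title", ""))
        items.append(item)
    by_status = Counter(item.get("status", "unknown") for item in items)
    return {"count": len(items), "by_status": dict(by_status), "items": items}
```

Because the builder copies only whitelisted keys, fields such as task descriptions or message bodies never reach the prompt even if State Hub adds new free-text fields later.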
T06 is partially validated but blocked before cutover. A local activity-core dev stack was started, the Custodian ActivityDefinition directory synced into activity-core, and the paused Temporal schedule for the disabled daily triage definition was created. The first sync exposed reusable activity-core gaps that were fixed there instead of bypassed here:
- file-authored ActivityDefinition slug ids now map to stable UUIDv5 DB ids
- schedule sync no longer uses raw `NOT IN :ids` SQL that asyncpg rejects
- ADR-style context sources without an explicit `name` validate against the domain model
- the worker now registers the existing instruction/report activities
Manual trigger canary evidence, using a local-only llm-connect mock response so no State Hub digest data left the workstation:
- workflow id: `activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-6a6e5950-2338-45c4-9054-573dda9c87cc`
- Temporal status: `COMPLETED`
- activity-core run id: `2164cb88-8415-5c96-9e31-e47a41cf4e67`
- working-memory note: `memory/working/daily-triage-2026-05-19-2164cb88.md`
- State Hub progress event: `e42c0ada-8111-4d88-9791-821252cd04a2`
The real Claude-backed llm-connect trigger was not run. The execution wrapper
blocked it because private State Hub workstream/task digest data would be sent
to an external LLM provider. Therefore the Codex app automation remains the
only enabled runner, the ActivityDefinition remains enabled: false, and T06
is blocked until there is either an approved local/private LLM backend or an
explicit operator decision to allow that external data flow.
Verification:
- `uv run pytest tests/test_state_hub_context_resolver.py -q`: 6 passed
- activity-core parser validation with `ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian`: parsed the daily triage definition, cron trigger, trusted instruction, and report sinks
- `uv run pytest -q` in activity-core: 107 passed, 1 skipped
- activity-core focused T06 validation: `uv run pytest tests/test_sync_activity_definitions.py tests/test_instruction_evaluation.py tests/test_report_sinks.py -q`: 10 passed
- activity-core full suite after T06 fixes: `uv run pytest -q`: 110 passed, 1 skipped
Acceptance Criteria
- The daily State Hub WSJF triage runs from activity-core, not Codex app cron.
- The Codex app automation is disabled or removed before the activity-core schedule is enabled.
- The daily run leaves all three evidence surfaces: working-memory note, State Hub `daily_triage` progress event, and activity-core ActivityRun/Temporal history.
- "Did it run today?" can be answered from State Hub and activity-core telemetry.
- A powered-off workstation no longer matters once activity-core is running on the chosen always-on host.
- If the chosen activity-core host is offline at 07:20, the missed run is skipped by policy and the absence is visible in the runbook checks.
- CUST-WP-0044's three-run calibration is completed using the new runner.
Notes
The immediate Codex app automation failure could be patched by chasing the Windows/WSL launch path issue. That is not the preferred durable fix. The preferred fix is to make the existing activity-core ActivityDefinition the primary runner and keep all scheduling, audit, context resolution, and failure visibility in owned infrastructure.