From 24d6f2d17803b0c6c8d9e8d80d16791e050816a7 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Sat, 23 May 2026 03:17:02 +0200
Subject: [PATCH] Activity core practicability trial stuff

---
 .../hourly-recently-on-scope.md               | 15 ++---
 ...-0045-activity-core-daily-triage-runner.md | 14 ++++-
 ...-hourly-recently-on-scope-activity-core.md | 59 ++++++++++++++-----
 3 files changed, 65 insertions(+), 23 deletions(-)

diff --git a/activity-definitions/hourly-recently-on-scope.md b/activity-definitions/hourly-recently-on-scope.md
index ff80ccc..6fd2696 100644
--- a/activity-definitions/hourly-recently-on-scope.md
+++ b/activity-definitions/hourly-recently-on-scope.md
@@ -3,10 +3,10 @@ id: "d104348c-d792-4377-943c-70a31e81a9bc"
 name: "Hourly RecentlyOnScope Reports"
 type: activity-definition
 version: "1.0"
-enabled: false
+enabled: true
 owner: custodian
 governance: custodian
-status: draft
+status: active
 created: "2026-05-22"
 trigger:
   type: cron
@@ -38,14 +38,15 @@ activity-core owns the hourly schedule and ActivityRun audit trail.
 
 ## Runner Status
 
-This definition is intentionally `enabled: false` until the manual canary
-passes.
+This definition is enabled after a successful manual canary against
+Railiance01 Temporal.
 
 Cutover boundary:
 
-- Codex app automation remains a fallback until `CUST-WP-0046-T06`.
-- This activity-core definition becomes the primary hourly reporting substrate
-  only after one manual run and one scheduled run leave expected evidence.
+- Codex app automation remains a fallback only if `CUST-WP-0046-T06` records
+  an explicit operator reason.
+- This activity-core definition is the primary hourly reporting substrate
+  after one manual run and one scheduled run leave expected evidence.
 - Do not run a Codex fallback and this activity-core hourly routine as parallel
   primary runners.
 
diff --git a/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md b/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md
index 98c191c..b933a2e 100644
--- a/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md
+++ b/workplans/CUST-WP-0045-activity-core-daily-triage-runner.md
@@ -10,7 +10,7 @@ topic_slug: custodian
 planning_priority: high
 planning_order: 45
 created: "2026-05-19"
-updated: "2026-05-21"
+updated: "2026-05-23"
 state_hub_workstream_id: "d9d9a3ec-f736-4041-beac-bb92c7ad314e"
 ---
 
@@ -496,6 +496,18 @@ Verification:
 
 `PYTHONPATH=. uv run pytest -q`: 175 passed
 
+## Implementation Notes - 2026-05-23
+
+`CUST-WP-0046` verified the separate hourly RecentlyOnScope activity-core
+schedule, but did not retire the Codex app automation
+`daily-state-hub-wsjf-triage`. That automation is still the fallback for this
+daily WSJF triage cutover while this workplan's ActivityDefinition remains
+`enabled: false`.
+
+The State Hub decision recorded under `CUST-WP-0046` is to keep the Codex
+automation active until this workplan completes its own daily WSJF canary and
+explicit pause/delete cutover step.
+
 ## Acceptance Criteria
 
 - The daily State Hub WSJF triage runs from activity-core, not Codex app cron.
diff --git a/workplans/CUST-WP-0046-hourly-recently-on-scope-activity-core.md b/workplans/CUST-WP-0046-hourly-recently-on-scope-activity-core.md
index 8b47ac8..f8aa528 100644
--- a/workplans/CUST-WP-0046-hourly-recently-on-scope-activity-core.md
+++ b/workplans/CUST-WP-0046-hourly-recently-on-scope-activity-core.md
@@ -4,13 +4,13 @@ type: workplan
 title: "Activity-Core Hourly RecentlyOnScope Reports"
 domain: custodian
 repo: the-custodian
-status: active
+status: blocked
 owner: codex
 topic_slug: custodian
 planning_priority: high
 planning_order: 46
 created: "2026-05-22"
-updated: "2026-05-22"
+updated: "2026-05-23"
 state_hub_workstream_id: "671153ff-55bc-4ace-aa97-f322ca76ab3c"
 ---
 
@@ -105,7 +105,7 @@ Preferred path:
 
 ```task
 id: CUST-WP-0046-T01
-status: in_progress
+status: done
 priority: high
 state_hub_task_id: "99e16a50-9775-4520-b343-f52fda0b67ec"
 ```
@@ -188,7 +188,7 @@ endpoint and record the result under test.
 
 ```task
 id: CUST-WP-0046-T04
-status: in_progress
+status: done
 priority: high
 depends_on: [CUST-WP-0046-T03]
 state_hub_task_id: "dcb20f5a-c446-48d6-b810-84de365c22fd"
@@ -215,7 +215,7 @@ paused Temporal schedule while disabled.
 
 ```task
 id: CUST-WP-0046-T05
-status: in_progress
+status: done
 priority: high
 depends_on: [CUST-WP-0046-T04]
 state_hub_task_id: "f5c0cf64-a8e9-4d8c-bd86-ca58cbf132c2"
@@ -249,7 +249,7 @@ status: blocked
 priority: high
 depends_on: [CUST-WP-0046-T05]
 state_hub_task_id: "2a46a6c8-4d3e-4064-a935-c90ca0c76a6d"
-blocking_reason: "Waiting for manual and scheduled activity-core canaries before retiring the Codex app fallback."
+blocking_reason: "Manual and scheduled hourly RecentlyOnScope canaries passed, but daily-state-hub-wsjf-triage is the CUST-WP-0045 daily WSJF fallback. Do not pause or delete it under CUST-WP-0046 until the daily WSJF activity-core definition has its own canary and cutover decision."
 ```
 
 Remove the Codex app automation fallback from the operating path.
@@ -340,16 +340,45 @@ Local canary evidence:
   had no failed domains. Progress event:
   `1ff091c3-227c-4a85-b5e2-172ba45a2676`.
 
-Remaining gates:
+Railiance01 cutover evidence - 2026-05-23:
 
-- T01 remains in progress until the deployed activity-core API/worker/Temporal
-  schedule path is checked from the operator host.
-- T04 remains in progress until the definition is synced into the live
-  activity-core DB and appears as a disabled/paused Temporal schedule.
-- T05 remains in progress until a real `RunActivityWorkflow` produces an
-  ActivityRun and one scheduled hourly canary completes.
-- T06 is blocked until T05 passes; the Codex app fallback should not be paused
-  or deleted before then.
+- Runtime substrate verified: activity-core API health returned healthy DB and
+  Temporal status, worker and API deployments rolled out, and Temporal CLI can
+  list schedules/workflows from the Railiance01 cluster.
+- Live State Hub access now uses an in-cluster bridge service
+  `actcore-state-hub-bridge` because the State Hub tunnel on Railiance01 is
+  bound to node-local `127.0.0.1:18000`. The activity-core runtime config now
+  points `STATE_HUB_URL` at `http://actcore-state-hub-bridge:8000`.
+- The Railiance01 manifest mounts
+  `actcore-external-activity-definitions/hourly-recently-on-scope.md` into the
+  sync, API, and worker pods. Sync logs show
+  `Hourly RecentlyOnScope Reports` upserted into the live activity-core DB.
+- The first live manual attempts recorded `{}` because the worker image still
+  had the old resolver behavior and the runtime URL was pointed at the wrong
+  service. activity-core now validates that the hourly State Hub response
+  contains `generated`, `skipped`, and `failed` before accepting the run.
+- Manual canary after image rollout:
+  workflow
+  `activity-d104348c-d792-4377-943c-70a31e81a9bc:manual-0be193ee-341d-441b-9bc7-e1696b96455b`,
+  ActivityRun `43070cac-2669-5e53-bb9e-88dc1000f3a1`, progress event
+  `96bf2f6b-38e3-4143-a49b-1b1275a9713a`, one generated `custodian` report,
+  ten skipped quiet domains, and no failed domains.
+- Scheduled canary via Temporal `schedule trigger`:
+  workflow
+  `activity-d104348c-d792-4377-943c-70a31e81a9bc:${firstScheduledTime}-2026-05-23T00:20:30Z`,
+  ActivityRun `ef741a08-3e21-5945-ae21-fdf9b1968438`, progress event
+  `8d284579-5652-4c6e-8783-36fde75f8ed4`, no generated reports for the quiet
+  window, eleven skipped quiet domains, and no failed domains.
+- The hourly Temporal schedule is unpaused with `misfire_policy: skip`; the
+  API reports the ActivityDefinition `enabled: true`.
+
+Remaining gate:
+
+- T06 remains blocked because `daily-state-hub-wsjf-triage` is a daily WSJF
+  triage fallback, not the hourly RecentlyOnScope runner. The daily
+  activity-core definition is still `enabled: false`, so pausing or deleting the
+  Codex app automation here would remove daily WSJF coverage before
+  `CUST-WP-0045` has its own canary.
 
 ## Acceptance Criteria
 