the-custodian/workplans/CUST-WP-0046-hourly-recently-on-scope-activity-core.md
2026-06-05 13:11:41 +02:00

id: CUST-WP-0046
type: workplan
title: Activity-Core Hourly RecentlyOnScope Reports
domain: custodian
repo: the-custodian
status: finished
owner: codex
topic_slug: custodian
planning_priority: high
planning_order: 46
created: 2026-05-22
updated: 2026-06-04
state_hub_workstream_id: 671153ff-55bc-4ace-aa97-f322ca76ab3c

CUST-WP-0046 - Activity-Core Hourly RecentlyOnScope Reports

Goal

Move routine reporting onto owned activity-core infrastructure by scheduling an hourly RecentlyOnScope generation run for every State Hub domain that was active in the last hour.

The immediate operational outcome is:

  • activity-core owns the hourly schedule and run telemetry
  • State Hub owns domain activity selection and RecentlyOnScope report generation
  • generated reports use the existing State Hub RecentlyOnScope API and report directory
  • Codex app automation fallback is retired after the activity-core path is verified

This work is the practical bridge from "Codex app fallback" to "owned activity-core operating habit" without introducing another cron mechanism.

Current Automation State - 2026-05-22

Current verified state:

  • Codex app automation daily-state-hub-wsjf-triage exists and is ACTIVE, scheduled daily at 07:20 Europe/Berlin.
  • Custodian ActivityDefinition activity-definitions/daily-statehub-wsjf-triage.md is still enabled: false.
  • The local machine currently has only State Hub Postgres running in Docker; the local activity-core dev stack is not running.
  • State Hub progress reports that activity-core was deployed to the Railiance01 K3s environment on 2026-05-22, with API health, worker schedule sync, Temporal, NATS, and tests verified.
  • State Hub already has deterministic RecentlyOnScope support from STATE-WP-0044:
    • POST /domains/{slug}/recently-on-scope/
    • GET /domains/{slug}/recently-on-scope/
    • GET /domains/{slug}/recently-on-scope/{report_id}
    • default report root reports/recently-on-scope/
    • default report range 1h
  • One current report artifact exists for custodian: reports/recently-on-scope/custodian/20260522T081157Z--1h.md.
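
As an illustration of the endpoint shapes listed above, the following sketch builds the three RecentlyOnScope URLs for one domain. The base URL and the helper name `recently_on_scope_urls` are hypothetical, and treating the artifact file stem (`20260522T081157Z--1h`) as the report id is an assumption; only the path templates come from this workplan.

```python
from typing import Dict, Optional
from urllib.parse import quote

# Hypothetical base URL; the real State Hub address depends on the deployment.
BASE_URL = "http://localhost:8000"


def recently_on_scope_urls(
    slug: str, report_id: Optional[str] = None, base_url: str = BASE_URL
) -> Dict[str, str]:
    """Build the per-domain RecentlyOnScope URLs for one domain slug."""
    # POST (generate) and GET (list) share the same collection path.
    collection = f"{base_url.rstrip('/')}/domains/{quote(slug, safe='')}/recently-on-scope/"
    urls = {"generate (POST)": collection, "list (GET)": collection}
    if report_id is not None:
        # Single-report path: the collection path followed by the report id.
        urls["fetch (GET)"] = collection + quote(report_id, safe="")
    return urls


if __name__ == "__main__":
    for label, url in recently_on_scope_urls("custodian", "20260522T081157Z--1h").items():
        print(f"{label}: {url}")
```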

Scope

In scope:

  • Add a State Hub batch/selection surface that can determine which active domains had activity in a given window and generate RecentlyOnScope reports for those domains.
  • Add an activity-core ActivityDefinition owned by this repo for an hourly RecentlyOnScope run.
  • Keep the scheduled run deterministic; no LLM is needed for the first version.
  • Record activity-core run evidence, State Hub progress, and generated report metadata for every hourly run.
  • Define retention/idempotency for hourly report artifacts.
  • Retire or pause the Codex app automation fallback after the activity-core routine is verified and the daily triage cutover decision is explicit.

Out of scope:

  • Replacing the existing RecentlyOnScope report template.
  • Rebuilding RecentlyOnScope generation inside activity-core.
  • Generating reports for inactive domains unless the operator explicitly asks for a full-domain sweep.
  • LLM summarization of RecentlyOnScope reports.
  • Automatically editing workplans, canon, or domain goals from the reports.

Design Direction

Preferred path:

  1. State Hub exposes a deterministic batch endpoint for RecentlyOnScope hourly runs. It owns active-domain detection because it already owns domains, topics, workstreams, tasks, decisions, progress events, and report storage.
  2. activity-core schedules an hourly ActivityDefinition that calls the State Hub batch endpoint. activity-core should not duplicate domain-selection SQL.
  3. The hourly run is idempotent by (window, domain_slug, range), reusing the existing report id format when possible.
  4. The batch endpoint returns report metadata and source counts. activity-core stores that as run audit context and posts a compact progress event.
  5. After at least one manual run and one scheduled hourly run produce expected evidence, the Codex app fallback can be paused or deleted.
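
The idempotency rule in step 3 can be sketched as follows. This is a minimal illustration, not the State Hub implementation: it assumes windows are aligned to the top of the hour (the workplan does not say so), and the key layout only imitates the `<timestamp>--<range>` shape of the existing report artifact name. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def hourly_window(now: datetime, range_hours: int = 1) -> Tuple[datetime, datetime]:
    """Return the (start, end) UTC window ending at the top of the current hour.

    `now` must be timezone-aware; a naive datetime would be read as local time.
    """
    end = now.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=range_hours), end


def idempotency_key(window_end: datetime, domain_slug: str, range_label: str = "1h") -> str:
    """Derive a stable key from (window, domain_slug, range)."""
    stamp = window_end.strftime("%Y%m%dT%H%M%SZ")
    return f"{domain_slug}/{stamp}--{range_label}"


if __name__ == "__main__":
    first = datetime(2026, 5, 22, 8, 11, 57, tzinfo=timezone.utc)
    rerun = datetime(2026, 5, 22, 8, 45, 0, tzinfo=timezone.utc)
    # Both calls fall in the same window, so they map to the same key.
    print(idempotency_key(hourly_window(first)[1], "custodian"))
    print(idempotency_key(hourly_window(rerun)[1], "custodian"))
```

Because the key depends only on the window, the domain, and the range, a rerun inside the same hour can detect the earlier report instead of writing a duplicate.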

Tasks

T01 - Confirm Runtime Substrate And Cutover Boundary

id: CUST-WP-0046-T01
status: done
priority: high
state_hub_task_id: "99e16a50-9775-4520-b343-f52fda0b67ec"

Confirm the activity-core production deployment on Railiance01 is usable as the primary scheduler before adding more automation.

Checks:

  • activity-core API health endpoint reachable from the operator path
  • worker running and connected to Temporal
  • Temporal UI or CLI can list schedules and recent workflows
  • State Hub URL and credentials/environment are available to activity-core
  • llm-connect is not required for this RecentlyOnScope path
  • current Codex automation fallback status is recorded before any change

Done when the hourly RecentlyOnScope rollout has a verified activity-core host and a written cutover boundary: Codex remains the fallback until T06, then is paused or deleted.

T02 - Add State Hub Active-Domain Batch Generation

id: CUST-WP-0046-T02
status: done
priority: high
depends_on: [CUST-WP-0046-T01]
state_hub_task_id: "c5004c0b-a261-407c-9376-a33883e054bf"

Extend State Hub's existing RecentlyOnScope functionality instead of creating a parallel generator in activity-core.

Required behavior:

  • accept a range such as 1h, defaulting to 1h
  • compute the window in UTC
  • select active domains with at least one qualifying source in the window: progress events, decisions, updated workstreams, or updated tasks
  • optionally include domains with open human-intervention items when configured
  • generate one RecentlyOnScope report per active domain using the existing generate_report() service
  • return metadata for generated/skipped/failed domains
  • log one State Hub progress event with event_type recently_on_scope_hourly

Proposed endpoint:

  • POST /recently-on-scope/hourly

Done when State Hub tests cover active-domain detection, no-active-domain behavior, report idempotency, partial failures, and report metadata response.
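
The active-domain rule above can be sketched in memory as follows. The record type, the kind labels, and the half-open window `[start, end)` are assumptions for illustration; the real selection runs against State Hub's own tables and is not shown in this workplan.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable


@dataclass(frozen=True)
class SourceRecord:
    """Hypothetical flattened view of one State Hub source row."""
    domain_slug: str
    kind: str  # "progress_event", "decision", "workstream", "task", ...
    timestamp: datetime


# Source kinds that qualify a domain as active, per the required behavior.
QUALIFYING_KINDS = {"progress_event", "decision", "workstream", "task"}


def active_domains(
    records: Iterable[SourceRecord], start: datetime, end: datetime
) -> Dict[str, Dict[str, int]]:
    """Return per-domain source counts for domains with qualifying activity."""
    counts: Dict[str, Dict[str, int]] = {}
    for record in records:
        if record.kind in QUALIFYING_KINDS and start <= record.timestamp < end:
            per_domain = counts.setdefault(record.domain_slug, {})
            per_domain[record.kind] = per_domain.get(record.kind, 0) + 1
    return counts


if __name__ == "__main__":
    start = datetime(2026, 5, 22, 7, 0, tzinfo=timezone.utc)
    end = datetime(2026, 5, 22, 8, 0, tzinfo=timezone.utc)
    records = [
        SourceRecord("custodian", "progress_event", datetime(2026, 5, 22, 7, 30, tzinfo=timezone.utc)),
        SourceRecord("markitect", "repository", datetime(2026, 5, 22, 7, 40, tzinfo=timezone.utc)),
    ]
    print(active_domains(records, start, end))
```

A domain that never appears in the returned mapping is one the batch run would report as skipped.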

T03 - Add activity-core State Hub Batch Invocation Capability

id: CUST-WP-0046-T03
status: done
priority: high
depends_on: [CUST-WP-0046-T02]
state_hub_task_id: "a33b379f-56ba-47d0-b73c-3e724b1c5d45"

Add the smallest reusable activity-core capability needed to invoke the State Hub batch endpoint from an ActivityDefinition.

Preferred implementation:

  • a deterministic State Hub action activity or sink, not an LLM instruction
  • bounded timeout and retry policy
  • response metadata persisted into the ActivityRun audit trail
  • clear failure behavior: failed batch run marks the Temporal workflow failed or emits a visible blocked/error progress event
  • no domain-selection logic inside activity-core

Done when a synthetic ActivityDefinition can call a mocked State Hub batch endpoint and record the result under test.
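
A bounded retry policy of the kind preferred above might look like the sketch below. It is plain Python for illustration only: in a Temporal-based worker the retry policy would more likely be configured on the activity itself. `call_with_retry` and its parameters are hypothetical names, and the per-request timeout is assumed to live inside the injected `call`.

```python
import time
from typing import Any, Callable


def call_with_retry(
    call: Callable[[], Any],
    attempts: int = 3,
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Invoke `call` up to `attempts` times with exponential backoff.

    Raises RuntimeError once every attempt has failed, so the caller can
    mark the run failed instead of silently recording an empty result.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:  # broad on purpose: any failure counts as an attempt
            last_error = exc
            if attempt < attempts:
                sleep(backoff_seconds * 2 ** (attempt - 1))
    raise RuntimeError(f"batch call failed after {attempts} attempts") from last_error
```

Injecting `sleep` keeps the helper testable without real delays, which suits the "mocked State Hub batch endpoint" test named in the done condition.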

T04 - Create Hourly RecentlyOnScope ActivityDefinition

id: CUST-WP-0046-T04
status: done
priority: high
depends_on: [CUST-WP-0046-T03]
state_hub_task_id: "dcb20f5a-c446-48d6-b810-84de365c22fd"

Create a domain-owned ActivityDefinition in activity-definitions/hourly-recently-on-scope.md.

Expected definition:

  • trigger: cron expression for hourly execution, timezone Europe/Berlin
  • misfire policy: skip
  • enabled: false until manual canary passes
  • action: call State Hub RecentlyOnScope hourly batch endpoint with range 1h
  • report/audit sink: State Hub progress event and activity-core run metadata
  • no LLM model configuration
  • clear note that this is the first owned routine replacing Codex fallback habits

Done when the definition parses, syncs into activity-core, and appears as a paused Temporal schedule while disabled.
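
The expected definition can be expressed as a small pre-sync check. The dictionary below is a hypothetical in-memory form, not the real ActivityDefinition schema: apart from `enabled` and `misfire_policy`, the field names are assumptions, and the cron expression `0 * * * *` (top of every hour) is only one possible hourly schedule.

```python
from typing import Any, Dict, List

# Hypothetical in-memory form of the definition; apart from `enabled` and
# `misfire_policy`, the field names are assumptions, not the real schema.
EXPECTED_DEFINITION: Dict[str, Any] = {
    "name": "Hourly RecentlyOnScope Reports",
    "trigger": {"cron": "0 * * * *", "timezone": "Europe/Berlin"},
    "misfire_policy": "skip",
    "enabled": False,
    "action": {"query": "recently_on_scope_hourly", "range": "1h"},
}


def definition_problems(definition: Dict[str, Any]) -> List[str]:
    """Return human-readable deviations from the expected pre-canary definition."""
    problems = []
    if definition.get("enabled") is not False:
        problems.append("enabled must stay false until the manual canary passes")
    if definition.get("misfire_policy") != "skip":
        problems.append("misfire_policy must be skip")
    if definition.get("trigger", {}).get("timezone") != "Europe/Berlin":
        problems.append("trigger timezone must be Europe/Berlin")
    if definition.get("action", {}).get("range") != "1h":
        problems.append("action range must be 1h")
    if "model" in definition or "llm" in definition:
        problems.append("no LLM model configuration is expected")
    return problems


if __name__ == "__main__":
    print(definition_problems(EXPECTED_DEFINITION))
```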

T05 - Manual Canary And Scheduled Canary

id: CUST-WP-0046-T05
status: done
priority: high
depends_on: [CUST-WP-0046-T04]
state_hub_task_id: "f5c0cf64-a8e9-4d8c-bd86-ca58cbf132c2"

Validate the hourly routine before enabling it as the standing habit.

Manual canary evidence:

  • manual trigger workflow id
  • ActivityRun row
  • State Hub progress event with event_type recently_on_scope_hourly
  • generated report paths for every active domain
  • no generated reports for inactive domains unless configured
  • no duplicate report when rerun for the same window

Scheduled canary evidence:

  • hourly schedule unpaused only after manual canary passes
  • next hourly run completes from activity-core
  • generated report metadata matches the active-domain window
  • missed-run behavior is confirmed as skip

Done when both manual and scheduled canaries leave complete evidence.

T06 - Retire Codex Automation Fallback

id: CUST-WP-0046-T06
status: done
priority: high
depends_on: [CUST-WP-0046-T05]
state_hub_task_id: "2a46a6c8-4d3e-4064-a935-c90ca0c76a6d"

Remove the Codex app automation fallback from the operating path.

Steps:

  • decide whether to pause or delete daily-state-hub-wsjf-triage
  • record the decision in State Hub
  • apply the Codex automation change
  • update CUST-WP-0045 notes so the daily triage handoff no longer points at an active Codex fallback
  • ensure there is exactly one active scheduled substrate for routine reports: activity-core

Done when Codex app automation no longer runs routine Custodian reporting and the owned activity-core schedule has verified evidence.

T07 - Observability, Runbook, And Retention

id: CUST-WP-0046-T07
status: done
priority: medium
depends_on: [CUST-WP-0046-T05]
state_hub_task_id: "01066ff8-f591-43f6-99e6-d2f61654e590"

Document how operators answer "did the hourly report run?" without opening Codex Desktop.

Runbook should include:

  • activity-core schedule check
  • Temporal workflow query
  • ActivityRun query
  • State Hub progress query for recently_on_scope_hourly
  • RecentlyOnScope report list endpoint
  • report directory and retention expectations
  • behavior when activity-core host is offline at the top of the hour

Done when the operator can verify hourly report health from State Hub and activity-core telemetry alone.

Implementation Evidence - 2026-05-22

Implemented pieces:

  • State Hub now exposes POST /recently-on-scope/hourly.
  • State Hub batch generation reuses the existing RecentlyOnScope collector, renderer, report id, and report directory.
  • The batch endpoint selects domains by qualifying activity in the requested window: progress events, decisions, updated workstreams, or updated tasks.
  • Domains with only registered repositories are skipped; domains with open human-intervention items can be included by setting include_attention: true.
  • The batch endpoint records one recently_on_scope_hourly progress event with generated, skipped, and failed domain metadata.
  • activity-core has a reusable State Hub resolver query recently_on_scope_hourly that POSTs to the batch endpoint.
  • activity-core context sources can now be marked required: true; required resolver failures fail the workflow instead of silently binding {}.
  • Custodian now owns activity-definitions/hourly-recently-on-scope.md, disabled until canary.
  • Operator verification notes live in docs/hourly-recently-on-scope-runbook.md.

Verification:

  • State Hub: /home/worsch/.local/bin/uv run pytest tests/test_recently_on_scope.py passed with 11 tests.
  • State Hub: /home/worsch/.local/bin/uv run pytest tests/test_recently_on_scope.py tests/test_mcp_smoke.py::TestAddProgressEvent passed with 14 tests.
  • activity-core: /home/worsch/.local/bin/uv run pytest tests/test_state_hub_context_resolver.py tests/test_sync_activity_definitions.py tests/test_schedule_lifecycle.py passed with 19 tests.
  • activity-core parser scan with ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian found Hourly RecentlyOnScope Reports.

Local canary evidence:

  • Direct State Hub batch canary generated reports for custodian and markitect, skipped 9 quiet active domains, and had no failed domains. Progress event: ff1e845f-df02-476e-a3f3-fb912e4f32a8.
  • Direct activity-core resolver canary invoked the same State Hub batch endpoint and generated reports for custodian and markitect, skipped 9 domains, and had no failed domains. Progress event: 1ff091c3-227c-4a85-b5e2-172ba45a2676.

Railiance01 cutover evidence - 2026-05-23:

  • Runtime substrate verified: activity-core API health returned healthy DB and Temporal status, worker and API deployments rolled out, and Temporal CLI can list schedules/workflows from the Railiance01 cluster.
  • Live State Hub access now uses an in-cluster bridge service actcore-state-hub-bridge because the State Hub tunnel on Railiance01 is bound to node-local 127.0.0.1:18000. The activity-core runtime config now points STATE_HUB_URL at http://actcore-state-hub-bridge:8000.
  • The Railiance01 manifest mounts actcore-external-activity-definitions/hourly-recently-on-scope.md into the sync, API, and worker pods. Sync logs show Hourly RecentlyOnScope Reports upserted into the live activity-core DB.
  • The first live manual attempts recorded {} because the worker image still had the old resolver behavior and the runtime URL was pointed at the wrong service. activity-core now validates that the hourly State Hub response contains generated, skipped, and failed before accepting the run.
  • Manual canary after image rollout: workflow activity-d104348c-d792-4377-943c-70a31e81a9bc:manual-0be193ee-341d-441b-9bc7-e1696b96455b, ActivityRun 43070cac-2669-5e53-bb9e-88dc1000f3a1, progress event 96bf2f6b-38e3-4143-a49b-1b1275a9713a, one generated custodian report, ten skipped quiet domains, and no failed domains.
  • Scheduled canary via Temporal schedule trigger: workflow activity-d104348c-d792-4377-943c-70a31e81a9bc:${firstScheduledTime}-2026-05-23T00:20:30Z, ActivityRun ef741a08-3e21-5945-ae21-fdf9b1968438, progress event 8d284579-5652-4c6e-8783-36fde75f8ed4, no generated reports for the quiet window, eleven skipped quiet domains, and no failed domains.
  • The hourly Temporal schedule is unpaused with misfire_policy: skip; the API reports the ActivityDefinition enabled: true.
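
The response validation mentioned in the cutover evidence, which rejects a payload such as `{}`, can be sketched as below. Only the three required keys come from this workplan; the function name and the error wording are hypothetical, and the shape of each value is left unchecked because it is not specified here.

```python
from typing import Any, Dict

# Keys the hourly State Hub response must contain before a run is accepted.
REQUIRED_KEYS = ("generated", "skipped", "failed")


def validate_hourly_response(payload: Any) -> Dict[str, Any]:
    """Return the payload unchanged, or raise ValueError if it is unusable."""
    if not isinstance(payload, dict):
        raise ValueError("hourly response must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError("hourly response missing keys: " + ", ".join(missing))
    return payload


if __name__ == "__main__":
    print(validate_hourly_response({"generated": [], "skipped": [], "failed": []}))
    try:
        validate_hourly_response({})
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Failing loudly on an incomplete payload means a misrouted URL or a stale worker image surfaces as a failed run instead of an empty audit record.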

Remaining gate:

  • T06 remains blocked because daily-state-hub-wsjf-triage is a daily WSJF triage fallback, not the hourly RecentlyOnScope runner. The daily activity-core definition is still enabled: false, so pausing or deleting the Codex app automation here would remove daily WSJF coverage before CUST-WP-0045 has its own canary.

This gate cleared on 2026-06-04 after CUST-WP-0045 finished its activity-core daily WSJF cutover and CUST-WP-0044 finished the three-run calibration.

Implementation Evidence - 2026-06-04

T06 is done.

  • Decision: keep the old Codex Desktop automation daily-state-hub-wsjf-triage paused rather than deleting it immediately. This preserves a recoverable fallback without allowing it to run as a second routine scheduler.
  • Evidence: local Codex automation metadata shows daily-state-hub-wsjf-triage with status = "PAUSED".
  • activity-definitions/daily-statehub-wsjf-triage.md is enabled: true and now names activity-core as the active owned runner.
  • CUST-WP-0045 is finished; its T08 calibration used the June 2-4 activity-core daily triage notes.
  • The 2026-06-04 calibration also updated the daily triage prompt/schema so future reports include explicit WSJF ranks and component scores.
  • Result: routine Custodian reporting no longer depends on an active Codex app automation fallback. Daily WSJF and hourly RecentlyOnScope now both have owned activity-core paths, so CUST-WP-0046 is finished.

Acceptance Criteria

  • Hourly RecentlyOnScope reports are generated by activity-core, not Codex app automation.
  • State Hub owns active-domain selection and report generation.
  • Activity-core owns scheduling, run history, and failure visibility.
  • The run uses existing RecentlyOnScope report storage and API surfaces.
  • A manual canary and a scheduled canary both pass.
  • Codex app automation fallback is paused or deleted after activity-core is verified.
  • No LLM provider is required for the hourly RecentlyOnScope routine.
  • Missed runs are skipped rather than replayed in a burst.

Notes

This work complements CUST-WP-0045. When this workplan was written, daily WSJF triage still needed its own LLM-backed canary before that specific report could be enabled in activity-core, whereas RecentlyOnScope is deterministic and could become the first always-on activity-core reporting habit sooner.