---
id: CUST-WP-0046
type: workplan
title: "Activity-Core Hourly RecentlyOnScope Reports"
domain: custodian
repo: the-custodian
status: finished
owner: codex
topic_slug: custodian
planning_priority: high
planning_order: 46
created: "2026-05-22"
updated: "2026-06-04"
state_hub_workstream_id: "671153ff-55bc-4ace-aa97-f322ca76ab3c"
---

# CUST-WP-0046 - Activity-Core Hourly RecentlyOnScope Reports

## Goal

Move routine reporting onto owned activity-core infrastructure by scheduling an
hourly RecentlyOnScope generation run for every State Hub domain that was active
in the last hour.

The immediate operational outcome is:

- activity-core owns the hourly schedule and run telemetry
- State Hub owns domain activity selection and RecentlyOnScope report generation
- generated reports use the existing State Hub RecentlyOnScope API and report
  directory
- Codex app automation fallback is retired after the activity-core path is
  verified

This work is the practical bridge from "Codex app fallback" to "owned
activity-core operating habit" without introducing another cron mechanism.

## Current Automation State - 2026-05-22

Current verified state:

- Codex app automation `daily-state-hub-wsjf-triage` exists and is `ACTIVE`,
  scheduled daily at 07:20 Europe/Berlin.
- Custodian ActivityDefinition
  `activity-definitions/daily-statehub-wsjf-triage.md` is still
  `enabled: false`.
- The local machine currently has only State Hub Postgres running in Docker;
  the local activity-core dev stack is not running.
- State Hub progress reports that activity-core was deployed to the Railiance01
  K3s environment on 2026-05-22, with API health, worker schedule sync, Temporal,
  NATS, and tests verified.
- State Hub already has deterministic RecentlyOnScope support from
  `STATE-WP-0044`:
  - `POST /domains/{slug}/recently-on-scope/`
  - `GET /domains/{slug}/recently-on-scope/`
  - `GET /domains/{slug}/recently-on-scope/{report_id}`
  - default report root `reports/recently-on-scope/`
  - default report range `1h`
- One current report artifact exists for `custodian`:
  `reports/recently-on-scope/custodian/20260522T081157Z--1h.md`.

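The artifact path above suggests a layout of `<report root>/<domain slug>/<UTC timestamp>--<range>.md`. The sketch below reproduces that layout as an illustration only: the layout is inferred from the single `custodian` artifact, and `report_path` is a hypothetical helper, not a State Hub function.

```python
from datetime import datetime, timezone


def report_path(root: str, domain_slug: str, generated_at: datetime,
                range_label: str = "1h") -> str:
    """Build a report path in the layout inferred from the example artifact.

    Assumption: the timestamp is rendered in UTC as YYYYMMDDTHHMMSSZ and is
    joined to the range label with a double hyphen.
    """
    if generated_at.tzinfo is None:
        raise ValueError("generated_at must be timezone-aware")
    stamp = generated_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{root.rstrip('/')}/{domain_slug}/{stamp}--{range_label}.md"


if __name__ == "__main__":
    when = datetime(2026, 5, 22, 8, 11, 57, tzinfo=timezone.utc)
    print(report_path("reports/recently-on-scope/", "custodian", when))
```

With the timestamp taken from the listed artifact, the helper yields the same path string as the `custodian` report above.
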
## Scope

In scope:

- Add a State Hub batch/selection surface that can determine which active
  domains had activity in a given window and generate RecentlyOnScope reports
  for those domains.
- Add an activity-core ActivityDefinition owned by this repo for an hourly
  RecentlyOnScope run.
- Keep the scheduled run deterministic; no LLM is needed for the first version.
- Record activity-core run evidence, State Hub progress, and generated report
  metadata for every hourly run.
- Define retention/idempotency for hourly report artifacts.
- Retire or pause the Codex app automation fallback after the activity-core
  routine is verified and the daily triage cutover decision is explicit.

Out of scope:

- Replacing the existing RecentlyOnScope report template.
- Rebuilding RecentlyOnScope generation inside activity-core.
- Generating reports for inactive domains unless the operator explicitly asks
  for a full-domain sweep.
- LLM summarization of RecentlyOnScope reports.
- Automatically editing workplans, canon, or domain goals from the reports.

## Design Direction

Preferred path:

1. State Hub exposes a deterministic batch endpoint for RecentlyOnScope hourly
   runs. It owns active-domain detection because it already owns domains,
   topics, workstreams, tasks, decisions, progress events, and report storage.
2. activity-core schedules an hourly ActivityDefinition that calls the State Hub
   batch endpoint. activity-core should not duplicate domain-selection SQL.
3. The hourly run is idempotent by `(window, domain_slug, range)`, reusing the
   existing report id format when possible.
4. The batch endpoint returns report metadata and source counts. activity-core
   stores that as run audit context and posts a compact progress event.
5. After at least one manual run and one scheduled hourly run produce expected
   evidence, the Codex app fallback can be paused or deleted.

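Step 3 can be illustrated with a small idempotency sketch. It assumes the window is aligned to the last completed UTC hour, which this workplan does not state, and it uses an in-memory registry where a real implementation would consult report storage. `hourly_window`, `idempotency_key`, and `ReportRegistry` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Tuple

Key = Tuple[str, str, str]


def hourly_window(now: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of the last completed UTC hour (assumed alignment)."""
    end = now.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=1), end


def idempotency_key(window_start: datetime, domain_slug: str,
                    range_label: str = "1h") -> Key:
    """Compose the (window, domain_slug, range) key named in step 3."""
    return (window_start.strftime("%Y%m%dT%H%M%SZ"), domain_slug, range_label)


class ReportRegistry:
    """In-memory stand-in for a lookup against existing report artifacts."""

    def __init__(self) -> None:
        self._seen: Dict[Key, str] = {}

    def generate_once(self, key: Key, generate: Callable[[], str]) -> Tuple[str, bool]:
        """Return (report_id, created); reuse the stored id on a rerun."""
        if key in self._seen:
            return self._seen[key], False
        report_id = generate()
        self._seen[key] = report_id
        return report_id, True
```

A rerun for the same window and domain returns the first report id with `created` set to `False`, which is the "no duplicate report" behavior the canary checks for.
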
## Tasks

### T01 - Confirm Runtime Substrate And Cutover Boundary

```task
id: CUST-WP-0046-T01
status: done
priority: high
state_hub_task_id: "99e16a50-9775-4520-b343-f52fda0b67ec"
```

Confirm the activity-core production deployment on Railiance01 is usable as the
primary scheduler before adding more automation.

Checks:

- activity-core API health endpoint reachable from the operator path
- worker running and connected to Temporal
- Temporal UI or CLI can list schedules and recent workflows
- State Hub URL and credentials/environment are available to activity-core
- llm-connect is not required for this RecentlyOnScope path
- current Codex automation fallback status is recorded before any change

Done when the hourly RecentlyOnScope rollout has a verified activity-core host
and a written cutover boundary: Codex remains fallback until T06, then is
paused/deleted.

### T02 - Add State Hub Active-Domain Batch Generation

```task
id: CUST-WP-0046-T02
status: done
priority: high
depends_on: [CUST-WP-0046-T01]
state_hub_task_id: "c5004c0b-a261-407c-9376-a33883e054bf"
```

Extend State Hub's existing RecentlyOnScope functionality instead of creating a
parallel generator in activity-core.

Required behavior:

- accept a range such as `1h`, defaulting to `1h`
- compute the window in UTC
- select active domains with at least one qualifying source in the window:
  progress events, decisions, updated workstreams, or updated tasks
- optionally include domains with open human-intervention items when configured
- generate one RecentlyOnScope report per active domain using the existing
  `generate_report()` service
- return metadata for generated/skipped/failed domains
- log one State Hub progress event with event_type
  `recently_on_scope_hourly`

Proposed endpoint:

- `POST /recently-on-scope/hourly`

Done when State Hub tests cover active-domain detection, no-active-domain
behavior, report idempotency, partial failures, and report metadata response.

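The required behavior can be sketched as a pure function over already-fetched source rows. This is a minimal illustration, not the State Hub implementation: `SourceRow`, `parse_range`, and `run_batch` are hypothetical, the half-open window `[start, end)` is an assumption, and the `generate` callable stands in for the existing `generate_report()` service, whose real signature is not shown in this workplan.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Iterable, List, Set

# Source kinds that count as activity, per the list above (labels are assumed).
QUALIFYING_SOURCES = ("progress_event", "decision", "workstream_update", "task_update")


@dataclass(frozen=True)
class SourceRow:
    domain_slug: str
    source: str
    occurred_at: datetime  # timezone-aware


def parse_range(label: str) -> timedelta:
    """Parse a range label such as '1h' or '30m' into a timedelta."""
    units = {"m": "minutes", "h": "hours", "d": "days"}
    if len(label) < 2 or label[-1] not in units or not label[:-1].isdigit():
        raise ValueError(f"unsupported range: {label!r}")
    return timedelta(**{units[label[-1]]: int(label[:-1])})


def run_batch(rows: Iterable[SourceRow], active_domains: Set[str], now: datetime,
              generate: Callable[[str, datetime, datetime], str],
              range_label: str = "1h") -> Dict[str, List]:
    """Generate one report per domain with qualifying activity in the window."""
    end = now.astimezone(timezone.utc)
    start = end - parse_range(range_label)
    with_activity = {
        r.domain_slug for r in rows
        if r.source in QUALIFYING_SOURCES and start <= r.occurred_at < end
    }
    result: Dict[str, List] = {"generated": [], "skipped": [], "failed": []}
    for slug in sorted(active_domains):
        if slug not in with_activity:
            result["skipped"].append(slug)
            continue
        try:
            report_id = generate(slug, start, end)
            result["generated"].append({"domain": slug, "report_id": report_id})
        except Exception as exc:  # one failing domain must not abort the batch
            result["failed"].append({"domain": slug, "error": str(exc)})
    return result
```

Catching the exception per domain keeps partial failures visible in the `failed` list while the remaining domains still get their reports.
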
### T03 - Add activity-core State Hub Batch Invocation Capability

```task
id: CUST-WP-0046-T03
status: done
priority: high
depends_on: [CUST-WP-0046-T02]
state_hub_task_id: "a33b379f-56ba-47d0-b73c-3e724b1c5d45"
```

Add the smallest reusable activity-core capability needed to invoke the State
Hub batch endpoint from an ActivityDefinition.

Preferred implementation:

- a deterministic State Hub action activity or sink, not an LLM instruction
- bounded timeout and retry policy
- response metadata persisted into the ActivityRun audit trail
- clear failure behavior: failed batch run marks the Temporal workflow failed or
  emits a visible blocked/error progress event
- no domain-selection logic inside activity-core

Done when a synthetic ActivityDefinition can call a mocked State Hub batch
endpoint and record the result under test.

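A bounded retry with strict response validation might look like the sketch below. It is not the activity-core resolver: `invoke_with_retry` and `validate_batch_response` are hypothetical names, the HTTP call is injected as a plain callable so the example stays transport-agnostic, and the attempt count and backoff values are illustrative. The only detail taken from this workplan is that an accepted response must contain `generated`, `skipped`, and `failed`.

```python
import time
from typing import Any, Callable, Dict, Mapping

REQUIRED_KEYS = ("generated", "skipped", "failed")


class BatchResponseError(RuntimeError):
    """Raised when the batch call cannot produce an acceptable response."""


def validate_batch_response(payload: Any) -> Dict[str, Any]:
    """Reject payloads such as {} that lack the expected result lists."""
    if not isinstance(payload, Mapping):
        raise BatchResponseError("response is not a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise BatchResponseError(f"response missing keys: {', '.join(missing)}")
    return dict(payload)


def invoke_with_retry(call: Callable[[], Any], attempts: int = 3,
                      backoff_seconds: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call the endpoint up to `attempts` times with exponential backoff."""
    last_error: Exception | None = None
    for attempt in range(1, attempts + 1):
        try:
            return validate_batch_response(call())
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                sleep(backoff_seconds * 2 ** (attempt - 1))
    raise BatchResponseError(f"batch call failed after {attempts} attempts") from last_error
```

Raising after the final attempt, instead of returning an empty mapping, is what lets the workflow fail visibly rather than record `{}` as a successful run.
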
### T04 - Create Hourly RecentlyOnScope ActivityDefinition

```task
id: CUST-WP-0046-T04
status: done
priority: high
depends_on: [CUST-WP-0046-T03]
state_hub_task_id: "dcb20f5a-c446-48d6-b810-84de365c22fd"
```

Create a domain-owned ActivityDefinition in
`activity-definitions/hourly-recently-on-scope.md`.

Expected definition:

- trigger: cron expression for hourly execution, timezone `Europe/Berlin`
- misfire policy: `skip`
- enabled: `false` until manual canary passes
- action: call State Hub RecentlyOnScope hourly batch endpoint with range `1h`
- report/audit sink: State Hub progress event and activity-core run metadata
- no LLM model configuration
- clear note that this is the first owned routine replacing Codex fallback
  habits

Done when the definition parses, syncs into activity-core, and appears as a
paused Temporal schedule while disabled.

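The intent of the `skip` misfire policy can be shown with a small model of what happens after downtime. The exact semantics of activity-core's `skip` are assumed here (missed slots are dropped and the schedule resumes at the next slot); `plan_after_downtime` and the contrasting `catch_up` policy are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


def plan_after_downtime(last_fired: datetime, now: datetime, policy: str = "skip",
                        interval: timedelta = timedelta(hours=1)
                        ) -> Tuple[List[datetime], datetime]:
    """Return (slots to replay now, next regular slot) after an outage.

    Slots are last_fired + k * interval. Slots at or before `now` were missed.
    """
    missed: List[datetime] = []
    slot = last_fired + interval
    while slot <= now:
        missed.append(slot)
        slot += interval
    # After the loop, `slot` is the first scheduled time strictly after `now`.
    if policy == "skip":
        return [], slot
    if policy == "catch_up":
        return missed, slot
    raise ValueError(f"unknown misfire policy: {policy!r}")


if __name__ == "__main__":
    last = datetime(2026, 5, 23, 0, 0, tzinfo=timezone.utc)
    back_online = datetime(2026, 5, 23, 3, 20, tzinfo=timezone.utc)
    print(plan_after_downtime(last, back_online, "skip"))
```

Under this model, a host that was offline for three top-of-hour slots replays nothing and simply runs at the next slot, which avoids a burst of back-to-back reports.
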
### T05 - Manual Canary And Scheduled Canary

```task
id: CUST-WP-0046-T05
status: done
priority: high
depends_on: [CUST-WP-0046-T04]
state_hub_task_id: "f5c0cf64-a8e9-4d8c-bd86-ca58cbf132c2"
```

Validate the hourly routine before enabling it as the standing habit.

Manual canary evidence:

- manual trigger workflow id
- ActivityRun row
- State Hub progress event with event_type `recently_on_scope_hourly`
- generated report paths for every active domain
- no generated reports for inactive domains unless configured
- no duplicate report when rerun for the same window

Scheduled canary evidence:

- hourly schedule unpaused only after manual canary passes
- next hourly run completes from activity-core
- generated report metadata matches the active-domain window
- missed-run behavior is confirmed as `skip`

Done when both manual and scheduled canaries leave complete evidence.

### T06 - Retire Codex Automation Fallback

```task
id: CUST-WP-0046-T06
status: done
priority: high
depends_on: [CUST-WP-0046-T05]
state_hub_task_id: "2a46a6c8-4d3e-4064-a935-c90ca0c76a6d"
```

Remove the Codex app automation fallback from the operating path.

Steps:

- decide whether to pause or delete `daily-state-hub-wsjf-triage`
- record the decision in State Hub
- apply the Codex automation change
- update CUST-WP-0045 notes so the daily triage handoff no longer points at an
  active Codex fallback
- ensure there is exactly one active scheduled substrate for routine reports:
  activity-core

Done when Codex app automation no longer runs routine Custodian reporting and
the owned activity-core schedule has verified evidence.

### T07 - Observability, Runbook, And Retention

```task
id: CUST-WP-0046-T07
status: done
priority: medium
depends_on: [CUST-WP-0046-T05]
state_hub_task_id: "01066ff8-f591-43f6-99e6-d2f61654e590"
```

Document how operators answer "did the hourly report run?" without opening
Codex Desktop.

Runbook should include:

- activity-core schedule check
- Temporal workflow query
- ActivityRun query
- State Hub progress query for `recently_on_scope_hourly`
- RecentlyOnScope report list endpoint
- report directory and retention expectations
- behavior when activity-core host is offline at the top of the hour

Done when the operator can verify hourly report health from State Hub and
activity-core telemetry alone.

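One way to turn the State Hub progress query into a yes/no answer is sketched below. The event shape (`id`, `event_type`, `created_at`, `metadata`) and the 90-minute staleness threshold are assumptions for illustration, and `hourly_report_health` is a hypothetical helper; only the event type `recently_on_scope_hourly` and the `failed` metadata list come from this workplan.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable


def hourly_report_health(events: Iterable[Dict[str, Any]], now: datetime,
                         max_age: timedelta = timedelta(minutes=90)) -> Dict[str, Any]:
    """Classify hourly report health from already-fetched progress events."""
    hourly = [e for e in events if e.get("event_type") == "recently_on_scope_hourly"]
    if not hourly:
        return {"status": "missing", "latest": None}
    latest = max(hourly, key=lambda e: e["created_at"])
    age = now - latest["created_at"]
    failed = latest.get("metadata", {}).get("failed", [])
    if age > max_age:
        status = "stale"      # no run recorded recently enough
    elif failed:
        status = "degraded"   # the run happened but some domains failed
    else:
        status = "ok"
    return {
        "status": status,
        "latest": latest["id"],
        "age_minutes": int(age.total_seconds() // 60),
    }
```

A threshold somewhat above one hour tolerates a run that finishes late without flagging a healthy schedule as stale; the right value depends on the retention and alerting expectations set in the runbook.
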
## Implementation Evidence - 2026-05-22

Implemented pieces:

- State Hub now exposes `POST /recently-on-scope/hourly`.
- State Hub batch generation reuses the existing RecentlyOnScope collector,
  renderer, report id, and report directory.
- The batch endpoint selects domains by qualifying activity in the requested
  window: progress events, decisions, updated workstreams, or updated tasks.
- Domains with only registered repositories are skipped; domains with open
  human-intervention items can be included by setting `include_attention: true`.
- The batch endpoint records one `recently_on_scope_hourly` progress event with
  generated, skipped, and failed domain metadata.
- activity-core has a reusable State Hub resolver query
  `recently_on_scope_hourly` that POSTs to the batch endpoint.
- activity-core context sources can now be marked `required: true`; required
  resolver failures fail the workflow instead of silently binding `{}`.
- Custodian now owns
  `activity-definitions/hourly-recently-on-scope.md`, disabled until canary.
- Operator verification notes live in
  `docs/hourly-recently-on-scope-runbook.md`.

Verification:

- State Hub:
  `/home/worsch/.local/bin/uv run pytest tests/test_recently_on_scope.py`
  passed with 11 tests.
- State Hub:
  `/home/worsch/.local/bin/uv run pytest tests/test_recently_on_scope.py tests/test_mcp_smoke.py::TestAddProgressEvent`
  passed with 14 tests.
- activity-core:
  `/home/worsch/.local/bin/uv run pytest tests/test_state_hub_context_resolver.py tests/test_sync_activity_definitions.py tests/test_schedule_lifecycle.py`
  passed with 19 tests.
- activity-core parser scan with
  `ACTIVITY_DEFINITION_DIRS=/home/worsch/the-custodian` found
  `Hourly RecentlyOnScope Reports`.

Local canary evidence:

- Direct State Hub batch canary generated reports for `custodian` and
  `markitect`, skipped 9 quiet active domains, and had no failed domains.
  Progress event: `ff1e845f-df02-476e-a3f3-fb912e4f32a8`.
- Direct activity-core resolver canary invoked the same State Hub batch endpoint
  and generated reports for `custodian` and `markitect`, skipped 9 domains, and
  had no failed domains.
  Progress event: `1ff091c3-227c-4a85-b5e2-172ba45a2676`.

Railiance01 cutover evidence - 2026-05-23:

- Runtime substrate verified: activity-core API health returned healthy DB and
  Temporal status, worker and API deployments rolled out, and Temporal CLI can
  list schedules/workflows from the Railiance01 cluster.
- Live State Hub access now uses an in-cluster bridge service
  `actcore-state-hub-bridge` because the State Hub tunnel on Railiance01 is
  bound to node-local `127.0.0.1:18000`. The activity-core runtime config now
  points `STATE_HUB_URL` at `http://actcore-state-hub-bridge:8000`.
- The Railiance01 manifest mounts
  `actcore-external-activity-definitions/hourly-recently-on-scope.md` into the
  sync, API, and worker pods. Sync logs show
  `Hourly RecentlyOnScope Reports` upserted into the live activity-core DB.
- The first live manual attempts recorded `{}` because the worker image still
  had the old resolver behavior and the runtime URL was pointed at the wrong
  service. activity-core now validates that the hourly State Hub response
  contains `generated`, `skipped`, and `failed` before accepting the run.
- Manual canary after image rollout:
  workflow
  `activity-d104348c-d792-4377-943c-70a31e81a9bc:manual-0be193ee-341d-441b-9bc7-e1696b96455b`,
  ActivityRun `43070cac-2669-5e53-bb9e-88dc1000f3a1`, progress event
  `96bf2f6b-38e3-4143-a49b-1b1275a9713a`, one generated `custodian` report,
  ten skipped quiet domains, and no failed domains.
- Scheduled canary via Temporal `schedule trigger`:
  workflow
  `activity-d104348c-d792-4377-943c-70a31e81a9bc:${firstScheduledTime}-2026-05-23T00:20:30Z`,
  ActivityRun `ef741a08-3e21-5945-ae21-fdf9b1968438`, progress event
  `8d284579-5652-4c6e-8783-36fde75f8ed4`, no generated reports for the quiet
  window, eleven skipped quiet domains, and no failed domains.
- The hourly Temporal schedule is unpaused with `misfire_policy: skip`; the
  API reports the ActivityDefinition `enabled: true`.

Remaining gate:

- T06 remains blocked because `daily-state-hub-wsjf-triage` is a daily WSJF
  triage fallback, not the hourly RecentlyOnScope runner. The daily
  activity-core definition is still `enabled: false`, so pausing or deleting the
  Codex app automation here would remove daily WSJF coverage before
  `CUST-WP-0045` has its own canary.

This gate cleared on 2026-06-04 after `CUST-WP-0045` finished its activity-core
daily WSJF cutover and `CUST-WP-0044` finished the three-run calibration.

## Implementation Evidence - 2026-06-04

T06 is done.

- Decision: keep the old Codex Desktop automation
  `daily-state-hub-wsjf-triage` paused rather than deleting it immediately.
  This preserves a recoverable fallback without allowing it to run as a second
  routine scheduler.
- Evidence: local Codex automation metadata shows
  `daily-state-hub-wsjf-triage` with `status = "PAUSED"`.
- `activity-definitions/daily-statehub-wsjf-triage.md` is `enabled: true` and
  now names activity-core as the active owned runner.
- `CUST-WP-0045` is finished; its T08 calibration used the June 2-4
  activity-core daily triage notes.
- The 2026-06-04 calibration also updated the daily triage prompt/schema so
  future reports include explicit WSJF ranks and component scores.
- Result: routine Custodian reporting no longer depends on an active Codex app
  automation fallback. Daily WSJF and hourly RecentlyOnScope now both have
  owned activity-core paths, so `CUST-WP-0046` is finished.

## Acceptance Criteria

- Hourly RecentlyOnScope reports are generated by activity-core, not Codex app
  automation.
- State Hub owns active-domain selection and report generation.
- Activity-core owns scheduling, run history, and failure visibility.
- The run uses existing RecentlyOnScope report storage and API surfaces.
- A manual canary and a scheduled canary both pass.
- Codex app automation fallback is paused or deleted after activity-core is
  verified.
- No LLM provider is required for the hourly RecentlyOnScope routine.
- Missed runs are skipped rather than replayed in a burst.

## Notes

This work complements `CUST-WP-0045`. At planning time, daily WSJF triage still
needed its own LLM-backed canary before that specific report could be enabled in
activity-core, whereas RecentlyOnScope is deterministic and could become the
first always-on activity-core reporting habit sooner. The daily WSJF cutover
was completed under `CUST-WP-0045` by 2026-06-04.