Add automation status surface

2026-07-01 20:12:04 +02:00
parent 3f85274916
commit ffe10f098e
20 changed files with 1732 additions and 11 deletions


@@ -4,11 +4,11 @@ type: workplan
title: "Post-triage operational hardening"
domain: custodian
repo: activity-core
-status: active
+status: finished
owner: codex
topic_slug: custodian
created: "2026-06-03"
-updated: "2026-06-27"
+updated: "2026-06-30"
state_hub_workstream_id: "5646e13a-13af-4724-bca6-3c0d86f96733"
---
@@ -104,7 +104,7 @@ and emitted a validated `daily_triage` report plus working-memory note.
```task
id: ACTIVITY-WP-0006-T03
-status: wait
+status: done
priority: medium
state_hub_task_id: "7cbf0a35-71a1-47ac-afc2-f51ad2180fd0"
```
@@ -203,6 +203,27 @@ ACTIVITY-WP-0016 output-robustness bundle and runtime prompt/token changes, not
a missing schedule. T03 stays wait until a post-deployment smoke passes and three
new clean scheduled runs are collected.
2026-06-30 early checkpoint: two new clean scheduled runs exist after the
validation failures. State Hub daily_triage progress shows 2026-06-28
05:20:51Z run `6a44d6dd-3f02-53f2-a5d8-d42b76b0ef98` and 2026-06-29
05:20:49Z run `1dfb47c9-07bf-551b-b778-1d21a40bd95c`, both with
`output_validated=true` and working-memory notes written. The current local time
was 2026-06-30 01:37 Europe/Berlin, before the expected 07:20 Berlin scheduled
fire, so the three-clean-run gate cannot close yet. Recheck after 2026-06-30
05:20Z; if that scheduled run validates, the clean streak is 06-28 / 06-29 /
06-30 and T03 can close with calibration feedback.
2026-06-30 closeout: the 07:20 Berlin scheduled run fired at 05:20:50Z as run
`ac3d71a0-2f8f-50df-b3ce-7c60c2abb5c5` with `output_validated=true` and a
working-memory note written. The post-failure clean streak is now complete:
2026-06-28 (`6a44d6dd`), 2026-06-29 (`1dfb47c9`), and 2026-06-30 (`ac3d71a0`).
Calibration feedback: the scheduler, worker, llm-connect route, State Hub sink,
and working-memory sink are stable again; the recommendations were operationally
useful but too dense at 10 items, repeatedly emphasizing human-dependency and
infrastructure-unblock work. ACTIVITY-WP-0016 now owns the density/contract fix:
Railiance runtime projection was aligned to a top-7 contract so the next live
run can prove the bounded output posture. T03 is done.
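The three-clean-run gate and the 07:20 Berlin to 05:20Z mapping used in the checkpoints above can be sketched as follows. This is a minimal illustration, not the repo's implementation: `expected_fire_utc` and `clean_streak` are hypothetical helper names, and the `runs` mapping (date to `output_validated`) is an assumed shape.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

BERLIN = ZoneInfo("Europe/Berlin")


def expected_fire_utc(day: date, local_time: time = time(7, 20)) -> datetime:
    """Return the UTC instant of a daily fire scheduled at `local_time` Berlin time."""
    local = datetime.combine(day, local_time, tzinfo=BERLIN)
    return local.astimezone(timezone.utc)


def clean_streak(runs: dict[date, bool], end: date, required: int = 3) -> bool:
    """True when the `required` consecutive days ending at `end` all validated.

    `runs` maps a scheduled date to its output_validated flag (assumed shape);
    a missing day counts as not clean.
    """
    days = [end - timedelta(days=offset) for offset in range(required)]
    return all(runs.get(day) is True for day in days)


# Dates and flags taken from the checkpoint notes above.
observed = {
    date(2026, 6, 28): True,
    date(2026, 6, 29): True,
    date(2026, 6, 30): True,
}
gate_closed = clean_streak(observed, end=date(2026, 6, 30))
```

Treating a missing day as not clean keeps the gate conservative: the streak only closes on positive evidence for every day in the window.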
## Rule Action Contract Documentation
```task


@@ -8,7 +8,7 @@ status: active
owner: codex
topic_slug: custodian
created: "2026-06-26"
-updated: "2026-06-27"
+updated: "2026-06-30"
state_hub_workstream_id: "4ef0d53b-1777-41ae-80c6-1b69fdb34726"
---
@@ -144,11 +144,21 @@ Done when:
`tests/fixtures/wp0016/daily_triage_2026-06-26_validation_failure.partial.json`
(the 4000-char preview + validation error; full payload pending the remote pull).
2026-06-30 local retention hardening: activity-core now preserves future
llm-connect diagnostic metadata instead of dropping it at the client boundary.
`LLMConnectClient.complete()` still returns the content string for compatibility,
but records safe non-secret response fields such as `finish_reason` and `usage`
on `last_response_metadata`; the executor copies that into report artifacts,
State Hub progress detail, and working-memory notes. Invalid report raw previews
were raised from 4000 to 12000 chars. This does not recover the historical
06-26 full payload or producer-side `finish_reason`, so T01 remains wait on the
remote llm-connect log pull, but the retention gap is closed for future failures.
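The retention pattern described above (return the content string, keep safe metadata on the side) might look roughly like this. It is a sketch under assumptions: `SketchLLMClient` is a hypothetical stand-in for `LLMConnectClient`, the response is assumed to be a dict with a `content` key, and the allow-list contains only the two fields the note names.

```python
from typing import Any, Callable

# Assumed allow-list: only the fields named in the note above.
SAFE_FIELDS = ("finish_reason", "usage")
RAW_PREVIEW_LIMIT = 12000  # raised from 4000 per the note above


class SketchLLMClient:
    """Hypothetical stand-in for LLMConnectClient; the transport is injected."""

    def __init__(self, transport: Callable[[str], dict[str, Any]]) -> None:
        self._transport = transport
        self.last_response_metadata: dict[str, Any] = {}

    def complete(self, prompt: str) -> str:
        response = self._transport(prompt)
        # Copy only allow-listed fields so secrets never reach report artifacts.
        self.last_response_metadata = {
            key: response[key] for key in SAFE_FIELDS if key in response
        }
        # Still return the bare content string for caller compatibility.
        return response["content"]


def bounded_preview(raw: str, limit: int = RAW_PREVIEW_LIMIT) -> str:
    """Truncate raw model output to a bounded preview for evidence records."""
    return raw[:limit]
```

An allow-list, rather than a deny-list, means a new upstream response field stays out of evidence until someone decides it is safe to record.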
## Schema + Prompt Redesign For Error Locality
```task
id: ACTIVITY-WP-0016-T02
-status: progress
+status: done
priority: high
state_hub_task_id: "ae67ca8c-ee01-4a8d-9e8a-a0a36c999758"
```
@@ -209,6 +219,21 @@ Apply there:
4. State the value vocabularies (`action`, `confidence`) the T04 guardrails will
check.
2026-06-30 live evidence check: the 2026-06-28 and 2026-06-29 scheduled
`daily_triage` events validated successfully, which shows the runtime is no
longer failing every day. However, the preserved State Hub reports still contain
10 recommendations rather than the requested bounded top-7 / framed-item contract.
Treat that as evidence that the runtime-projected prompt/schema/max-token bundle
has not fully absorbed the T02 handoff yet.
2026-06-30 source projection closeout: patched `k8s/railiance/20-runtime.yaml`
so the projected `daily-statehub-wsjf-triage.md` prompt now says at most 7
recommendations and instructs the model to emit fewer well-formed items rather
than more. The projected `daily-triage-report.json` now has `maxItems: 7` and
`rank.maximum: 7`, aligned with the repo schema. `max_tokens: 1800` remains as
headroom for the bounded report. T02 is done in source; live deployment and an
observed <=7 recommendation run remain under T05.
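The `maxItems: 7` and `rank.maximum: 7` bounds can be checked against a stored report without a schema library. The sketch below assumes a `recommendations` array whose items carry an integer `rank`, and assumes ranks start at 1; neither the field layout nor the lower bound is confirmed by the notes above.

```python
MAX_ITEMS = 7  # mirrors maxItems / rank.maximum from the projected schema


def bounded_report_errors(report: dict, max_items: int = MAX_ITEMS) -> list[str]:
    """Return human-readable violations of the top-N contract (empty if clean)."""
    errors: list[str] = []
    items = report.get("recommendations", [])  # assumed field name
    if len(items) > max_items:
        errors.append(f"{len(items)} recommendations exceeds maxItems {max_items}")
    for index, item in enumerate(items):
        rank = item.get("rank")
        # Lower bound of 1 is an assumption; the note only states the maximum.
        if not isinstance(rank, int) or not 1 <= rank <= max_items:
            errors.append(f"item {index}: rank {rank!r} outside 1..{max_items}")
    return errors


# A report shaped like the pre-fix live runs: 10 items ranked 1..10.
dense = {"recommendations": [{"rank": n} for n in range(1, 11)]}
dense_errors = bounded_report_errors(dense)
```

Running this over a preserved State Hub report is one way to confirm the "observed <=7 recommendation run" that remains open under T05.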
## Boundary Parser — Verify & Mitigate (Posture B)
```task
@@ -368,6 +393,19 @@ Done when:
is cluster/operator work outside this repo's SCOPE. T05 therefore stays
`progress` until that live run exists; the in-repo deliverables are done.
2026-06-30 follow-up: added forward-looking diagnostics so future validation
failures carry llm-connect response metadata and a larger bounded raw-output
preview in activity-core-owned evidence. Focused verification passed:
`uv run pytest tests/test_llm_client.py tests/rules/test_executor.py tests/test_report_sinks.py -q`
=> 39 passed. This improves future root-cause ability but does not replace the
required live smoke proving graceful degradation on railiance01.
2026-06-30 projection follow-up: local source projection now enforces the top-7
prompt/schema contract. Remaining T05 proof is operational: deploy or sync the
updated `k8s/railiance/20-runtime.yaml`, run `actcore-sync`/schedule smoke or wait
for the next 07:20 Berlin fire, then confirm State Hub `daily_triage` evidence is
`output_validated=true` with no more than 7 recommendations.
## Relationships
- **Blocks / feeds:** `ACTIVITY-WP-0006-T03` (three clean scheduled runs) and


@@ -0,0 +1,248 @@
---
id: ACTIVITY-WP-0018
type: workplan
title: "Own-infrastructure automation status surface"
domain: infotech
repo: activity-core
status: finished
owner: codex
topic_slug: automation-observability
created: "2026-06-29"
updated: "2026-06-29"
state_hub_workstream_id: "0220b38b-7c73-4601-9601-5f2c1a5b29e8"
---
# Own-infrastructure automation status surface
## Goal
Make activity-core's own scheduling and evidence infrastructure the explicit
operating preference for durable automations, independent of any coding
assistant-provided scheduler or reminder system.
An operator should be able to answer a question like "How did our automations go
since Friday?" with a repo-native command that does not require an LLM. Coding
assistants may inspect or summarize that command's output, but they must not be
the source of truth for scheduled execution, run history, or operational
evidence.
## Review notes
The repo already owns the correct infrastructure direction:
- `SCOPE.md` defines activity-core as the org-wide event bridge for cron,
one-off scheduled datetime, and event-triggered automation.
- `Makefile` exposes sync and service targets, but no operator status target for
recent automation outcomes.
- `docs/runbook.md` documents daily-triage verification through
`scripts/verify_daily_triage.py`, but that helper is activity-specific and
still reads like a checklist rather than the baseline answer surface for all
automations.
- Existing workplan evidence shows the status question is operationally common:
2026-06-24 and 2026-06-25 daily triage runs were clean, while 2026-06-26 and
2026-06-27 fired on schedule but failed output validation. That distinction is
exactly what the baseline command must make obvious.
## Task: Codify the own-infra scheduling preference
```task
id: ACTIVITY-WP-0018-T01
status: done
priority: high
state_hub_task_id: "00127678-5ce4-4cb3-b81c-f42e04407c73"
```
Record the repository preference that durable automation scheduling, execution
history, and run evidence belong to activity-core's own infrastructure: Temporal
Schedules, NATS JetStream, activity-core run records, State Hub progress, and
working-memory/report sinks.
Acceptance:
- `AGENTS.md` repo-specific instructions say not to use coding
assistant-provided automation tooling as the execution or evidence source for
activity-core automations.
- `SCOPE.md` and `docs/runbook.md` describe coding assistants as callers or
summarizers of repo-native automation commands, not as schedulers.
- The preference distinguishes durable automation from harmless local session
reminders: production/operational recurrence belongs to activity-core.
- The text names the authoritative evidence sources and avoids tying the policy
to any one assistant product.
2026-06-29 progress: Added the immediate repo-agent instruction in AGENTS.md
that durable activity-core automations must use repo-owned infrastructure, not
coding assistant automation/reminder/heartbeat tooling, as the execution or
evidence source. Remaining T01 work is to carry the same preference into
SCOPE.md and docs/runbook.md.
## Task: Define the automation status evidence contract
```task
id: ACTIVITY-WP-0018-T02
status: done
priority: high
state_hub_task_id: "17e6bb87-d4bf-4ef3-b91c-4bdfe2fe3492"
```
Define a small, deterministic report contract for answering recent automation
status questions across all ActivityDefinitions.
Acceptance:
- The contract covers schedule state, expected fires in the requested window,
observed workflow runs, `activity_runs` rows, State Hub progress events,
working-memory/report sink evidence, and known validation or sink failures.
- It defines normalized statuses such as `completed`, `running`, `retrying`,
`validation_failed`, `sink_failed`, `missed`, `disabled`, and `unknown`.
- Partial data is explicit: if Temporal, Postgres, State Hub, or a sink path is
unavailable, the report includes warnings rather than silently passing or
failing the whole check.
- The contract is safe for operator logs: no secrets, prompts, raw model output,
or credential-bearing URLs.
- The contract can be emitted as JSON for scripts and rendered as concise text
for humans.
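One possible shape for such a contract is sketched below. The status names come from the acceptance list above; everything else (`ActivityStatus`, `StatusReport`, and their field names) is hypothetical and only illustrates how explicit warnings and dual JSON/text rendering could fit together.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class RunStatus(Enum):
    COMPLETED = "completed"
    RUNNING = "running"
    RETRYING = "retrying"
    VALIDATION_FAILED = "validation_failed"
    SINK_FAILED = "sink_failed"
    MISSED = "missed"
    DISABLED = "disabled"
    UNKNOWN = "unknown"


@dataclass
class ActivityStatus:
    activity: str
    status: RunStatus
    expected_fires: int = 0
    observed_runs: int = 0


@dataclass
class StatusReport:
    since: str
    until: str
    activities: list[ActivityStatus] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)  # unavailable sources land here

    def to_json(self) -> str:
        payload = {
            "since": self.since,
            "until": self.until,
            "activities": [
                {
                    "activity": a.activity,
                    "status": a.status.value,
                    "expected_fires": a.expected_fires,
                    "observed_runs": a.observed_runs,
                }
                for a in self.activities
            ],
            "warnings": self.warnings,
        }
        return json.dumps(payload, sort_keys=True)

    def to_text(self) -> str:
        lines = [
            f"{a.activity}: {a.status.value} "
            f"({a.observed_runs}/{a.expected_fires} fires observed)"
            for a in self.activities
        ]
        lines += [f"warning: {w}" for w in self.warnings]
        return "\n".join(lines)
```

Keeping warnings as a separate list, instead of folding them into a status, lets a report say "Temporal was unreachable" without changing what it claims about any individual activity.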
## Task: Implement the non-LLM automation status CLI
```task
id: ACTIVITY-WP-0018-T03
status: done
priority: high
state_hub_task_id: "7831f2fc-8b76-48fe-aa34-9dcc11ee84db"
```
Add a deterministic CLI, likely under `scripts/automation_status.py` or an
`activity_core` module, that answers recent automation status questions without
calling an LLM.
Acceptance:
- Supports `--since`, `--until`, activity name/id filters, JSON output, and a
concise human summary.
- Accepts simple operator dates, including absolute dates and a documented
`friday`/`last-friday` style shortcut, resolving them to concrete dates in the
configured timezone.
- Inspects all enabled scheduled ActivityDefinitions by default, not just daily
triage.
- Uses live sources when configured: Postgres `activity_definitions` /
`activity_runs`, Temporal schedule and workflow visibility, State Hub
progress, and configured local report sink paths.
- Degrades usefully when a source is unavailable and exits non-zero only for
real status failures or invalid input, not for optional evidence gaps that are
clearly reported.
- Includes focused unit tests with fixture data for clean runs, validation
failures, missed runs, disabled schedules, and partial-source availability.
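The date-shortcut requirement can be illustrated with a small resolver. This is a sketch, not the delivered CLI code: `resolve_operator_date` is a hypothetical name, and the semantics are an assumption (`friday` means the most recent Friday including today, `last-friday` means the most recent Friday strictly before today). The caller is expected to pass `now` already localized to the configured timezone, for example `datetime.now(ZoneInfo("Europe/Berlin"))`.

```python
from datetime import date, datetime, timedelta, timezone

FRIDAY = 4  # date.weekday(): Monday is 0, Friday is 4


def resolve_operator_date(value: str, now: datetime) -> date:
    """Resolve an ISO date or a friday/last-friday shortcut to a concrete date.

    `now` must be timezone-aware in the configured timezone so that "today"
    is the operator's today, not the server's.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    today = now.date()
    key = value.strip().lower()
    if key in ("friday", "last-friday"):
        days_back = (today.weekday() - FRIDAY) % 7
        if key == "last-friday" and days_back == 0:
            days_back = 7  # on a Friday, "last-friday" means the previous week
        return today - timedelta(days=days_back)
    return date.fromisoformat(key)  # raises ValueError on invalid input


# Monday 2026-06-29 in a +02:00 zone resolves "friday" to 2026-06-26.
monday = datetime(2026, 6, 29, 9, 0, tzinfo=timezone(timedelta(hours=2)))
since = resolve_operator_date("friday", monday)
```

Resolving the shortcut once, up front, means the report can print the concrete window it used, which keeps the output reproducible when it is read later.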
## Task: Add the Make target baseline
```task
id: ACTIVITY-WP-0018-T04
status: done
priority: high
state_hub_task_id: "451bdf62-b619-4ace-9262-46d20b912781"
```
Expose the CLI through a Make target that is easy for an operator or any coding
assistant to run before attempting a prose summary.
Acceptance:
- `make automation-status SINCE=2026-06-26` prints the human-readable baseline.
- `make automation-status SINCE=friday` is supported or documented with the
exact accepted shortcut.
- A JSON form is available, either through `FORMAT=json` or a separate target
such as `make automation-status-json`.
- The target does not require LLM credentials, coding assistant automation
tooling, or interactive prompts.
- `make help` lists the target with a clear one-line description.
## Task: Update operator docs and examples
```task
id: ACTIVITY-WP-0018-T05
status: done
priority: medium
state_hub_task_id: "233659aa-e14a-4b3d-b156-d04f0fa16db6"
```
Update the runbook so "How did automations go since Friday?" has an obvious
operator recipe.
Acceptance:
- `docs/runbook.md` has a short "Automation status" section near the scheduling
operations.
- The docs include example output or a compact sample for the known daily
triage distinction: fired on time versus completed successfully versus output
validation failure.
- The docs clarify that LLM summaries are optional convenience only; the Make
target output is the baseline evidence.
- The daily-triage-specific helper is either kept as a lower-level diagnostic or
folded into the generalized status command.
## Task: Verify against recent scheduled-run evidence
```task
id: ACTIVITY-WP-0018-T06
status: done
priority: medium
state_hub_task_id: "24efbe9f-dfff-482f-9edc-456379c9a2aa"
```
Prove the new surface against the recent evidence that motivated this workplan.
Acceptance:
- Running the status command over the window starting Friday, 2026-06-26 shows
that the daily triage schedule fired on 2026-06-26 and 2026-06-27 but did not
produce clean validated reports.
- The command distinguishes scheduling health from output/schema validation
failure.
- Disabled or waiting schedules, such as the weekly coding retro gate when its
upstream read model is not available, are reported without being counted as
missed runs.
- Verification results are recorded in this workplan and as a State Hub progress
note once the implementation lands.
## Implementation Result
Completed 2026-06-29: implemented the own-infrastructure automation status
surface and codified the scheduling preference.
Delivered:
- `AGENTS.md` now states that durable activity-core automations use repo-owned
infrastructure, not coding assistant automation/reminder/heartbeat tooling, as
execution or evidence authority.
- `SCOPE.md` and `docs/runbook.md` describe the deterministic status surface and
assistant boundary.
- `src/activity_core/automation_status.py` and `scripts/automation_status.py`
provide the non-LLM CLI.
- `make automation-status SINCE=...` and `make automation-status-json` expose the
baseline operator commands.
- `tests/test_automation_status.py` covers date shortcuts, cron fire estimation,
completed runs, validation failures, missed runs, disabled schedules, partial
source availability, and working-memory evidence parsing.
Verification:
```bash
python3 -m py_compile src/activity_core/automation_status.py scripts/automation_status.py tests/test_automation_status.py
/home/worsch/.local/bin/uv run pytest tests/test_automation_status.py tests/test_daily_triage_verifier.py -q
/home/worsch/.local/bin/uv run python scripts/automation_status.py \
  --since 2026-06-26 --until 2026-06-27 --db-url '' \
  --progress-event-type daily_triage --timeout-seconds 10 \
  --working-memory-dir /tmp --format json
```
Results:
- focused tests: `11 passed`;
- `make help` lists `automation-status` and `automation-status-json`;
- the 2026-06-26 through 2026-06-27 status run exited `1` as expected because
State Hub evidence classified daily triage activity
`6fca51fa-387a-4fd0-bc4e-d62c29eb859a` as `validation_failed` with two
non-secret evidence records: 2026-06-26 `Expecting ',' delimiter` and
2026-06-27 `Unterminated string`;
- the same report classified the gated weekly coding retro as `disabled`, not
`missed`.
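The exit behaviour recorded above (exit `1` for `validation_failed`, while `disabled` and source warnings do not fail the run) follows a simple rule that can be sketched as below. Which statuses count as real failures is an assumption here; the workplan only confirms that `validation_failed` does and `disabled` does not.

```python
# Assumed failure set; only validation_failed is confirmed by the results above.
FAILURE_STATUSES = frozenset({"validation_failed", "sink_failed", "missed"})


def exit_code(statuses: list[str], warnings: list[str]) -> int:
    """Return 1 when any activity has a real failure status, else 0.

    `warnings` (optional evidence gaps) are accepted but deliberately ignored:
    they are reported to the operator without changing the exit code.
    """
    return 1 if any(status in FAILURE_STATUSES for status in statuses) else 0


# Mirrors the recorded run: daily triage failed validation, weekly retro gated.
recorded = exit_code(["validation_failed", "disabled"], warnings=[])
```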


@@ -0,0 +1,164 @@
---
id: ACTIVITY-WP-0019
type: workplan
title: "Automation schedule inventory Make targets"
domain: infotech
repo: activity-core
status: ready
owner: codex
topic_slug: automation-inventory
created: "2026-06-29"
updated: "2026-06-29"
state_hub_workstream_id: "21c73763-9adc-42f6-8fd2-1b8b33c2c770"
---
# Automation schedule inventory Make targets
## Goal
Provide a repo-native, non-LLM way to list every scheduled automation that
activity-core knows about.
`ACTIVITY-WP-0018` added the status surface for questions like "How did our
automations go since Friday?". The next operator question is the inventory
baseline: "What automations are scheduled at all?" That should be answerable
through Make targets backed by activity-core's own ActivityDefinitions,
database, and Temporal schedule metadata when available, independent of any
coding assistant automation infrastructure.
## Review notes
- `Makefile` currently exposes `automation-status` and
`automation-status-json`, but no dedicated inventory/list target.
- `scripts/automation_status.py` and `src/activity_core/automation_status.py`
already load scheduled ActivityDefinitions and compute their Temporal schedule
ids. The inventory target should reuse that parsing/loading posture where it
fits rather than creating a second discovery path.
- `make sync-schedules` reconciles Temporal schedules from the
`activity_definitions` database, but it is an action target, not a read-only
operator inventory command.
- The inventory command should remain useful in degraded local mode: file-backed
definitions are enough to list configured scheduled automations, while live
DB and Temporal visibility can enrich the output.
## Task: Define the automation inventory contract
```task
id: ACTIVITY-WP-0019-T01
status: todo
priority: high
state_hub_task_id: "8de24590-f9ee-4d0e-8692-b7ada9f232ed"
```
Define the fields and source precedence for a deterministic scheduled
automation inventory report.
Acceptance:
- The report includes every ActivityDefinition with `trigger_type` of `cron` or
`scheduled`, including disabled definitions.
- Each row includes id, name, enabled/disabled state, trigger type, schedule
expression or one-shot datetime, timezone, overlap/catchup policy when known,
and the derived Temporal schedule id.
- The report identifies its source for each row: database, repo definition file,
Temporal visibility, or a combination.
- If Temporal is reachable, the report adds paused/missing/drift hints without
mutating schedules.
- Missing optional sources produce warnings, not silent omissions.
- The JSON shape is stable enough for scripts and tests.
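A row type along these lines would cover the fields listed above. It is a hypothetical sketch: the class name, field names, and source labels are illustrative, and the Temporal schedule id is treated as an opaque string because this workplan does not spell out how it is derived.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class InventoryRow:
    id: str
    name: str
    enabled: bool
    trigger_type: str                 # "cron" or "scheduled"
    schedule: str                     # cron expression or one-shot ISO datetime
    timezone: str
    temporal_schedule_id: str         # opaque here; derivation not shown
    sources: tuple[str, ...]          # e.g. ("database", "temporal")
    overlap_policy: Optional[str] = None   # None when not known
    catchup_policy: Optional[str] = None   # None when not known


def inventory_json(rows: list[InventoryRow], warnings: list[str]) -> str:
    """Emit rows sorted by name with sorted keys so the output diffs cleanly."""
    payload = {
        "rows": [asdict(row) for row in sorted(rows, key=lambda row: row.name)],
        "warnings": list(warnings),
    }
    return json.dumps(payload, sort_keys=True, indent=2)
```

Sorting both rows and keys is what makes the JSON "stable enough for scripts and tests": two runs over the same sources produce byte-identical output.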
## Task: Implement a non-mutating inventory CLI
```task
id: ACTIVITY-WP-0019-T02
status: todo
priority: high
state_hub_task_id: "538cb9a5-48f3-470c-8518-29ee66c96678"
```
Add a deterministic CLI path for listing scheduled automations without requiring
LLM credentials or coding assistant tooling.
Acceptance:
- A script or module command, likely sharing code with
`activity_core.automation_status`, supports human and JSON output.
- The command is read-only: it does not call `sync-schedules`, upsert schedules,
delete schedules, enqueue workflows, or write State Hub evidence.
- It supports filters by activity id, activity name, enabled state, and trigger
type.
- It loads from the database when configured and falls back to repo definition
files when the database is unavailable or explicitly disabled.
- It optionally enriches rows from Temporal when `TEMPORAL_HOST` is configured,
with bounded timeouts so an unreachable service does not hang the command.
- Unit tests cover DB rows, file fallback, disabled definitions, Temporal
enrichment unavailable, and JSON output.
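The database-then-files fallback could be structured as below. This is a sketch with hypothetical names: the two loaders are injected callables standing in for whatever the real CLI uses, and the returned source label and warning wording are illustrative.

```python
from typing import Callable, Optional

Definitions = list[dict]


def load_definitions(
    load_db: Optional[Callable[[], Definitions]],
    load_files: Callable[[], Definitions],
) -> tuple[Definitions, str, list[str]]:
    """Return (definitions, source label, warnings), preferring the database.

    Pass load_db=None when the database is explicitly disabled.
    """
    warnings: list[str] = []
    if load_db is None:
        warnings.append("database disabled; using repo definition files")
    else:
        try:
            return load_db(), "database", warnings
        except Exception as exc:
            # Record only the exception type: messages may embed
            # credential-bearing connection URLs.
            warnings.append(
                f"database unavailable ({type(exc).__name__}); "
                "using repo definition files"
            )
    return load_files(), "repo_file", warnings
```

Both loaders are read-only by construction in this shape: the function never writes anywhere, so the non-mutating guarantee reduces to auditing the two callables passed in.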
## Task: Add Make targets
```task
id: ACTIVITY-WP-0019-T03
status: todo
priority: high
state_hub_task_id: "f2001721-07f3-42f5-a15e-0c7d1b0ed801"
```
Expose the inventory command through Make targets that are easy for humans,
scripts, and coding assistants to run before asking for a prose summary.
Acceptance:
- `make automation-list` prints a concise human-readable inventory.
- `make automation-list-json` emits the same inventory as JSON.
- Optional Make variables pass through cleanly, for example `ENABLED=true`,
`TRIGGER=cron`, `ACTIVITY_ID=<uuid>`, or `FORMAT=json`.
- `make help` lists both targets with clear one-line descriptions.
- The targets do not require LLM access, coding assistant automation tooling, or
interactive prompts.
## Task: Document the inventory workflow
```task
id: ACTIVITY-WP-0019-T04
status: todo
priority: medium
state_hub_task_id: "f687743b-3936-413e-ae50-d35484ae9a81"
```
Update operator documentation so the scheduled automation inventory path is
discoverable next to the status path.
Acceptance:
- `docs/runbook.md` documents `make automation-list` and
`make automation-list-json`.
- The docs distinguish inventory from status: inventory answers what is
configured; status answers what happened in a time window.
- The docs state that the command is read-only and uses activity-core-owned
scheduling evidence.
- The docs include a compact example of the expected human output.
## Task: Verify against current repo and live/degraded sources
```task
id: ACTIVITY-WP-0019-T05
status: todo
priority: medium
state_hub_task_id: "5317b532-5cef-4eff-b6d8-3e85bbca8e8a"
```
Prove the target against the current scheduled automation definitions and
degraded local conditions.
Acceptance:
- `make automation-list` shows the current scheduled automations, including
daily triage and weekly scheduled definitions when present in the selected
source.
- JSON output is valid and includes the same rows.
- A DB-unavailable run falls back to repo definition files or reports a clear
warning if no definitions are discoverable.
- A Temporal-unavailable run exits successfully with Temporal warnings rather
than hanging.
- Focused tests pass and the result is recorded in this workplan before the
workplan is moved to `finished`.