docs(state-hub): weekend automation assessment and repair workplans
Persist the Fri-evening→Sun-afternoon automation gap assessment in history/, and add STATE-WP-0063 (repair broken paths and cluster reachability) plus STATE-WP-0064 (move State Hub consistency sync to Railiance01 via activity-core). Workplans registered in State Hub via fix-consistency.
history/20260621-weekend-automation-assessment.md
@@ -0,0 +1,103 @@
---
id: STATE-HIST-20260621-WEEKEND-AUTOMATION
type: assessment
title: "Weekend automation gap assessment (Fri evening → Sun afternoon)"
domain: custodian
repo: state-hub
created: "2026-06-21"
assessor: grok
session_boundary: "2026-06-19T19:22Z (last interactive milestone)"
---

# Weekend automation gap assessment

Assessment window: **Friday 2026-06-19 ~21:22 CEST** (session close) through
**Sunday 2026-06-21 ~16:00 CEST** (resumption). Sources: State Hub
`/progress/` API, activity-definition files, local `journalctl` for
`custodian-sync.service`, and crontab.

## Scheduled automation landscape

activity-core runs on **Railiance01 (K3s + Temporal)**, not on the WSL
workstation. Custodian-owned ActivityDefinitions in
`the-custodian/activity-definitions/`:

| Activity | Schedule | Enabled | Effect |
|----------|----------|---------|--------|
| Hourly RecentlyOnScope | `0 * * * *` Europe/Berlin | yes | `POST /recently-on-scope/hourly` |
| Daily State Hub WSJF Triage | `20 7 * * *` Europe/Berlin | yes | LLM triage → `daily_triage` progress event |
| Ops Service Inventory Probes | `15 * * * *` Europe/Berlin | no | HTTP service probes |

activity-core repo definitions: weekly SBOM staleness (Mon 09:00, enabled);
weekly coding retro (Sat 19:00, disabled).

**Not yet on activity-core:** the 15-minute workplan↔DB consistency sweep
still uses the local **`custodian-sync.timer`** systemd user unit.

## What ran automatically

### Friday evening (before/at session close)

- **20:00 CEST** (`18:00 UTC`): Last successful hourly RecentlyOnScope run.
  Generated one `helix_forge` digest; 13 domains skipped.
- **After 20:00 CEST**: No further hourly runs recorded (19:00 and 20:00 UTC
  slots absent from progress events).

### Saturday 2026-06-20

- **Zero** State Hub progress events for the entire day.
- **07:20 CEST daily WSJF triage**: did not run (no `daily_triage` event; no new
  file in `the-custodian/memory/working/` since 2026-06-18).
- Workstation journal shows **no `custodian-sync` activity** between
  Fri ~21:18 and Sun ~15:50 CEST, consistent with sleep/hibernate or WSL not
  running.

### Sunday 2026-06-21 (before interactive session)

- **07:20 CEST daily WSJF triage**: did not run.
- **16:00 CEST** (`14:00 UTC`): Hourly RecentlyOnScope resumed. Result:
  **0 generated, 14 skipped, 0 failed** (all domains quiet).
- **After 16:09 CEST**: Interactive grok/custodian session work (not automation).

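The missed hourly slots can be listed mechanically from the two runs recorded above. The following is a minimal sketch, not the method used for this assessment: `missing_hourly_slots` is a hypothetical helper, and the input list stands in for timestamps that would in practice come from the State Hub progress events.

```python
from datetime import datetime, timedelta, timezone


def missing_hourly_slots(observed, start, end):
    """Return top-of-hour UTC slots in [start, end] that have no observed run."""
    def to_slot(t):
        # Normalize to UTC and truncate to the hour the run belongs to.
        return t.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)

    seen = {to_slot(t) for t in observed}
    slot = to_slot(start)
    missing = []
    while slot <= end:
        if slot not in seen:
            missing.append(slot)
        slot += timedelta(hours=1)
    return missing


# The only two hourly runs recorded in the assessment window.
observed = [
    datetime(2026, 6, 19, 18, 0, tzinfo=timezone.utc),  # Fri 20:00 CEST
    datetime(2026, 6, 21, 14, 0, tzinfo=timezone.utc),  # Sun 16:00 CEST
]

gaps = missing_hourly_slots(observed, observed[0], observed[-1])
print(len(gaps), gaps[0].isoformat(), gaps[-1].isoformat())
```

With these two timestamps the sketch reports 43 missing slots, from 19:00 UTC on Friday through 13:00 UTC on Sunday.
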
### Local maintenance (broken, not activity-core)

`custodian-sync.service` fired every ~15 minutes when the machine was awake but
**failed continuously** with:

```
.venv/bin/python: No such file or directory
```

Root cause: unit `WorkingDirectory` still points at the pre-extraction path
`/home/worsch/the-custodian/state-hub`; the standalone repo lives at
`/home/worsch/state-hub`.

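A preflight check along the following lines would surface this kind of stale path before the timer fires. This is a sketch under an assumption inferred from the error message, namely that the unit invokes `.venv/bin/python` relative to its `WorkingDirectory`; `venv_python` and `check_working_dir` are hypothetical names, not part of the existing tooling.

```python
from pathlib import Path


def venv_python(working_dir):
    """Interpreter path the unit is assumed to invoke, relative to working_dir."""
    return Path(working_dir) / ".venv" / "bin" / "python"


def check_working_dir(working_dir):
    """Return a short status string rather than raising, for use in a preflight log."""
    interpreter = venv_python(working_dir)
    if interpreter.is_file():
        return f"ok: {interpreter}"
    return f"missing: {interpreter}"


# Compare the pre-extraction path with the standalone repo path.
for candidate in ("/home/worsch/the-custodian/state-hub", "/home/worsch/state-hub"):
    print(check_working_dir(candidate))
```
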
Other local cron jobs also failed when they fired:

- **02:00** `railiance backup`: `/home/worsch/railiance-bootstrap/bin/railiance: not found`
- **03:00** `bridge maintenance cleanup`: no log evidence

## Gap summary

| Automation | Expected over weekend | Observed |
|------------|----------------------|----------|
| Hourly RecentlyOnScope | ~44 hourly runs | **1** (Sun 16:00 only) after **~44 h gap** |
| Daily WSJF triage | 2 runs (Sat + Sun 07:20) | **0** |
| State Hub consistency sweep | ~180 runs (15 min) | **All failed** when machine awake (bad path) |
| Saturday hub activity | n/a | **None recorded** |

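The expected counts in the table can be reproduced with a short sketch. It assumes the window runs from the last successful hourly run (Fri 20:00 CEST, exclusive) to the resumption (Sun 16:00 CEST, inclusive), a fixed UTC+2 offset for CEST, and a local timer that fires on quarter-hour marks. The timer's real alignment is not recorded in this assessment, and `expected_runs` is an illustrative helper.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset is sufficient here: both ends of the window fall in June.
CEST = timezone(timedelta(hours=2), "CEST")


def expected_runs(start, end, matches):
    """Count whole-minute slots t with start < t <= end for which matches(t) is true."""
    t = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    count = 0
    while t <= end:
        if matches(t):
            count += 1
        t += timedelta(minutes=1)
    return count


last_ok = datetime(2026, 6, 19, 20, 0, tzinfo=CEST)  # last successful hourly run
resumed = datetime(2026, 6, 21, 16, 0, tzinfo=CEST)  # hourly run resumed

hourly = expected_runs(last_ok, resumed, lambda t: t.minute == 0)               # 0 * * * *
daily = expected_runs(last_ok, resumed, lambda t: (t.hour, t.minute) == (7, 20))  # 20 7 * * *
sweep = expected_runs(last_ok, resumed, lambda t: t.minute % 15 == 0)           # assumed quarter-hour timer

print(hourly, daily, sweep)
```

Under these assumptions the sketch yields 44 hourly slots, 2 daily triage slots, and 176 sweep slots, in line with the rounded figures in the table.
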
## Naming note

The local systemd unit is called **custodian-sync**, but the job reconciles
**workplan files ↔ State Hub DB for all registered repos**. The cron-migration
design stub already uses **`state-hub-consistency-sweep`** as the
ActivityDefinition id. Prefer **State Hub consistency sync** for operator-facing
names; retain `custodian-sync-hook` in git hooks only until a deliberate rename
pass (hook marker is widespread).

## Follow-up workplans

- `STATE-WP-0063`: repair broken weekend automation (local paths, cluster
  reachability, missed triage).
- `STATE-WP-0064`: move State Hub consistency sync to Railiance01 via
  activity-core; retire local `custodian-sync.timer` after cutover.