state-hub/history/20260621-weekend-automation-assessment.md
tegwick 3d5e354ff8 docs(state-hub): weekend automation assessment and repair workplans
Persist the Fri-evening→Sun-afternoon automation gap assessment in
history/, and add STATE-WP-0063 (repair broken paths and cluster
reachability) plus STATE-WP-0064 (move State Hub consistency sync to
Railiance01 via activity-core). Workplans registered in State Hub via
fix-consistency.
2026-06-21 17:32:44 +02:00


id: STATE-HIST-20260621-WEEKEND-AUTOMATION
type: assessment
title: Weekend automation gap assessment (Fri evening → Sun afternoon)
domain: custodian
repo: state-hub
created: 2026-06-21
assessor: grok
session_boundary: 2026-06-19T19:22Z (last interactive milestone)

Weekend automation gap assessment

Assessment window: Friday 2026-06-19 ~21:22 CEST (session close) through Sunday 2026-06-21 ~16:00 CEST (resumption). Sources: State Hub /progress/ API, activity-definition files, local journalctl for custodian-sync.service, and crontab.

Scheduled automation landscape

activity-core runs on Railiance01 (K3s + Temporal), not on the WSL workstation. Custodian-owned ActivityDefinitions in the-custodian/activity-definitions/:

| Activity | Schedule | Enabled | Effect |
| --- | --- | --- | --- |
| Hourly RecentlyOnScope | 0 * * * * Europe/Berlin | yes | POST /recently-on-scope/hourly |
| Daily State Hub WSJF Triage | 20 7 * * * Europe/Berlin | yes | LLM triage → daily_triage progress event |
| Ops Service Inventory Probes | 15 * * * * Europe/Berlin | no | HTTP service probes |

activity-core repo definitions: weekly SBOM staleness (Mon 09:00, enabled); weekly coding retro (Sat 19:00, disabled).

Not yet on activity-core: the 15-minute workplan↔DB consistency sweep still uses the local custodian-sync.timer systemd user unit.

What ran automatically

Friday evening (before/at session close)

  • 20:00 CEST (18:00 UTC): Last successful hourly RecentlyOnScope run. Generated one helix_forge digest; 13 domains skipped.
  • After 20:00 CEST: No further hourly runs recorded; the 19:00 and 20:00 UTC slots (21:00 and 22:00 CEST) are absent from progress events.

Saturday 2026-06-20

  • Zero State Hub progress events for the entire day.
  • 07:20 CEST daily WSJF triage: did not run (no daily_triage event; no new file in the-custodian/memory/working/ since 2026-06-18).
  • Workstation journal shows no custodian-sync activity between Fri ~21:18 and Sun ~15:50 CEST — consistent with sleep/hibernate or WSL not running.

Sunday 2026-06-21 (before interactive session)

  • 07:20 CEST daily WSJF triage: did not run.
  • 16:00 CEST (14:00 UTC): Hourly RecentlyOnScope resumed. Result: 0 generated, 14 skipped, 0 failed (all domains quiet).
  • After 16:09 CEST: Interactive grok/custodian session work (not automation).
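
The run times listed above can be checked mechanically. The following is a minimal sketch, assuming progress-event timestamps have already been fetched as timezone-aware datetimes. The `find_gaps` helper and its tolerance are hypothetical and not part of State Hub, and the 17:00 UTC entry is an illustrative earlier run rather than a value from this assessment.

```python
from datetime import datetime, timedelta, timezone


def find_gaps(timestamps, expected_interval, tolerance=timedelta(minutes=5)):
    """Return (previous, next) pairs whose spacing exceeds the expected interval."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev > expected_interval + tolerance:
            gaps.append((prev, nxt))
    return gaps


# Hourly RecentlyOnScope run times in UTC.
events = [
    datetime(2026, 6, 19, 17, 0, tzinfo=timezone.utc),  # hypothetical earlier run
    datetime(2026, 6, 19, 18, 0, tzinfo=timezone.utc),  # last successful run (20:00 CEST)
    datetime(2026, 6, 21, 14, 0, tzinfo=timezone.utc),  # resumption (16:00 CEST)
]

gaps = find_gaps(events, timedelta(hours=1))
for prev, nxt in gaps:
    print(prev.isoformat(), "->", nxt.isoformat(), "gap:", nxt - prev)
```

With these inputs the only reported gap runs from the last Friday run to the Sunday resumption, a span of 44 hours.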

Local maintenance (broken, not activity-core)

custodian-sync.service fired every ~15 minutes when the machine was awake but failed continuously with:

.venv/bin/python: No such file or directory

Root cause: unit WorkingDirectory still points at the pre-extraction path /home/worsch/the-custodian/state-hub; the standalone repo lives at /home/worsch/state-hub.
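
A pre-flight check along the following lines would surface this class of failure before the timer fires. It is a sketch only: `missing_interpreter` is a hypothetical helper, and the relative `.venv/bin/python` location is an assumption taken from the error message above, not from the unit file itself.

```python
from pathlib import Path


def missing_interpreter(working_dir, relative_python=".venv/bin/python"):
    """Return the expected interpreter path if it is absent, else None."""
    candidate = Path(working_dir) / relative_python
    return None if candidate.is_file() else candidate


# The two locations named in the root-cause note.
for working_dir in (
    "/home/worsch/the-custodian/state-hub",  # pre-extraction path still in the unit
    "/home/worsch/state-hub",                # standalone repo location
):
    missing = missing_interpreter(working_dir)
    status = "ok" if missing is None else f"missing {missing}"
    print(f"{working_dir}: {status}")
```

The output depends on the machine it runs on, so no expected result is shown here.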

Other local crons also failed when they fired:

  • 02:00 railiance backup (/home/worsch/railiance-bootstrap/bin/railiance: not found)
  • 03:00 bridge maintenance cleanup — no log evidence

Gap summary

| Automation | Expected over weekend | Observed |
| --- | --- | --- |
| Hourly RecentlyOnScope | ~44 hourly runs | 1 (Sun 16:00 only) after ~44 h gap |
| Daily WSJF triage | 2 runs (Sat + Sun 07:20) | 0 |
| State Hub consistency sweep | ~180 runs (15 min) | All failed when machine awake (bad path) |
| Saturday hub activity | n/a | None recorded |
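
The expected counts in the summary can be reproduced with simple slot arithmetic. The sketch below assumes the window runs from the last successful hourly run (Fri 20:00 CEST) to the resumption (Sun 16:00 CEST) and uses a fixed UTC+2 offset, which holds because no DST change falls inside the window. `count_slots` is a hypothetical helper, not an activity-core function.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for CEST; valid here because the window contains no DST change.
CEST = timezone(timedelta(hours=2), "CEST")


def count_slots(first_slot, step, start, end):
    """Count schedule slots t = first_slot + k * step with start < t <= end."""
    count = 0
    t = first_slot
    while t <= end:
        if t > start:
            count += 1
        t += step
    return count


start = datetime(2026, 6, 19, 20, 0, tzinfo=CEST)  # last successful hourly run
end = datetime(2026, 6, 21, 16, 0, tzinfo=CEST)    # hourly runs resumed

hourly = count_slots(start, timedelta(hours=1), start, end)
daily = count_slots(
    datetime(2026, 6, 19, 7, 20, tzinfo=CEST), timedelta(days=1), start, end
)
sweep = count_slots(start, timedelta(minutes=15), start, end)

print(hourly, daily, sweep)  # 44 2 176
```

Under these assumptions the hourly and daily figures match the table, and the 15-minute sweep comes to 176 slots, close to the ~180 quoted above.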

Naming note

The local systemd unit is called custodian-sync, but the job reconciles workplan files ↔ State Hub DB for all registered repos. The cron-migration design stub already uses state-hub-consistency-sweep as the ActivityDefinition id. Prefer State Hub consistency sync for operator-facing names; retain custodian-sync-hook in git hooks only until a deliberate rename pass (hook marker is widespread).

Follow-up workplans

  • STATE-WP-0063 — repair broken weekend automation (local paths, cluster reachability, missed triage).
  • STATE-WP-0064 — move State Hub consistency sync to Railiance01 via activity-core; retire local custodian-sync.timer after cutover.