generated from coulomb/repo-seed
docs(state-hub): STATE-WP-0063 T03 done — tunnel cleanup restored activity-core
Document stale remote sshd forward on Railiance01 :18000 as root cause of reconnect loop; T03 verified after bridge maintenance cleanup and manual canaries for hourly RecentlyOnScope and daily WSJF triage.
This commit is contained in:
@@ -115,12 +115,15 @@ Result 2026-06-21:
|
||||
**workstation availability + ops-bridge tunnel reachability** to the local
|
||||
State Hub API. Cluster-side `actcore-state-hub-bridge` proxies to node-local
|
||||
`127.0.0.1:18000` as designed.
|
||||
- **Retry fix (2026-06-21):** stale orphan `sshd` remote forward on Railiance01
|
||||
port 18000 blocked new tunnel binds (`ExitOnForwardFailure` → exit 255).
|
||||
`bridge maintenance cleanup state-hub-railiance01 --restart` cleared it.
|
||||
|
||||
## T3 — Restore daily WSJF triage evidence
|
||||
|
||||
```task
|
||||
id: STATE-WP-0063-T03
|
||||
status: progress
|
||||
status: done
|
||||
priority: medium
|
||||
state_hub_task_id: "4b68b207-a80e-4c71-a246-10035ef69625"
|
||||
```
|
||||
@@ -134,12 +137,17 @@ Railiance01 and confirm:
|
||||
If the schedule was paused, unpause and verify the next 07:20 Europe/Berlin
|
||||
fire is armed.
|
||||
|
||||
Progress 2026-06-21: Manual trigger via actcore-api returned workflow
|
||||
`activity-6fca51fa…:manual-ca469cb5…`. Schedule is armed (next 07:20 Europe/Berlin).
|
||||
Report sink `state-hub-progress` is still failing with `[Errno 111] Connection
|
||||
refused` while `state-hub-railiance01` tunnel is reconnecting. Re-verify after
|
||||
tunnel stabilises (`bridge check state-hub-railiance01` + new `daily_triage`
|
||||
progress event).
|
||||
Progress 2026-06-21 (first attempt): sink failed while tunnel reconnecting.
|
||||
|
||||
Result 2026-06-21 (retry): `bridge maintenance cleanup state-hub-railiance01
|
||||
--restart` cleared stale remote `sshd` on port 18000 (orphan forward with many
|
||||
CLOSE_WAIT). Tunnel now **connected**; node `curl 127.0.0.1:18000/state/health`
|
||||
and worker `actcore-state-hub-bridge:8000/state/health` both return ok. Manual
|
||||
canaries succeeded:
|
||||
|
||||
- hourly RecentlyOnScope → `recently_on_scope_hourly` at 2026-06-21T17:45:29Z
|
||||
- daily WSJF triage → `daily_triage` at 2026-06-21T17:45:46Z + working-memory
|
||||
report under `the-custodian/memory/working/`
|
||||
|
||||
## T4 — Repair ancillary local crons
|
||||
|
||||
|
||||
Reference in New Issue
Block a user