Set Temporal catchup_window on cron schedules so a fire missed during a
worker/Temporal outage is no longer silently dropped. Redefine misfire_policy
into three explicit modes — skip, catchup_all, catchup_latest — mapping to
(catchup_window, overlap) pairs; legacy catchup/compress aliased. Add
catchup_window_seconds override. Remove the ad-hoc upsert-time 1h backfill in
favour of native catchup. Apply catchup_latest to daily-statehub-wsjf-triage in
the Railiance runtime manifest and document run-miss policies in the runbook.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The actcore-state-hub-bridge readiness probe hit /state/summary through
the tunnel proxy chain. Cold-cache summary requests and intermittent
tunnel stalls routinely exceeded the 5s probe timeout (1584 failures
over 17h), leaving the pod 0/1 Ready and breaking hourly/triage sinks.
Use /state/health instead — same signal the ops inventory already
expects, and completes in ~30ms through the bridge.