
Ops Documentation

Operational runbooks and incident reports for the Railiance/Custodian infrastructure.

Structure

ops/
  runbooks/   — how-to guides for recurring operational tasks and known issues
  incidents/  — post-incident reports (append-only, one file per incident)
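The "one file per incident" rule above could be checked mechanically. The following is a minimal sketch, not part of the repository: the file naming pattern (`INC-001.md`, optionally `INC-001-some-slug.md`) and the function name `check_incidents` are assumptions made for illustration. It does not cover the append-only rule, which would need version-control history to verify.

```python
import re
from pathlib import Path

# Assumed (hypothetical) naming convention: INC-001.md or INC-001-short-slug.md
INCIDENT_NAME = re.compile(r"^(INC-\d{3})(?:-[a-z0-9-]+)?\.md$")


def check_incidents(incidents_dir):
    """Return a list of human-readable problems found in the incidents directory."""
    problems = []
    seen = {}  # incident ID -> first file name seen for that ID
    for path in sorted(Path(incidents_dir).iterdir()):
        if not path.is_file():
            continue
        match = INCIDENT_NAME.match(path.name)
        if match is None:
            problems.append(f"unexpected file name: {path.name}")
            continue
        incident_id = match.group(1)
        if incident_id in seen:
            # More than one file carries the same incident ID
            problems.append(
                f"{incident_id} has more than one file: "
                f"{seen[incident_id]}, {path.name}"
            )
        else:
            seen[incident_id] = path.name
    return problems
```

Calling `check_incidents("ops/incidents")` would return an empty list when every file matches the assumed pattern and each incident ID appears only once.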

Runbooks

| Runbook | Covers |
|---|---|
| gitea-coulombcore.md | Gitea on COULOMBCORE k3s: access, known issues, recovery checklist |

Incidents

| ID | Date | Summary | Status |
|---|---|---|---|
| INC-001 | 2026-03-25 | Gitea down 13d: PGPool containerd StartError + CPU exhaustion | Resolved |