Documents the dashboard's architecture, the rationale for the framework choice, data-fetching strategies (static loaders + live polling), the component library, the page inventory, and key features including the Workstream Health Index and entity modals. Also registers the new page in the Reference nav and adds a runbook section for node overload / runaway agent process (INC-002), with a hardening checklist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ops Documentation
Operational runbooks and incident reports for the Railiance/Custodian infrastructure.
Structure
ops/
runbooks/ — how-to guides for recurring operational tasks and known issues
incidents/ — post-incident reports (append-only, one file per incident)
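The append-only, one-file-per-incident rule for incidents/ can be enforced by a small helper when a new report is opened. The following is a minimal sketch, not existing tooling: the `INC-NNN-<slug>.md` file naming and the function names `next_incident_id` and `create_incident` are assumptions made for illustration, since the actual naming scheme is not specified here.

```python
import re
from pathlib import Path

# Assumed naming scheme (hypothetical): INC-001-some-slug.md
INCIDENT_ID = re.compile(r"^INC-(\d{3,})")


def next_incident_id(incidents_dir: Path) -> str:
    """Return the next unused incident ID based on existing report files."""
    numbers = [
        int(match.group(1))
        for path in incidents_dir.glob("*.md")
        if (match := INCIDENT_ID.match(path.name))
    ]
    return f"INC-{max(numbers, default=0) + 1:03d}"


def create_incident(incidents_dir: Path, slug: str, body: str) -> Path:
    """Create a new incident report without ever overwriting an existing one."""
    incidents_dir.mkdir(parents=True, exist_ok=True)
    path = incidents_dir / f"{next_incident_id(incidents_dir)}-{slug}.md"
    # Mode "x" raises FileExistsError if the file already exists,
    # which keeps the directory append-only even under a naming collision.
    with path.open("x", encoding="utf-8") as handle:
        handle.write(body)
    return path
```

Calling `create_incident(Path("ops/incidents"), "node-overload", "...")` would then write a new file with the next free ID and leave existing reports untouched.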
Runbooks
| Runbook | Covers |
|---|---|
| gitea-coulombcore.md | Gitea on COULOMBCORE k3s — access, known issues, recovery checklist |
Incidents
| ID | Date | Summary | Status |
|---|---|---|---|
| INC-001 | 2026-03-25 | Gitea down 13d — PGPool containerd StartError + CPU exhaustion | Resolved |