codex cf4be716e1 CUST-WP-0054 T01-T03: fleet architecture, de-hub runbook, drain plan
Documents the three-machine role model, fleet mesh topology, coulombcore
freeze policy, and ordered drain sequence. Adds railiance01 systemd tunnel
install assets and refreshes ops service inventory to reflect 2026-07-03
production placement (cluster State Hub, fleet mesh, draining coulombcore).
2026-07-04 00:29:55 +02:00

Ops Documentation

Operational runbooks and incident reports for the Railiance/Custodian infrastructure.

Structure

ops/
  service-inventory.yml  — non-secret service/location/evidence seed for ops-hub
  runbooks/   — how-to guides for recurring operational tasks and known issues
  incidents/  — post-incident reports (append-only, one file per incident)

Inventory

| Artifact | Covers |
|---|---|
| service-inventory.yml | Initial ops-hub service inventory: environments, hosts, clusters, services, endpoints, access paths, evidence, and gaps |
| ../docs/ops-hub-service-catalog.md | Rendered service catalog now view generated from the inventory |
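
As a rough pre-render sanity check, the sketch below reports which of the sections listed above are absent from the inventory file. The key names in `EXPECTED_SECTIONS` (in particular `access_paths`) are assumptions derived from the description, not confirmed against the real file, and the scan only recognises unindented block-style keys rather than parsing YAML fully.

```python
import re

# Hypothetical top-level keys, guessed from the section names in the table
# above; the real service-inventory.yml may spell them differently.
EXPECTED_SECTIONS = (
    "environments", "hosts", "clusters", "services",
    "endpoints", "access_paths", "evidence", "gaps",
)

# An unindented "key:" at the start of a line is treated as a top-level key.
_TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):")


def top_level_keys(yaml_text):
    """Return the unindented mapping keys of a block-style YAML document."""
    keys = []
    for line in yaml_text.splitlines():
        match = _TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def missing_sections(yaml_text, expected=EXPECTED_SECTIONS):
    """Return the expected section names that are not top-level keys."""
    present = set(top_level_keys(yaml_text))
    return [name for name in expected if name not in present]
```

To use it, read the file's text and pass it to `missing_sections`; an empty list means every assumed section is present.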

Render the first catalog view with:

make ops-inventory-view

Runbooks

| Runbook | Covers |
|---|---|
| gitea-coulombcore.md | Gitea on COULOMBCORE k3s: access, known issues, recovery checklist |

Incidents

| ID | Date | Summary | Status |
|---|---|---|---|
| INC-001 | 2026-03-25 | Gitea down 13d: PGPool containerd StartError + CPU exhaustion | Resolved |