railiance-cluster/docs/operator-runbook.md
tegwick b3b0c3e3ff
Repo hygiene + new workplans (RAIL-BS-WP-0008/0009)
- Add RAIL-BS-WP-0008 (activity-core WP-0016 deploy) and RAIL-BS-WP-0009
  (admin-sync smoke) from inbox asks 87952ff1 / aa8b7986
- Archive finished workplans to workplans/archived/ per ADR-001 convention;
  normalize frontmatter statuses (completed/done -> finished)
- Fill stack-and-commands.md, complete repo-boundary.md, refresh SCOPE
  Current State, add docs/operator-runbook.md for production-touching targets

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 00:02:36 +02:00

Operator runbook — production-touching commands

All targets below change state on the production k3s cluster (railiance01 / COULOMBCORE, 92.205.130.254) or its backups. The permission classifier denies them to agent sessions running in auto mode. That is intentional: these targets must run under the operator's authority.

How to run a production-touching target

  • Interactively in a Claude Code session: type ! <command> so that the command runs under the operator's authority and its output lands in the conversation for the agent to act on.
  • Directly: run the target from this repo's root on the workstation. Cluster access is via ssh railiance01 (key-based, configured in ~/.ssh/config).

Production-touching targets

| Target | Effect |
| --- | --- |
| sudo make backup | writes age-encrypted backup to /opt/backup/railiance/cluster/ |
| make k3s-install | (re)installs k3s baseline; destructive, preflight first |
| make test-ha-failover | kills the primary PG pod to assert recovery |
| make verify-activity-core | reconciles activity-core runtime on railiance01 |
| make reconcile-activity-core-llm-connect | patches ConfigMap, applies llm-connect overlay, runs smoke pod |
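
The "preflight first" note on make k3s-install can be enforced with a small wrapper. The sketch below is an assumption, not part of the repo: run_after_preflight and the MAKE_BIN override are hypothetical names, and the wrapper only illustrates the ordering (preflight must succeed before the target runs).

```shell
# Hypothetical helper: run `make preflight` and only then the requested target.
# MAKE_BIN is an assumed override so the ordering can be dry-run with another
# command; it defaults to make.
MAKE_BIN="${MAKE_BIN:-make}"

run_after_preflight() {
  target="$1"
  # Abort before touching production if the preflight check fails.
  if ! "$MAKE_BIN" preflight; then
    echo "preflight failed; not running $target" >&2
    return 1
  fi
  "$MAKE_BIN" "$target"
}
```

An operator would call it as run_after_preflight k3s-install from the repo root. It does not add sudo, so sudo make backup would still be run by hand.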

Read-only / safe targets

make help, make preflight, make smoke, and make restore (which only prints a guide) do not change state. They are safe to allowlist for agent sessions.
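
The split between safe and production-touching targets can be written down as a lookup. This is a minimal sketch using only the target names listed in this runbook. classify_target and is_allowlisted are hypothetical helpers, and the actual permission classifier may work differently.

```shell
# Hypothetical helper: label a make target using the two lists in this runbook.
classify_target() {
  case "$1" in
    help|preflight|smoke|restore)
      echo "safe" ;;
    backup|k3s-install|test-ha-failover|verify-activity-core|reconcile-activity-core-llm-connect)
      echo "production" ;;
    *)
      # Targets not named in the runbook get their own label,
      # so a caller can deny them by default.
      echo "unknown" ;;
  esac
}

# Succeeds only for targets the runbook lists as read-only / safe.
is_allowlisted() {
  [ "$(classify_target "$1")" = "safe" ]
}
```

For example, is_allowlisted smoke succeeds, while is_allowlisted backup and any unlisted target fail.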

Evidence convention

Reconcile/verify targets post non-secret evidence notes to the State Hub. The STATE_HUB_EVIDENCE_WORKSTREAM_ID and STATE_HUB_EVIDENCE_TASK_ID env vars attach a note to a workstream or task. Never record Secret values: record key counts and readiness states only.
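
One way to keep a note free of Secret values is to reduce the data to a count before any text is built. The sketch below assumes KEY=VALUE input on stdin and a one-line note format. count_keys, evidence_note, and the field names are hypothetical. Posting to the State Hub is left out because its API is not described here.

```shell
# Hypothetical helper: read KEY=VALUE lines on stdin and print only the number
# of keys. Values are never echoed.
count_keys() {
  # grep -c prints the count of matching lines; `|| true` keeps a zero count
  # from being reported as a failure.
  grep -c '^[A-Za-z_][A-Za-z0-9_]*=' || true
}

# Hypothetical helper: build a one-line note from non-secret fields only.
# $1 = component name, $2 = key count, $3 = readiness state
evidence_note() {
  note="component=$1 secret_keys=$2 ready=$3"
  # Attach workstream/task ids only when the env vars are set.
  if [ -n "${STATE_HUB_EVIDENCE_WORKSTREAM_ID:-}" ]; then
    note="$note workstream=$STATE_HUB_EVIDENCE_WORKSTREAM_ID"
  fi
  if [ -n "${STATE_HUB_EVIDENCE_TASK_ID:-}" ]; then
    note="$note task=$STATE_HUB_EVIDENCE_TASK_ID"
  fi
  printf '%s\n' "$note"
}
```

The count from count_keys is passed to evidence_note as its second argument, so the values themselves never enter the note.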