diff --git a/Makefile b/Makefile
index ed9ea83..8fbfa3b 100644
--- a/Makefile
+++ b/Makefile
@@ -2,6 +2,14 @@ INVENTORY ?= ansible/hosts.ini
 
+##@ Safety Net
+
+backup: ## Backup postgres + config to Nextcloud (age-encrypted)
+	bin/railiance backup
+
+preflight: ## Pre-migration safety gate — must pass before cluster work
+	bin/railiance preflight
+
 ##@ Kubernetes
 
 k3s-install: ## Install k3s and Helm on all inventory hosts
diff --git a/docs/backup-restore.md b/docs/backup-restore.md
index 419521b..2c2c24d 100644
--- a/docs/backup-restore.md
+++ b/docs/backup-restore.md
@@ -188,10 +188,23 @@ This restores `~/.claude/`, `~/.claude.json`, and `~/.gitconfig`.
 
 ### Step 5 — Clone repositories
 
+OAS Stack repos (S1–S5, per ADR-003):
+
 ```bash
-git clone /coulomb/railiance-bootstrap.git ~/railiance-bootstrap
-git clone /tegwick/the-custodian.git ~/the-custodian
-git clone /coulomb/markitect_project.git ~/markitect_project
+git clone /coulomb/railiance-infra.git ~/railiance-infra
+git clone /coulomb/railiance-cluster.git ~/railiance-cluster
+git clone /coulomb/railiance-platform.git ~/railiance-platform
+git clone /coulomb/railiance-enablement.git ~/railiance-enablement
+git clone /coulomb/railiance-apps.git ~/railiance-apps
+```
+
+Core and project repos:
+
+```bash
+git clone /tegwick/the-custodian.git ~/the-custodian
+git clone /coulomb/markitect_project.git ~/markitect_project
+git clone /coulomb/activity-core.git ~/activity-core
+git clone /coulomb/net-kingdom.git ~/net-kingdom
 # ... remaining repos as needed
 ```
diff --git a/tools/cmd/railiance-preflight b/tools/cmd/railiance-preflight
index 582de00..323e6d1 100755
--- a/tools/cmd/railiance-preflight
+++ b/tools/cmd/railiance-preflight
@@ -10,12 +10,21 @@ source "${ROOT}/lib/railiance-print.sh"
 BACKUP_DIR="${XDG_CACHE_HOME:-$HOME/.cache}/railiance/backups"
 MAX_AGE_HOURS=24
 REPOS=(
+  # OAS Stack (S1–S5) — railiance-infra/docs/adr/ADR-003
+  /home/worsch/railiance-infra
+  /home/worsch/railiance-cluster
+  /home/worsch/railiance-platform
+  /home/worsch/railiance-enablement
+  /home/worsch/railiance-apps
+  # Core infrastructure
+  /home/worsch/the-custodian
+  # Project repos
+  /home/worsch/markitect_project
+  /home/worsch/activity-core
+  /home/worsch/net-kingdom
   /home/worsch/issue-facade
   /home/worsch/binect-js
   /home/worsch/kaizen-agentic
-  /home/worsch/railiance-bootstrap
-  /home/worsch/the-custodian
-  /home/worsch/markitect_project
 )
 
 # ── Helpers ───────────────────────────────────────────────────────────────────
diff --git a/workplans/RAIL-BS-WP-0004-safety-net.md b/workplans/RAIL-BS-WP-0004-safety-net.md
new file mode 100644
index 0000000..b793bc0
--- /dev/null
+++ b/workplans/RAIL-BS-WP-0004-safety-net.md
@@ -0,0 +1,166 @@
+---
+id: RAIL-BS-WP-0004
+type: workplan
+title: "Current-Environment Safety Net"
+domain: railiance
+repo: railiance-cluster
+status: active
+owner: tegwick
+topic_slug: railiance
+state_hub_workstream_id: "7e8b0c20-51eb-40c9-9e3b-85dd380d7625"
+created: "2026-02-25"
+updated: "2026-03-10"
+---
+
+# Current-Environment Safety Net
+
+## Goal
+
+Ensure backup and disaster recovery for the current single-server environment
+is operational and tested before any ThreePhoenix infrastructure migration
+work begins. Aligned to OAS Stack S2 (railiance-cluster owns backup tooling).
+
+## Context
+
+The backup toolchain lives in `tools/cmd/railiance-backup` and
+`tools/cmd/railiance-preflight`, dispatched via `bin/railiance`. It protects:
+
+| Asset | Method | Risk without backup |
+|---|---|---|
+| Custodian State Hub DB | pg_dump → age → Nextcloud | Total loss of workstreams, decisions, history |
+| Claude config + memory | tar → age → Nextcloud | Loss of MCP registration, project memory |
+| Git repos | Gitea remotes | SPOF: Gitea runs on the same server being migrated |
+
+Decision D2: Nextcloud upload-only file drop as backup destination.
+
+## OAS Alignment
+
+Per ADR-003, backup tooling lives in **S2 (railiance-cluster)**. The preflight
+check covers all five OAS stack repos:
+
+| Repo | OAS Layer |
+|---|---|
+| railiance-infra | S1 — OS & Provisioning |
+| railiance-cluster | S2 — Kubernetes Runtime |
+| railiance-platform | S3 — Platform Services |
+| railiance-enablement | S4 — Developer Tooling |
+| railiance-apps | S5 — Workloads & Endpoints |
+
+Plus cross-domain repos: the-custodian, markitect_project, activity-core,
+net-kingdom, issue-facade, binect-js, kaizen-agentic.
+
+## Boundary
+
+Backup execution: this repo (`bin/railiance backup`).
+Backup destination: Nextcloud file drop (URL in `~/.config/railiance/nc-upload-url` or hardcoded).
+Restore procedure: `docs/backup-restore.md`.
+
+---
+
+## Tasks
+
+### T01 — Update preflight repo list to OAS 5-repo layout
+
+```task
+id: T01
+status: done
+priority: high
+```
+
+Update `tools/cmd/railiance-preflight` REPOS array: remove `railiance-bootstrap`,
+add `railiance-infra`, `railiance-cluster`, `railiance-platform`,
+`railiance-enablement`, `railiance-apps`. Add all active project repos.
+
+**Done when:** `bin/railiance preflight` checks all current repos.
+
+---
+
+### T02 — Fix stale repo references in backup-restore.md
+
+```task
+id: T02
+status: done
+priority: medium
+```
+
+Update restore procedure: `railiance-bootstrap` → `railiance-cluster`,
+`railiance-hosts` → `railiance-infra`, add the three new OAS repos.
+
+**Done when:** doc accurately reflects the current 5-repo OAS stack.
+
+---
+
+### T03 — Add make backup and make preflight targets
+
+```task
+id: T03
+status: done
+priority: medium
+```
+
+Add to root Makefile so the safety net is discoverable from `make help`.
+
+**Done when:** `make backup` and `make preflight` both work.
+
+---
+
+### T04 — Run current backup and verify upload
+
+```task
+id: T04
+status: done
+priority: high
+```
+
+Run `bin/railiance backup` and confirm both DB and config files appear
+in the Nextcloud file drop.
+
+**Done when:** backup completes without error and `.last-backup` stamp is fresh.
+
+---
+
+### T05 — Verify or install cron job
+
+```task
+id: T05
+status: todo
+priority: medium
+```
+
+Confirm that the daily 02:00 cron job is installed and has run at least once:
+
+```bash
+crontab -l | grep railiance
+cat ~/.cache/railiance/backup.log | tail -20
+```
+
+If missing, install:
+```bash
+(crontab -l 2>/dev/null; echo "0 2 * * * /home/worsch/railiance-cluster/bin/railiance backup >> ~/.cache/railiance/backup.log 2>&1") | crontab -
+```
+
+**Done when:** cron is listed and log shows a successful run.
+
+---
+
+### T06 — Run restore drill
+
+```task
+id: T06
+status: todo
+priority: medium
+```
+
+Run the minimal restore drill from `docs/backup-restore.md` against the
+current backup. Record completion in `~/.cache/railiance/restore-drill.log`.
+
+**Done when:** drill exits 0 and log entry is written.
+
+---
+
+## References
+
+- Decision D2: Nextcloud as backup destination (`DECISIONS.md`)
+- Backup tooling: `tools/cmd/railiance-backup`, `tools/cmd/railiance-preflight`
+- Restore procedure: `docs/backup-restore.md`
+- Extension points: EP-RAIL-003 (git bare mirrors), EP-RAIL-004 (secondary offsite copy)
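
Review note: `railiance-preflight` sets `MAX_AGE_HOURS=24` and T04 requires a fresh `.last-backup` stamp, but the freshness check itself is outside the hunks in this diff. The following is a minimal sketch of how such a check could look, assuming the stamp is a plain file whose modification time marks the last successful backup. The function name `stamp_is_fresh` and the stamp's location are hypothetical and not taken from the actual script.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of railiance-preflight as shown in this diff.
# Succeeds (exit 0) if the stamp file exists and was modified within the
# last N hours; fails (exit 1) otherwise.
stamp_is_fresh() {
  local stamp="$1"
  local max_age_hours="${2:-24}"

  # A missing stamp means no backup has been recorded at all.
  [[ -f "$stamp" ]] || return 1

  # find prints the path only when its mtime is newer than the cutoff,
  # so an empty result means the stamp is too old.
  [[ -n "$(find "$stamp" -mmin "-$((max_age_hours * 60))" 2>/dev/null)" ]]
}
```

A caller might gate on it with `stamp_is_fresh "${BACKUP_DIR}/.last-backup" "${MAX_AGE_HOURS}" || exit 1`, assuming the stamp lives under `BACKUP_DIR`. The sketch uses `find -mmin` because both GNU and BSD `find` support it, whereas the `stat` flags for reading a modification time differ between the two.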