feat(safety-net): create WP-0004, update preflight for OAS 5-repo layout

- workplans/RAIL-BS-WP-0004-safety-net.md: ADR-001 workplan file for
  current-env-safety-net workstream (7e8b0c20), T01-T04 done, T05-T06 todo
- tools/cmd/railiance-preflight: update REPOS to OAS S1-S5 stack
  (railiance-infra/cluster/platform/enablement/apps) + project repos;
  remove stale railiance-bootstrap reference
- docs/backup-restore.md: fix Step 5 clone commands to current repo names
- Makefile: add make backup and make preflight targets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 15:21:29 +01:00
parent 441a37c5ae
commit 75467673a8
4 changed files with 202 additions and 6 deletions


@@ -2,6 +2,14 @@
INVENTORY ?= ansible/hosts.ini
##@ Safety Net
backup: ## Backup postgres + config to Nextcloud (age-encrypted)
	bin/railiance backup
preflight: ## Pre-migration safety gate — must pass before cluster work
	bin/railiance preflight
##@ Kubernetes
k3s-install: ## Install k3s and Helm on all inventory hosts


@@ -188,10 +188,23 @@ This restores `~/.claude/`, `~/.claude.json`, and `~/.gitconfig`.
### Step 5 — Clone repositories
OAS Stack repos (S1-S5, per ADR-003):
```bash
git clone <gitea-url>/coulomb/railiance-infra.git ~/railiance-infra
git clone <gitea-url>/coulomb/railiance-cluster.git ~/railiance-cluster
git clone <gitea-url>/coulomb/railiance-platform.git ~/railiance-platform
git clone <gitea-url>/coulomb/railiance-enablement.git ~/railiance-enablement
git clone <gitea-url>/coulomb/railiance-apps.git ~/railiance-apps
```
Core and project repos:
```bash
-git clone <gitea-url>/coulomb/railiance-bootstrap.git ~/railiance-bootstrap
git clone <gitea-url>/tegwick/the-custodian.git ~/the-custodian
git clone <gitea-url>/coulomb/markitect_project.git ~/markitect_project
git clone <gitea-url>/coulomb/activity-core.git ~/activity-core
git clone <gitea-url>/coulomb/net-kingdom.git ~/net-kingdom
# ... remaining repos as needed
```


@@ -10,12 +10,21 @@ source "${ROOT}/lib/railiance-print.sh"
BACKUP_DIR="${XDG_CACHE_HOME:-$HOME/.cache}/railiance/backups"
MAX_AGE_HOURS=24
REPOS=(
# OAS Stack (S1-S5) — railiance-infra/docs/adr/ADR-003
/home/worsch/railiance-infra
/home/worsch/railiance-cluster
/home/worsch/railiance-platform
/home/worsch/railiance-enablement
/home/worsch/railiance-apps
# Core infrastructure
/home/worsch/the-custodian
# Project repos
/home/worsch/markitect_project
/home/worsch/activity-core
/home/worsch/net-kingdom
/home/worsch/issue-facade
/home/worsch/binect-js
/home/worsch/kaizen-agentic
-/home/worsch/railiance-bootstrap
-/home/worsch/the-custodian
-/home/worsch/markitect_project
)
# ── Helpers ───────────────────────────────────────────────────────────────────


@@ -0,0 +1,166 @@
---
id: RAIL-BS-WP-0004
type: workplan
title: "Current-Environment Safety Net"
domain: railiance
repo: railiance-cluster
status: active
owner: tegwick
topic_slug: railiance
state_hub_workstream_id: "7e8b0c20-51eb-40c9-9e3b-85dd380d7625"
created: "2026-02-25"
updated: "2026-03-10"
---
# Current-Environment Safety Net
## Goal
Ensure backup and disaster recovery for the current single-server environment
is operational and tested before any ThreePhoenix infrastructure migration
work begins. Aligned to OAS Stack S2 (railiance-cluster owns backup tooling).
## Context
The backup toolchain lives in `tools/cmd/railiance-backup` and
`tools/cmd/railiance-preflight`, dispatched via `bin/railiance`. It protects:
| Asset | Method | Risk without backup |
|---|---|---|
| Custodian State Hub DB | pg_dump → age → Nextcloud | Total loss of workstreams, decisions, history |
| Claude config + memory | tar → age → Nextcloud | Loss of MCP registration, project memory |
| Git repos | Gitea remotes | SPOF: Gitea runs on the same server being migrated |
Decision D2: Nextcloud upload-only file drop as backup destination.
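
The `pg_dump → age → Nextcloud` path in the table can be sketched roughly as below. This is an illustration, not the contents of `tools/cmd/railiance-backup`: the function names `encrypt_dump` and `upload_file`, the database name, the age recipient, and the assumption that the file drop URL accepts a plain HTTP PUT are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: function names, arguments, and the upload URL shape are assumptions.
set -o pipefail   # make a pg_dump failure fail the whole pipeline

# Dump database $1, encrypt the stream for age recipient $2, write it to $3.
encrypt_dump() {
  local db="$1" recipient="$2" out="$3"
  pg_dump "$db" | age -r "$recipient" -o "$out"
}

# Upload file $1 with an HTTP PUT below base URL $2 (assumed to accept PUT).
upload_file() {
  local file="$1" url="$2"
  curl --fail --silent --show-error -T "$file" "$url/$(basename "$file")"
}
```

Because the dump is piped straight into `age`, no unencrypted copy is written to disk; only the `.age` file would ever leave the host.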
## OAS Alignment
Per ADR-003, backup tooling lives in **S2 (railiance-cluster)**. The preflight
check covers all five OAS stack repos:
| Repo | OAS Layer |
|---|---|
| railiance-infra | S1 — OS & Provisioning |
| railiance-cluster | S2 — Kubernetes Runtime |
| railiance-platform | S3 — Platform Services |
| railiance-enablement | S4 — Developer Tooling |
| railiance-apps | S5 — Workloads & Endpoints |
Plus cross-domain repos: the-custodian, markitect_project, activity-core,
net-kingdom, issue-facade, binect-js, kaizen-agentic.
## Boundary
Backup execution: this repo (`bin/railiance backup`).
Backup destination: Nextcloud file drop (URL in `~/.config/railiance/nc-upload-url` or hardcoded).
Restore procedure: `docs/backup-restore.md`.
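
One way to read "URL in `~/.config/railiance/nc-upload-url` or hardcoded" is a lookup that prefers the config file and falls back to a built-in value. The sketch below assumes that precedence; `resolve_upload_url` and the fallback argument are hypothetical names, not taken from the backup script.

```bash
# Print the upload URL from config file $1, or fall back to $2.
# Returns non-zero when neither source yields a URL.
resolve_upload_url() {
  local file="$1" fallback="${2:-}" url=""
  if [ -r "$file" ]; then
    # Take the first non-blank line and strip surrounding whitespace.
    url="$(sed -n '/[^[:space:]]/{p;q;}' "$file" | tr -d '[:space:]')"
  fi
  url="${url:-$fallback}"
  [ -n "$url" ] || return 1
  printf '%s\n' "$url"
}
```

Keeping the URL in a file outside the repository avoids committing the share link, which is the main reason to prefer it over a hardcoded value.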
---
## Tasks
### T01 — Update preflight repo list to OAS 5-repo layout
```task
id: T01
status: done
priority: high
```
Update `tools/cmd/railiance-preflight` REPOS array: remove `railiance-bootstrap`,
add `railiance-infra`, `railiance-cluster`, `railiance-platform`,
`railiance-enablement`, `railiance-apps`. Add all active project repos.
**Done when:** `bin/railiance preflight` checks all current repos.
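
What "checks all current repos" means is not spelled out here. A minimal sketch, assuming the gate only needs each path to be a Git work tree with no uncommitted changes, could look like this; `repo_state` and `check_repos` are hypothetical helpers, and the real preflight may also verify that commits are pushed.

```bash
# Classify path $1 as "missing" (not a Git repo), "dirty", or "clean".
repo_state() {
  local dir="$1"
  if ! git -C "$dir" rev-parse --git-dir >/dev/null 2>&1; then
    echo "missing"
  elif [ -n "$(git -C "$dir" status --porcelain)" ]; then
    echo "dirty"
  else
    echo "clean"
  fi
}

# Print one status line per repo; return non-zero if any repo is not clean.
check_repos() {
  local rc=0 dir state
  for dir in "$@"; do
    state="$(repo_state "$dir")"
    printf '%-8s %s\n' "$state" "$dir"
    [ "$state" = "clean" ] || rc=1
  done
  return "$rc"
}
```

With the array from `tools/cmd/railiance-preflight`, the call would be `check_repos "${REPOS[@]}"`.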
---
### T02 — Fix stale repo references in backup-restore.md
```task
id: T02
status: done
priority: medium
```
Update restore procedure: `railiance-bootstrap` → `railiance-cluster`,
`railiance-hosts` → `railiance-infra`, add the three new OAS repos.
**Done when:** doc accurately reflects the current 5-repo OAS stack.
---
### T03 — Add make backup and make preflight targets
```task
id: T03
status: done
priority: medium
```
Add to root Makefile so the safety net is discoverable from `make help`.
**Done when:** `make backup` and `make preflight` both work.
---
### T04 — Run current backup and verify upload
```task
id: T04
status: done
priority: high
```
Run `bin/railiance backup` and confirm both DB and config files appear
in the Nextcloud file drop.
**Done when:** backup completes without error and `.last-backup` stamp is fresh.
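
A freshness test for the stamp could be sketched as follows. It assumes the backup updates the modification time of `.last-backup` on success and that "fresh" means younger than `MAX_AGE_HOURS` (24 in the preflight script); `stamp_is_fresh` is a hypothetical name, and the stamp's location is not stated in this commit.

```bash
# Succeed if stamp file $1 exists and was modified within the last $2 hours.
stamp_is_fresh() {
  local stamp="$1" max_hours="${2:-24}"
  [ -f "$stamp" ] || return 1
  # find prints the path only when its mtime is newer than the cutoff.
  [ -n "$(find "$stamp" -mmin "-$((max_hours * 60))")" ]
}
```

A call such as `stamp_is_fresh "$BACKUP_DIR/.last-backup" "$MAX_AGE_HOURS"` would then gate the preflight, with the stamp path being an assumption.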
---
### T05 — Verify or install cron job
```task
id: T05
status: todo
priority: medium
```
Confirm that the daily 02:00 cron job is installed and has run at least once:
```bash
crontab -l | grep railiance
tail -n 20 ~/.cache/railiance/backup.log
```
If missing, install:
```bash
(crontab -l 2>/dev/null; echo "0 2 * * * /home/worsch/railiance-cluster/bin/railiance backup >> ~/.cache/railiance/backup.log 2>&1") | crontab -
```
**Done when:** cron is listed and log shows a successful run.
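
The install command above appends the entry every time it runs, so running it twice leaves two identical jobs. A hedged sketch of an idempotent variant, using a hypothetical filter `ensure_cron_line` that adds the line only when it is absent:

```bash
# Read an existing crontab on stdin and print it with line $1 present exactly once.
ensure_cron_line() {
  local line="$1" current
  current="$(cat)"
  if printf '%s\n' "$current" | grep -qxF -- "$line"; then
    printf '%s\n' "$current"             # already installed: pass through
  elif [ -n "$current" ]; then
    printf '%s\n%s\n' "$current" "$line" # append to existing entries
  else
    printf '%s\n' "$line"                # empty crontab: line becomes the only entry
  fi
}
```

It would be used as `crontab -l 2>/dev/null | ensure_cron_line "$JOB" | crontab -`, where `JOB` holds the full `0 2 * * * ...` entry shown above.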
---
### T06 — Run restore drill
```task
id: T06
status: todo
priority: medium
```
Run the minimal restore drill from `docs/backup-restore.md` against the
current backup. Record completion in `~/.cache/railiance/restore-drill.log`.
**Done when:** drill exits 0 and log entry is written.
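
The "exits 0 and log entry is written" criterion can be met with a small wrapper that runs the drill and records its exit status. This is a sketch under assumptions: `run_and_log` is a hypothetical helper, the log line format is invented for illustration, and the drill command itself is whatever `docs/backup-restore.md` prescribes.

```bash
# Run a command, append a timestamped result line to log $1, return the command's status.
run_and_log() {
  local log="$1" rc=0
  shift
  mkdir -p "$(dirname "$log")"
  "$@" || rc=$?
  printf '%s exit=%d cmd=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$rc" "$*" >> "$log"
  return "$rc"
}
```

Failed drills are logged too, so the file doubles as a history of attempts rather than only of successes.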
---
## References
- Decision D2: Nextcloud as backup destination (`DECISIONS.md`)
- Backup tooling: `tools/cmd/railiance-backup`, `tools/cmd/railiance-preflight`
- Restore procedure: `docs/backup-restore.md`
- Extension points: EP-RAIL-003 (git bare mirrors), EP-RAIL-004 (secondary offsite copy)