railiance-cluster/workplans/RAIL-BS-WP-0004-safety-net.md
tegwick 75467673a8 feat(safety-net): create WP-0004, update preflight for OAS 5-repo layout
- workplans/RAIL-BS-WP-0004-safety-net.md: ADR-001 workplan file for
  current-env-safety-net workstream (7e8b0c20), T01-T04 done, T05-T06 todo
- tools/cmd/railiance-preflight: update REPOS to OAS S1-S5 stack
  (railiance-infra/cluster/platform/enablement/apps) + project repos;
  remove stale railiance-bootstrap reference
- docs/backup-restore.md: fix Step 5 clone commands to current repo names
- Makefile: add make backup and make preflight targets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 15:21:29 +01:00


id: RAIL-BS-WP-0004
type: workplan
title: Current-Environment Safety Net
domain: railiance
repo: railiance-cluster
status: active
owner: tegwick
topic_slug: railiance
state_hub_workstream_id: 7e8b0c20-51eb-40c9-9e3b-85dd380d7625
created: 2026-02-25
updated: 2026-03-10

Current-Environment Safety Net

Goal

Ensure that backup and disaster recovery for the current single-server environment are operational and tested before any ThreePhoenix infrastructure migration work begins. This workplan is aligned to OAS Stack S2 (railiance-cluster owns backup tooling).

Context

The backup toolchain lives in tools/cmd/railiance-backup and tools/cmd/railiance-preflight, dispatched via bin/railiance. It protects:

| Asset | Method | Risk without backup |
|---|---|---|
| Custodian State Hub DB | pg_dump → age → Nextcloud | Total loss of workstreams, decisions, history |
| Claude config + memory | tar → age → Nextcloud | Loss of MCP registration, project memory |
| Git repos | Gitea remotes | SPOF: Gitea runs on the same server being migrated |

Decision D2: Nextcloud upload-only file drop as backup destination.

OAS Alignment

Per ADR-003, backup tooling lives in S2 (railiance-cluster). The preflight check covers all five OAS stack repos:

| Repo | OAS Layer |
|---|---|
| railiance-infra | S1: OS & Provisioning |
| railiance-cluster | S2: Kubernetes Runtime |
| railiance-platform | S3: Platform Services |
| railiance-enablement | S4: Developer Tooling |
| railiance-apps | S5: Workloads & Endpoints |

Plus cross-domain repos: the-custodian, markitect_project, activity-core, net-kingdom, issue-facade, binect-js, kaizen-agentic.

Boundary

  • Backup execution: this repo (bin/railiance backup)
  • Backup destination: Nextcloud file drop (URL in ~/.config/railiance/nc-upload-url or hardcoded)
  • Restore procedure: docs/backup-restore.md


Tasks

T01 — Update preflight repo list to OAS 5-repo layout

id: T01
status: done
priority: high

Update tools/cmd/railiance-preflight REPOS array: remove railiance-bootstrap, add railiance-infra, railiance-cluster, railiance-platform, railiance-enablement, railiance-apps. Add all active project repos.

Done when: bin/railiance preflight checks all current repos.
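
The source does not say what the preflight verifies for each repo, so the following is only a minimal sketch of one plausible check: that every listed repo exists as a Git working copy under a single base directory. The function name `missing_repos`, the base-directory argument, and the space-separated `REPOS` string (used instead of an array to stay POSIX-compatible) are assumptions for illustration, not the actual contents of tools/cmd/railiance-preflight.

```shell
# Hypothetical sketch: list repos that are NOT present under a base directory.
# The repo names come from the OAS table above; everything else is assumed.
REPOS="railiance-infra railiance-cluster railiance-platform railiance-enablement railiance-apps"

# missing_repos BASE_DIR REPO...
# Prints one line per repo that has no .git entry under BASE_DIR.
missing_repos() {
  base=$1
  shift
  for repo in "$@"; do
    # -e rather than -d: .git is a file in worktrees and submodules.
    [ -e "$base/$repo/.git" ] || printf '%s\n' "$repo"
  done
}
```

Called as `missing_repos "$HOME" $REPOS` (unquoted on purpose, so the list splits into words), an empty result would mean every repo was found. The cross-domain project repos could be appended to the same list.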


T02 — Fix stale repo references in backup-restore.md

id: T02
status: done
priority: medium

Update restore procedure: rename railiance-bootstrap → railiance-cluster and railiance-hosts → railiance-infra, and add the three new OAS repos (railiance-platform, railiance-enablement, railiance-apps).

Done when: doc accurately reflects the current 5-repo OAS stack.


T03 — Add make backup and make preflight targets

id: T03
status: done
priority: medium

Add both targets to the root Makefile so the safety net is discoverable from make help.

Done when: make backup and make preflight both work.


T04 — Run current backup and verify upload

id: T04
status: done
priority: high

Run bin/railiance backup and confirm both DB and config files appear in the Nextcloud file drop.

Done when: backup completes without error and .last-backup stamp is fresh.
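
"Fresh" is not defined in the workplan. As a rough sketch, freshness could be checked by comparing the stamp file's modification time against a maximum age. The function name `stamp_is_fresh` and the default of 1500 minutes (25 hours, one daily cycle plus slack) are assumptions, and the stamp path is left as a parameter because the source does not say where .last-backup is written.

```shell
# Hypothetical sketch: succeed only if a stamp file was modified recently.
# stamp_is_fresh STAMP_FILE [MAX_AGE_MINUTES]
stamp_is_fresh() {
  stamp=$1
  max_minutes=${2:-1500}   # assumed default: 25 hours

  # A missing stamp counts as stale.
  [ -f "$stamp" ] || return 1

  # find prints the path only when it was modified less than
  # max_minutes ago. -mmin is not in POSIX but is supported by
  # GNU, BSD, and BusyBox find.
  [ -n "$(find "$stamp" -mmin "-$max_minutes")" ]
}
```

A usage such as `stamp_is_fresh "$STAMP" || echo "backup stamp is stale"` would fit a cron wrapper or a preflight step, with `STAMP` set to wherever the tool writes the file.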


T05 — Verify or install cron job

id: T05
status: todo
priority: medium

Confirm that the daily 02:00 cron job is installed and has run at least once:

crontab -l | grep railiance
tail -n 20 ~/.cache/railiance/backup.log

If missing, install:

(crontab -l 2>/dev/null; echo "0 2 * * * /home/worsch/railiance-cluster/bin/railiance backup >> ~/.cache/railiance/backup.log 2>&1") | crontab -

Done when: cron is listed and log shows a successful run.
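
The install command above appends the entry unconditionally, so running it twice would schedule the backup twice. One way to make the step repeatable is a small filter that adds the line only when it is absent. This is a sketch, and the function name `ensure_cron_line` is hypothetical.

```shell
# Hypothetical sketch: idempotent append for a crontab.
# Reads the current crontab on stdin, writes the desired crontab to stdout.
# ensure_cron_line 'CRON ENTRY'
ensure_cron_line() {
  entry=$1
  current=$(cat)

  if printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
    # Exact line already present: pass the crontab through unchanged.
    printf '%s\n' "$current"
  elif [ -z "$current" ]; then
    # Empty or missing crontab: the entry becomes the only line.
    printf '%s\n' "$entry"
  else
    printf '%s\n%s\n' "$current" "$entry"
  fi
}
```

It would be wired up as `crontab -l 2>/dev/null | ensure_cron_line "$ENTRY" | crontab -`, with `ENTRY` holding the same `0 2 * * * ...` line as above. The match is on the whole line (`grep -Fx`), so an entry that differs by even one character is treated as new.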


T06 — Run restore drill

id: T06
status: todo
priority: medium

Run the minimal restore drill from docs/backup-restore.md against the current backup. Record completion in ~/.cache/railiance/restore-drill.log.

Done when: drill exits 0 and log entry is written.
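
The drill command itself is defined in docs/backup-restore.md and is not reproduced here. The sketch below only covers the bookkeeping around it: run whatever command is passed in, append a timestamped line with its exit status to the log, and propagate that status. The function name `record_drill` and the log line format are assumptions.

```shell
# Hypothetical sketch: run a drill command and record its outcome.
# record_drill LOG_FILE COMMAND [ARGS...]
record_drill() {
  log=$1
  shift
  mkdir -p "$(dirname "$log")"

  # Capture the exit status without aborting under `set -e`.
  status=0
  "$@" || status=$?

  # One line per drill: UTC timestamp plus exit code.
  printf '%s restore-drill exit=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" >> "$log"

  return "$status"
}
```

Invoked as `record_drill ~/.cache/railiance/restore-drill.log <drill command>`, a failed drill still leaves a log entry, so the "exits 0" and "log entry is written" parts of the done criterion can be checked separately.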


References

  • Decision D2: Nextcloud as backup destination (DECISIONS.md)
  • Backup tooling: tools/cmd/railiance-backup, tools/cmd/railiance-preflight
  • Restore procedure: docs/backup-restore.md
  • Extension points: EP-RAIL-003 (git bare mirrors), EP-RAIL-004 (secondary offsite copy)