feat: backup + preflight commands, decisions log, gitignore update

- tools/cmd/railiance-backup: pg_dump + config snapshot, age-encrypted,
  uploaded to Nextcloud file drop via curl PUT. Daily cron target.
- tools/cmd/railiance-preflight: pre-migration safety gate — checks backup
  freshness, all repos clean/pushed, age key present.
- bin/railiance: added backup and preflight subcommands.
- DECISIONS.md: decision log (D1 ingress Nginx+Traefik, D2 Nextcloud backup).
- .gitignore: exclude *backup-dropoff-link* files (contain upload tokens).
- CLAUDE.md: state hub session protocol update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-02-25 23:59:28 +01:00
parent eb8a6902b6
commit 4381a079a2
6 changed files with 200 additions and 2 deletions

3
.gitignore vendored
View File

@@ -69,6 +69,9 @@ htmlcov/
# (token + credentials must *never* be committed)
.railiance_gitea.conf
# Backup dropoff links (contain upload tokens)
*backup-dropoff-link*
# IDE configs
.vscode/
.idea/

View File

@@ -20,14 +20,22 @@ Call the tool first, then respond based on what you find.
```
cd ~/the-custodian/state-hub && make api
```
2. Check whether the `railiance` topic has any open workstreams in the summary.
2. Call `get_next_steps()` — surfaces contextual suggestions from recently resolved
decisions and cleared workstream dependencies. Act on these before starting new work.
3. Check whether the `railiance` topic has any open workstreams in the summary.
- **If workstreams exist:** review blocking decisions before starting work.
- **If no workstreams exist:** follow the First Session Protocol below.
**During work:**
- Use `create_task()` / `update_task_status()` to track concrete deliverables.
- Use `record_decision()` for any decision that affects direction or dependencies.
- Use `add_progress_event()` for notable events (milestones, blockers, insights).
- Use `resolve_decision()` to close a decision once the choice is made — this is one
of the two sanctioned write operations in the hub.
> **Design boundary:** The State Hub is a *read model*. Two write operations are
> permanently sanctioned: **Resolving Decisions** and **Suggesting Next Steps**.
> The bootstrap tools (`create_workstream`, `create_task`, `update_task_status`) are
> only for First Session Protocol. Formal work structure belongs in the domain repo.
**At the end of every session:**
- Call `add_progress_event()` with a summary of what was accomplished or decided.

21
DECISIONS.md Normal file
View File

@@ -0,0 +1,21 @@
# Decision Log
_Auto-generated by the Custodian State Hub._
## D1 — Ingress controller: Traefik (K3s default) vs Nginx for ThreePhoenix
**Date:** 2026-02-25
**Decided by:** Tegwick
I want to go with C and separate concerns. Nginx for external SSL will need security and functional updates on a completly different schedule to Traefik canary and production workload splitting. The second area of implementation is more complicated, volatile and will need time to settle.
---
## D2 — Durable offsite backup destination for single-server safety net
**Date:** 2026-02-25
**Decided by:** Tegwick
We will use cloud storage the backup should be encypted to be safe regardless of the location and provider and for starters I will provide a nextcloud upload space as a backend.
---

4
bin/railiance Normal file → Executable file
View File

@@ -19,6 +19,8 @@ Commands:
build-spore Build a distributable "Spore" bundle
seed-local Run the seed script on this machine
checklist Pre-VM checklist
backup Backup postgres + config to Nextcloud (age-encrypted)
preflight Pre-migration safety gate (must pass before cluster work)
help Show this help
EOF
}
@@ -54,5 +56,7 @@ Rent-a-VM Checklist
CK
;;
next) cat "$ROOT/QUICKSTART.md" ;;
backup) exec railiance-backup "$@" ;;
preflight) exec railiance-preflight "$@" ;;
*) echo "Unknown command: $cmd" >&2; usage; exit 2 ;;
esac

70
tools/cmd/railiance-backup Executable file
View File

@@ -0,0 +1,70 @@
#!/usr/bin/env bash
# tools/cmd/railiance-backup — backup custodian state to Nextcloud file drop
set -euo pipefail
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
source "${ROOT}/lib/railiance-print.sh"
# ── Configuration ─────────────────────────────────────────────────────────────
AGE_PUBLIC_KEY="age1zvryunvjhvpkmasskauga2heeg0ztnte9ymgppvjge36ekumk50syr3tsz"
NC_WEBDAV_URL="https://nx4069.your-storageshare.de/public.php/webdav"
NC_TOKEN="MfTBEjcJTGS8Ywo"
PG_CONTAINER="infra-postgres-1"
PG_USER="custodian"
PG_DB="custodian"
BACKUP_DIR="${XDG_CACHE_HOME:-$HOME/.cache}/railiance/backups"
TS="$(date -u +%Y%m%dT%H%M%SZ)"
# ── Helpers ───────────────────────────────────────────────────────────────────
nc_upload() {
local file="$1" name="$2"
curl -sf -u "${NC_TOKEN}:" -T "$file" "${NC_WEBDAV_URL}/${name}" \
|| { bad "upload" "failed to upload ${name}"; exit 1; }
}
# ── Main ──────────────────────────────────────────────────────────────────────
mkdir -p "$BACKUP_DIR"
print_hdr "Railiance Backup — ${TS}"
# 1. PostgreSQL
ok "postgres" "dumping ${PG_DB}…"
TMP_DB="${BACKUP_DIR}/db-${TS}.sql.age"
docker exec "${PG_CONTAINER}" pg_dump -U "${PG_USER}" "${PG_DB}" \
| age -r "${AGE_PUBLIC_KEY}" -o "${TMP_DB}"
ok "postgres" "encrypted → $(basename "$TMP_DB")"
nc_upload "${TMP_DB}" "db-${TS}.sql.age"
ok "postgres" "uploaded to Nextcloud"
# 2. Config snapshot
ok "config" "snapshotting ~/.claude/ ~/.claude.json .gitconfig…"
TMP_CFG_TAR="${BACKUP_DIR}/config-${TS}.tar.gz"
TMP_CFG="${BACKUP_DIR}/config-${TS}.tar.gz.age"
tar -czf "${TMP_CFG_TAR}" \
-C "$HOME" \
--ignore-failed-read \
.claude \
.claude.json \
.gitconfig \
2>/dev/null || true
age -r "${AGE_PUBLIC_KEY}" -o "${TMP_CFG}" "${TMP_CFG_TAR}"
rm -f "${TMP_CFG_TAR}"
ok "config" "encrypted → $(basename "$TMP_CFG")"
nc_upload "${TMP_CFG}" "config-${TS}.tar.gz.age"
ok "config" "uploaded to Nextcloud"
# 3. Prune local cache (keep last 7 of each type)
find "${BACKUP_DIR}" -name "db-*.sql.age" | sort -r | tail -n +8 | xargs -r rm -f
find "${BACKUP_DIR}" -name "config-*.tar.gz.age" | sort -r | tail -n +8 | xargs -r rm -f
ok "cache" "local copies pruned (keep last 7)"
# 4. Write stamp (preflight reads this)
echo "${TS}" > "${BACKUP_DIR}/.last-backup"
echo
ok "done" "Backup complete — ${TS}"
echo " DB: db-${TS}.sql.age"
echo " Config: config-${TS}.tar.gz.age"
echo
echo " ⚠️ The age private key at ~/.config/age/railiance-backup.key"
echo " MUST also be stored in your password manager or written down."
echo " Without it you cannot decrypt these backups."

92
tools/cmd/railiance-preflight Executable file
View File

@@ -0,0 +1,92 @@
#!/usr/bin/env bash
# tools/cmd/railiance-preflight — pre-migration safety gate
# Exit 0 = safe to proceed. Exit 1 = do NOT touch infrastructure.
set -euo pipefail
ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
source "${ROOT}/lib/railiance-print.sh"
# ── Configuration ─────────────────────────────────────────────────────────────
BACKUP_DIR="${XDG_CACHE_HOME:-$HOME/.cache}/railiance/backups"
MAX_AGE_HOURS=24
REPOS=(
/home/worsch/issue-facade
/home/worsch/binect-js
/home/worsch/kaizen-agentic
/home/worsch/railiance-bootstrap
/home/worsch/the-custodian
/home/worsch/markitect_project
)
# ── Helpers ───────────────────────────────────────────────────────────────────
fail=0
pass() { ok "$1" "$2"; }
fail() { bad "$1" "$2"; fail=1; }
backup_age_hours() {
local file="$1"
echo $(( ( $(date +%s) - $(stat -c %Y "$file") ) / 3600 ))
}
# ── Checks ────────────────────────────────────────────────────────────────────
print_hdr "Railiance Preflight Check"
# 1. DB backup freshness
latest_db="$(find "${BACKUP_DIR}" -name "db-*.sql.age" 2>/dev/null | sort -r | head -1)"
if [[ -z "$latest_db" ]]; then
fail "db-backup" "no backup found — run: bin/railiance backup"
else
h="$(backup_age_hours "$latest_db")"
if [[ $h -lt $MAX_AGE_HOURS ]]; then
pass "db-backup" "fresh (${h}h old) — $(basename "$latest_db")"
else
fail "db-backup" "stale (${h}h old) — run: bin/railiance backup"
fi
fi
# 2. Config backup freshness
latest_cfg="$(find "${BACKUP_DIR}" -name "config-*.tar.gz.age" 2>/dev/null | sort -r | head -1)"
if [[ -z "$latest_cfg" ]]; then
fail "cfg-backup" "no backup found — run: bin/railiance backup"
else
h="$(backup_age_hours "$latest_cfg")"
if [[ $h -lt $MAX_AGE_HOURS ]]; then
pass "cfg-backup" "fresh (${h}h old)"
else
fail "cfg-backup" "stale (${h}h old) — run: bin/railiance backup"
fi
fi
# 3. Git repo state
for repo in "${REPOS[@]}"; do
name="$(basename "$repo")"
if [[ ! -d "$repo/.git" ]]; then
fail "git:${name}" "not a git repo at ${repo}"
continue
fi
if ! git -C "$repo" diff --quiet 2>/dev/null || \
! git -C "$repo" diff --cached --quiet 2>/dev/null; then
fail "git:${name}" "uncommitted changes"
elif [[ -n "$(git -C "$repo" log --oneline '@{u}..' 2>/dev/null || true)" ]]; then
fail "git:${name}" "unpushed commits"
else
pass "git:${name}" "clean and pushed"
fi
done
# 4. age key present
if [[ -f "${HOME}/.config/age/railiance-backup.key" ]]; then
pass "age-key" "present at ~/.config/age/railiance-backup.key"
else
fail "age-key" "missing — run: age-keygen -o ~/.config/age/railiance-backup.key"
fi
# ── Summary ───────────────────────────────────────────────────────────────────
echo
if [[ $fail -eq 0 ]]; then
ok "PREFLIGHT" "all checks passed — safe to proceed"
exit 0
else
bad "PREFLIGHT" "checks failed — do NOT proceed with infrastructure work"
exit 1
fi