3 Commits

Author SHA1 Message Date
5b0cfbf10a feat(backup): revise WP-0004 — integrated backup per capability (D4)
Some checks failed
railiance-tests / smoke (push) Has been cancelled
WP-0004 rewritten: scope narrowed to S2-owned assets (etcd snapshots,
Helm values, kubeconfig). No external dependencies. age encryption
reuses SOPS key pair. Output to /opt/backup/railiance/cluster/.

DECISIONS.md D4: integrated backup per capability, not centralized.
EP-RAIL-005 registered in state hub: custodian orchestration deferred
until all layers implement the standard interface.

The old monolithic backup (custodian DB + operator config) was not S2's
concern and has been removed from this workplan scope.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 17:43:30 +01:00
359d5b8b5b bug(gitea): report pgpool CrashLoopBackOff on HA failover + D3 testing policy
Some checks failed
railiance-tests / smoke (push) Has been cancelled
Add RAIL-BS-WP-0003 documenting the 2026-03-10 incident where a PostgreSQL
HA failover caused pgpool to enter CrashLoopBackOff due to a missing
pgpool-password key in the gitea-postgresql-ha-postgresql secret — a bug
present since initial deployment but hidden by the lack of any pod restart.

Add Decision D3: HA and failover scenarios must be tested before a workplan
is considered done. Any HA component deployment requires a passing failover
test script in tests/ and complete Helm values before status = completed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 13:03:36 +00:00
4381a079a2 feat: backup + preflight commands, decisions log, gitignore update
- tools/cmd/railiance-backup: pg_dump + config snapshot, age-encrypted,
  uploaded to Nextcloud file drop via curl PUT. Daily cron target.
- tools/cmd/railiance-preflight: pre-migration safety gate — checks backup
  freshness, all repos clean/pushed, age key present.
- bin/railiance: added backup and preflight subcommands.
- DECISIONS.md: decision log (D1 ingress Nginx+Traefik, D2 Nextcloud backup).
- .gitignore: exclude *backup-dropoff-link* files (contain upload tokens).
- CLAUDE.md: state hub session protocol update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 23:59:28 +01:00