Commit Graph

2 Commits

Author SHA1 Message Date
359d5b8b5b bug(gitea): report pgpool CrashLoopBackOff on HA failover + D3 testing policy
Some checks failed
railiance-tests / smoke (push) Has been cancelled
Add RAIL-BS-WP-0003 documenting the 2026-03-10 incident where a PostgreSQL
HA failover caused pgpool to enter CrashLoopBackOff due to a missing
pgpool-password key in the gitea-postgresql-ha-postgresql secret — a bug
present since initial deployment but hidden by the lack of any pod restart.

Add Decision D3: HA and failover scenarios must be tested before a workplan
is considered done. Any HA component deployment requires a passing failover
test script in tests/ and complete Helm values before status = completed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 13:03:36 +00:00
4381a079a2 feat: backup + preflight commands, decisions log, gitignore update
- tools/cmd/railiance-backup: pg_dump + config snapshot, age-encrypted,
  uploaded to Nextcloud file drop via curl PUT. Daily cron target.
- tools/cmd/railiance-preflight: pre-migration safety gate — checks backup
  freshness, all repos clean/pushed, age key present.
- bin/railiance: added backup and preflight subcommands.
- DECISIONS.md: decision log (D1 ingress Nginx+Traefik, D2 Nextcloud backup).
- .gitignore: exclude *backup-dropoff-link* files (contain upload tokens).
- CLAUDE.md: state hub session protocol update.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 23:59:28 +01:00