Backup and restore drill

2026-05-02 23:56:20 +02:00
parent d152881198
commit 49696cb0c2
4 changed files with 63 additions and 3 deletions

@@ -0,0 +1,53 @@
---
id: 2026-05-02-state-hub-wsl2-restore-drill
type: restore-drill
domain: custodian
repo: the-custodian
workplan: CUST-WP-0011
task: T01
created: "2026-05-02"
author: codex
---
# State Hub WSL2 Restore Drill — 2026-05-02
## Summary
Completed the CUST-WP-0011 T01 pre-migration safety drill. A fresh SQL dump of
the live WSL2 State Hub PostgreSQL database was restored into a disposable
PostgreSQL 16 container and verified through the State Hub application.
## Source
- Live container: `infra-postgres-1`
- Source database: `custodian`
- Temporary dump artifact: `/tmp/state-hub-restore-drill/state-hub-drill.sql.gz`
(removed after verification because it was an unencrypted drill artifact)
- Restore container: `state-hub-restore-test`
- Restore endpoint: `127.0.0.1:5433`
## Verification
- Restore command exited 0.
- Production and restored table row counts matched exactly.
- State Hub app pointed at the restored DB returned:
  - `/state/health`: HTTP 200, DB connected.
  - `/state/summary`: HTTP 200.
Key restored counts:
| Table | Rows |
|---|---:|
| workstreams | 117 |
| tasks | 989 |
| progress_events | 1423 |
| token_events | 208 |
| managed_repos | 19 |
| sbom_entries | 2257 |
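The row-count check can be sketched as a small comparison helper. This is a minimal sketch, not the script used in the drill: `diff_row_counts` is a hypothetical name, and the production counts are assumed to equal the restored counts in the table, since the report states they matched exactly.

```python
from typing import Dict, Optional, Tuple

# Restored counts copied from the table above.
RESTORED_COUNTS: Dict[str, int] = {
    "workstreams": 117,
    "tasks": 989,
    "progress_events": 1423,
    "token_events": 208,
    "managed_repos": 19,
    "sbom_entries": 2257,
}

Mismatch = Tuple[Optional[int], Optional[int]]


def diff_row_counts(
    production: Dict[str, int], restored: Dict[str, int]
) -> Dict[str, Mismatch]:
    """Return {table: (production_count, restored_count)} for every mismatch.

    A table missing on one side is reported with None for that side.
    """
    mismatches: Dict[str, Mismatch] = {}
    for table in sorted(set(production) | set(restored)):
        prod_rows = production.get(table)
        rest_rows = restored.get(table)
        if prod_rows != rest_rows:
            mismatches[table] = (prod_rows, rest_rows)
    return mismatches


if __name__ == "__main__":
    # Assumption: production counts equal the restored counts, per the report.
    production_counts = dict(RESTORED_COUNTS)
    problems = diff_row_counts(production_counts, RESTORED_COUNTS)
    print(problems if problems else "row counts match")
```

An empty result means every table matched. Reporting missing tables separately from differing counts makes a partial restore easier to spot than a plain equality check would.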
## Notes
State Hub does not yet have a dedicated `make backup` / `make restore` target.
This drill used direct `pg_dump` and `psql` via Docker, which proves the data
path but should be wrapped in first-class commands before the railiance01
cutover.
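One possible shape for such a wrapper is sketched below. It is illustrative only and does not reproduce the commands run in the drill: the function names are hypothetical, the `postgres` role is an assumption (the report does not name the database user), and running `psql` inside the restore container via `docker exec -i` is one option among several (the drill may instead have connected through `127.0.0.1:5433`).

```python
import gzip
import shutil
import subprocess
from typing import List


def dump_command(
    container: str = "infra-postgres-1",
    database: str = "custodian",
    user: str = "postgres",  # assumption: role name is not given in the report
) -> List[str]:
    """Build the argv that runs pg_dump inside the live container."""
    return ["docker", "exec", container, "pg_dump", "-U", user, "-d", database]


def restore_command(
    container: str = "state-hub-restore-test",
    database: str = "custodian",
    user: str = "postgres",  # assumption, as above
) -> List[str]:
    """Build the argv that feeds SQL on stdin to psql in the restore container."""
    return [
        "docker", "exec", "-i", container,
        "psql", "-U", user, "-d", database,
        "-v", "ON_ERROR_STOP=1",  # make psql exit non-zero on the first SQL error
    ]


def backup(dump_path: str) -> None:
    """Stream pg_dump output into a gzip-compressed file."""
    proc = subprocess.Popen(dump_command(), stdout=subprocess.PIPE)
    assert proc.stdout is not None
    with gzip.open(dump_path, "wb") as out:
        shutil.copyfileobj(proc.stdout, out)
    if proc.wait() != 0:
        raise RuntimeError("pg_dump exited with a non-zero status")


def restore(dump_path: str) -> None:
    """Decompress the dump and pipe it into psql."""
    proc = subprocess.Popen(restore_command(), stdin=subprocess.PIPE)
    assert proc.stdin is not None
    with gzip.open(dump_path, "rb") as src:
        shutil.copyfileobj(src, proc.stdin)
    proc.stdin.close()
    if proc.wait() != 0:
        raise RuntimeError("psql exited with a non-zero status")
```

Keeping the argv builders separate from the functions that spawn processes lets the commands be reviewed and unit-tested without Docker. A `make backup` or `make restore` target could then call `backup(...)` or `restore(...)` with an explicit dump path.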

@@ -4,7 +4,7 @@ type: workplan
title: "Ad Hoc Tasks — 2026-05-02"
domain: custodian
repo: the-custodian
-status: active
+status: done
owner: custodian
topic_slug: custodian
created: "2026-05-02"

@@ -105,9 +105,10 @@ Resolve these before T04/T05 can become live migration work:
```task
id: T01
-status: todo
+status: done
priority: high
state_hub_task_id: "b0caf112-dc1d-43a8-9f27-d627dd4aa2bf"
completed: "2026-05-02"
```
Take a fresh State Hub backup from the current WSL2 instance and restore it
@@ -124,6 +125,12 @@ Minimum checks:
**Done when:** backup and restore are proven within 24 hours of live migration
work.
Result: completed 2026-05-02. A fresh dump from `infra-postgres-1` was restored
into the disposable container `state-hub-restore-test` on `127.0.0.1:5433`.
Application health and summary checks against the restored database returned
HTTP 200. Restored row counts matched production exactly, including 117
workstreams, 989 tasks, 1423 progress events, and 208 token events.
---
### T02 — Align with Railiance deployment plan

@@ -4,7 +4,7 @@ type: workplan
title: "State Hub Full ThreePhoenix HA Migration"
domain: custodian
repo: the-custodian
-status: proposed
+status: active
owner: custodian
topic_slug: custodian
created: "2026-05-02"