Backup and restore drill
memory/episodic/2026-05-02-state-hub-wsl2-restore-drill.md (normal file, 53 lines)
@@ -0,0 +1,53 @@
---
id: 2026-05-02-state-hub-wsl2-restore-drill
type: restore-drill
domain: custodian
repo: the-custodian
workplan: CUST-WP-0011
task: T01
created: "2026-05-02"
author: codex
---

# State Hub WSL2 Restore Drill — 2026-05-02

## Summary

Completed the CUST-WP-0011 T01 pre-migration safety drill. A fresh SQL dump of
the live WSL2 State Hub PostgreSQL database was restored into a disposable
PostgreSQL 16 container and verified through the State Hub application.

## Source

- Live container: `infra-postgres-1`
- Source database: `custodian`
- Temporary dump artifact: `/tmp/state-hub-restore-drill/state-hub-drill.sql.gz`
  (removed after verification because it was an unencrypted drill artifact)
- Restore container: `state-hub-restore-test`
- Restore endpoint: `127.0.0.1:5433`

## Verification

- Restore command exited 0.
- Production and restored table row counts matched exactly.
- State Hub app pointed at the restored DB returned:
  - `/state/health`: HTTP 200, DB connected.
  - `/state/summary`: HTTP 200.

Key restored counts:

| Table | Rows |
|---|---:|
| workstreams | 117 |
| tasks | 989 |
| progress_events | 1423 |
| token_events | 208 |
| managed_repos | 19 |
| sbom_entries | 2257 |

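The row-count check in the Verification section could be sketched as a small comparison helper. This is a minimal illustration, not the procedure the drill actually used: the note does not record how the counts were collected, and the function name `diff_row_counts` is hypothetical. The sample figures are the restored counts listed in the table.

```python
from typing import Dict, Optional, Tuple

# Restored row counts as reported in the drill note's table.
RESTORED_COUNTS: Dict[str, int] = {
    "workstreams": 117,
    "tasks": 989,
    "progress_events": 1423,
    "token_events": 208,
    "managed_repos": 19,
    "sbom_entries": 2257,
}


def diff_row_counts(
    source: Dict[str, int], restored: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {table: (source_count, restored_count)} for every table that differs.

    A table present on only one side is reported with None for the missing side.
    (Hypothetical helper, not part of State Hub.)
    """
    mismatches: Dict[str, Tuple[Optional[int], Optional[int]]] = {}
    for table in sorted(set(source) | set(restored)):
        pair = (source.get(table), restored.get(table))
        if pair[0] != pair[1]:
            mismatches[table] = pair
    return mismatches
```

An empty result corresponds to the "matched exactly" outcome reported in the note. In practice the source counts would come from a separate query against `infra-postgres-1` rather than from a copy of the restored figures.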
## Notes

State Hub does not yet have a dedicated `make backup` / `make restore` target.
This drill used direct `pg_dump` and `psql` via Docker, which proves the data
path but should be wrapped in first-class commands before the railiance01
cutover.
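One way the `pg_dump` and `psql` calls mentioned in the Notes might be wrapped ahead of a `make backup` / `make restore` target is sketched below. It rests on several assumptions that the drill note does not confirm: the database role (`postgres` here is a placeholder), that a `custodian` database already exists in the restore container, and that both tools are invoked through `docker exec`. The helper names are hypothetical, and the exact commands used in the drill are not recorded.

```python
import gzip
import subprocess
from pathlib import Path
from typing import List

# Container and database names come from the drill note.
SOURCE_CONTAINER = "infra-postgres-1"
RESTORE_CONTAINER = "state-hub-restore-test"
DATABASE = "custodian"
# Assumption: the note does not record the database role; this is a placeholder.
DB_USER = "postgres"


def dump_command(container: str = SOURCE_CONTAINER,
                 database: str = DATABASE,
                 user: str = DB_USER) -> List[str]:
    """Build a `docker exec` call that writes a plain SQL dump to stdout."""
    return ["docker", "exec", container, "pg_dump", "-U", user, "-d", database]


def restore_command(container: str = RESTORE_CONTAINER,
                    database: str = DATABASE,
                    user: str = DB_USER) -> List[str]:
    """Build a `docker exec` call that feeds SQL from stdin into psql."""
    # ON_ERROR_STOP=1 makes psql exit non-zero on the first failed statement,
    # so a zero exit code is a meaningful signal.
    return ["docker", "exec", "-i", container, "psql",
            "-U", user, "-d", database, "-v", "ON_ERROR_STOP=1"]


def backup(dump_path: Path) -> None:
    """Dump the source database and gzip it to dump_path.

    The whole dump is held in memory, which is fine for a sketch but may not
    suit a large database.
    """
    result = subprocess.run(dump_command(), check=True, capture_output=True)
    with gzip.open(dump_path, "wb") as fh:
        fh.write(result.stdout)


def restore(dump_path: Path) -> None:
    """Replay a gzipped SQL dump into the restore container."""
    with gzip.open(dump_path, "rb") as fh:
        subprocess.run(restore_command(), input=fh.read(), check=True)
```

Keeping the command construction separate from execution makes the argument lists easy to inspect and test without a running Docker daemon. Because the dump is written unencrypted, a wrapper like this would still need the cleanup step the drill applied to its temporary artifact.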
@@ -4,7 +4,7 @@ type: workplan
 title: "Ad Hoc Tasks — 2026-05-02"
 domain: custodian
 repo: the-custodian
-status: active
+status: done
 owner: custodian
 topic_slug: custodian
 created: "2026-05-02"
@@ -105,9 +105,10 @@ Resolve these before T04/T05 can become live migration work:

 ```task
 id: T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "b0caf112-dc1d-43a8-9f27-d627dd4aa2bf"
+completed: "2026-05-02"
 ```

 Take a fresh State Hub backup from the current WSL2 instance and restore it
@@ -124,6 +125,12 @@ Minimum checks:
 **Done when:** backup and restore are proven within 24 hours of live migration
 work.

+Result: completed 2026-05-02. A fresh dump from `infra-postgres-1` restored
+into disposable container `state-hub-restore-test` on `127.0.0.1:5433`.
+Application health and summary checks against the restored database returned
+HTTP 200. Restored row counts matched production exactly, including 117
+workstreams, 989 tasks, 1423 progress events, and 208 token events.
+
 ---

 ### T02 — Align with Railiance deployment plan
@@ -4,7 +4,7 @@ type: workplan
 title: "State Hub Full ThreePhoenix HA Migration"
 domain: custodian
 repo: the-custodian
-status: proposed
+status: active
 owner: custodian
 topic_slug: custodian
 created: "2026-05-02"