Add isolated-namespace restore drill (CNPG cluster, PVC, orchestration script) and document successful 2026-07-04 run: production forgejo dump restored with health 200 and pilot repos visible via API. Scheduled backups remain open.
Forgejo Backup/Restore Drill Evidence
Date: 2026-07-04
Workplan: RAIL-HO-WP-0005
Task: RAIL-HO-WP-0005-T09
no_secret_material_recorded: true
Purpose
Prove that a production forgejo dump can be restored into an isolated
namespace and serve repository metadata without touching production Forgejo or
Gitea.
Backup source
| Field | Value |
|---|---|
| Method | forgejo dump from production pod |
| Production pod | forgejo-gitea-64c5b57684-ph9vt (namespace forgejo) |
| Archive path (workstation) | /tmp/forgejo-drill/forgejo-drill-backup.zip |
| Archive size | 12,284,847 bytes (~11.7 MiB) |
| Archive timestamp | 2026-07-04 11:20 +0200 |
| Archive contents (top-level) | repos/, data/, forgejo-db.sql, app.ini |
Repos present in dump: forgejo-actions-probe, glas-harness, key-cape
(all under repos/coulomb/).
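Before restoring, it can help to confirm that the archive really has the top-level entries listed in the table. The following is a minimal sketch, not part of tools/forgejo-restore-drill.sh: the helper name `check_dump_listing` is hypothetical, and it assumes the entry names are supplied one per line on stdin.

```shell
#!/usr/bin/env bash
# Sketch: verify that a forgejo dump listing has the expected top-level entries.
# check_dump_listing is a hypothetical helper, not part of the drill script.

# Reads entry names (one per line) on stdin, prints each missing entry,
# and returns non-zero if any expected top-level entry is absent.
check_dump_listing() {
  local listing missing=0 entry
  listing="$(cat)"
  for entry in repos/ data/ forgejo-db.sql app.ini; do
    case "$entry" in
      */) # directory: any path starting with this prefix counts
          if ! grep -q "^${entry}" <<<"$listing"; then
            echo "missing: $entry"; missing=1
          fi ;;
      *)  # file: must match a whole line exactly (fixed string)
          if ! grep -qxF "$entry" <<<"$listing"; then
            echo "missing: $entry"; missing=1
          fi ;;
    esac
  done
  return "$missing"
}
```

With Info-ZIP `unzip` installed, a possible invocation is `unzip -Z1 /tmp/forgejo-drill/forgejo-drill-backup.zip | check_dump_listing`, where `-Z1` lists entry names only.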
Restore target
| Field | Value |
|---|---|
| Namespace | forgejo-restore-drill |
| Database | CNPG cluster forgejo-db-restore (isolated, 1 instance) |
| App data PVC | forgejo-restore-data (local-path, 10Gi) |
| Helm release | forgejo-restore (gitea-charts/gitea 12.5.0) |
| Orchestration | tools/forgejo-restore-drill.sh |
Restore path (Forgejo 11.0.3 has no forgejo restore CLI):
- Unzip dump into import pod staging area.
- Copy repos/ → /data/git/gitea-repositories/.
- Copy data/ → /data/ (packages, attachments, avatars).
- Import forgejo-db.sql via psql into forgejo-db-restore.
- Deploy isolated Helm release bound to restored PVC + restore DB host.
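
The order of the restore steps above can be sketched as a dry-run script. This is an illustration under stated assumptions, not the contents of tools/forgejo-restore-drill.sh: the staging path, the `DB_*` values, and the values file name are hypothetical, and the sketch flattens steps that in the drill run in different places (import pod versus workstation). By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the restore order. STAGING, the DB_* variables and VALUES_FILE
# are assumptions for illustration only.
set -euo pipefail

STAGING="${STAGING:-/staging}"                    # hypothetical unzip target
DB_HOST="${DB_HOST:-forgejo-db-restore-rw}"       # assumed read-write service of the CNPG cluster
DB_USER="${DB_USER:-forgejo}"                     # assumed
DB_NAME="${DB_NAME:-forgejo}"                     # assumed
VALUES_FILE="${VALUES_FILE:-values-restore.yaml}" # hypothetical Helm values file
DRY_RUN="${DRY_RUN:-1}"                           # default: print commands only

# Print the command; execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# restore_steps <archive.zip>
restore_steps() {
  run unzip -q "$1" -d "$STAGING"
  run cp -a "$STAGING/repos/." /data/git/gitea-repositories/
  run cp -a "$STAGING/data/." /data/
  # ON_ERROR_STOP makes psql abort on the first failing statement
  run psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 \
    -f "$STAGING/forgejo-db.sql"
  run helm upgrade --install forgejo-restore gitea-charts/gitea --version 12.5.0 \
    --namespace forgejo-restore-drill -f "$VALUES_FILE"
}
```

Calling `restore_steps /tmp/forgejo-drill/forgejo-drill-backup.zip` prints the five commands in order; setting `DRY_RUN=0` would execute them, which only makes sense where the tools and paths actually exist.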
Post-restore checks (2026-07-04)
Port-forward: svc/forgejo-restore-gitea-http → 127.0.0.1:13000
| Check | Result |
|---|---|
| GET / health | HTTP 200 |
| GET /api/v1/repos/coulomb/glas-harness | full_name=coulomb/glas-harness, default_branch=main |
| GET /api/v1/repos/coulomb/key-cape | full_name=coulomb/key-cape, default_branch=main |
| GET /api/v1/orgs/coulomb/repos | 3 repos: forgejo-actions-probe, glas-harness, key-cape |
Script exit marker: restore-drill-complete
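The health and repository checks in the table can be re-run against the port-forward with a small script. This is a sketch only: `has_pair`, `check_repo`, and `run_checks` are hypothetical helpers, and the grep-based match is a crude stand-in for a real JSON parser such as jq. It assumes `curl` is installed and the port-forward on 127.0.0.1:13000 is active.

```shell
#!/usr/bin/env bash
# Sketch: re-run the post-restore checks against the port-forward.
# All function names here are hypothetical.
set -euo pipefail

BASE_URL="${BASE_URL:-http://127.0.0.1:13000}"

# Succeeds if the JSON on stdin contains "key":"value" (string values only;
# the value is interpreted as an extended regular expression).
has_pair() {
  grep -Eq "\"$1\":[[:space:]]*\"$2\""
}

# check_repo <owner/name>: expects matching full_name and default_branch=main.
check_repo() {
  local body
  body="$(curl -fsS "$BASE_URL/api/v1/repos/$1")" || return 1
  has_pair full_name "$1" <<<"$body" || return 1
  has_pair default_branch main <<<"$body" || return 1
}

run_checks() {
  # -o /dev/null -w '%{http_code}' makes curl print only the HTTP status code
  [ "$(curl -s -o /dev/null -w '%{http_code}' "$BASE_URL/")" = "200" ] || return 1
  check_repo coulomb/glas-harness || return 1
  check_repo coulomb/key-cape || return 1
  echo "post-restore checks passed"
}
```

Running `run_checks` while the port-forward is active prints the final message only if every check succeeds.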
RPO / RTO (drill scope)
| Metric | Observed / assumed |
|---|---|
| RPO (manual dump) | Point-in-time of forgejo dump execution; no scheduled backup yet |
| RTO (isolated restore) | ~3–5 minutes for CNPG ready + import + Helm deploy on railiance01 |
| Production impact | None — read-only dump from running pod; separate namespace |
Gaps (not closed by this drill)
- Scheduled backups: CNPG Backup CRs and off-cluster target not configured (kubectl cnpg plugin absent on workstation).
- Encryption at rest: dump stored locally on workstation for drill only; no approved backup target wired.
- Automation: forgejo dump is manual; T04/T09 still need cron/operator schedule and retention policy (T02 decision).
- Re-run hygiene: concurrent or repeat runs require DRILL_CLEAN=1 to wipe forgejo-restore-drill before import (SQL import is not idempotent).
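
Because the SQL import is not idempotent, a re-run needs a guard that refuses to import into an existing namespace unless DRILL_CLEAN=1 is set. The sketch below shows one way such a guard could look. It does not claim to mirror tools/forgejo-restore-drill.sh; `plan_namespace_action` and `prepare_namespace` are hypothetical names, and aborting when the namespace exists without DRILL_CLEAN=1 is an assumed policy.

```shell
#!/usr/bin/env bash
# Sketch of a DRILL_CLEAN guard. Function names and the abort policy are
# assumptions, not taken from the drill script.
set -euo pipefail

# plan_namespace_action <exists: yes|no> <clean: 0|1>
# Prints one of: create, wipe, abort.
plan_namespace_action() {
  local exists="$1" clean="${2:-0}"
  if [ "$exists" = "no" ]; then
    echo create
  elif [ "$clean" = "1" ]; then
    echo wipe
  else
    echo abort
  fi
}

prepare_namespace() {
  local ns="forgejo-restore-drill" exists=no
  if kubectl get namespace "$ns" >/dev/null 2>&1; then exists=yes; fi
  case "$(plan_namespace_action "$exists" "${DRILL_CLEAN:-0}")" in
    wipe)   kubectl delete namespace "$ns" --wait=true
            kubectl create namespace "$ns" ;;
    create) kubectl create namespace "$ns" ;;
    abort)  echo "namespace $ns exists; set DRILL_CLEAN=1 to wipe it" >&2
            return 1 ;;
  esac
}
```

Keeping the decision in a pure function separates the policy from the kubectl calls, so the policy can be checked without a cluster.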
Cleanup
After evidence capture, delete the drill namespace:
kubectl delete namespace forgejo-restore-drill --wait=true
Production Forgejo (forgejo namespace) and Gitea remain unchanged.
References
- infra/forgejo-restore-drill/forgejo-db-restore-cluster.yaml
- infra/forgejo-restore-drill/restore-job.yaml
- tools/forgejo-restore-drill.sh
- workplans/RAIL-HO-WP-0005-forgejo-production-migration.md (T09)