railiance-infra/docs/forgejo-restore-drill-evidence.md
tegwick 092315895f RAIL-HO-WP-0005-T09: Forgejo backup/restore drill assets and evidence
Add isolated-namespace restore drill (CNPG cluster, PVC, orchestration script)
and document successful 2026-07-04 run: production forgejo dump restored with
health 200 and pilot repos visible via API. Scheduled backups remain open.
2026-07-04 11:26:50 +02:00


Forgejo Backup/Restore Drill Evidence

Date: 2026-07-04
Workplan: RAIL-HO-WP-0005
Task: RAIL-HO-WP-0005-T09
no_secret_material_recorded: true

Purpose

Prove that a production forgejo dump can be restored into an isolated namespace and serve repository metadata without touching production Forgejo or Gitea.

Backup source

| Field | Value |
| --- | --- |
| Method | forgejo dump from production pod |
| Production pod | forgejo-gitea-64c5b57684-ph9vt (namespace forgejo) |
| Archive path (workstation) | /tmp/forgejo-drill/forgejo-drill-backup.zip |
| Archive size | 12,284,847 bytes (~11.7 MiB) |
| Archive timestamp | 2026-07-04 11:20 +0200 |
| Archive contents (top-level) | repos/, data/, forgejo-db.sql, app.ini |

Repos present in dump: forgejo-actions-probe, glas-harness, key-cape (all under repos/coulomb/).
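
The top-level layout listed above can be checked on the workstation before any restore starts. The following is a minimal sketch, assuming `unzip` and `awk` are available; the helper name `check_dump_listing` is hypothetical and is not part of tools/forgejo-restore-drill.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads an archive listing (one path per line) on stdin
# and confirms the four top-level entries named in this document are present.
check_dump_listing() {
  local listing entry missing=0
  listing="$(cat)"
  for entry in repos/ data/ forgejo-db.sql app.ini; do
    # A line counts as a match when it starts with the entry, so
    # repos/coulomb/key-cape.git/HEAD satisfies the "repos/" check.
    if ! printf '%s\n' "$listing" |
      awk -v e="$entry" 'index($0, e) == 1 { found = 1 } END { exit !found }'; then
      echo "missing: ${entry}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible invocation is `unzip -Z1 /tmp/forgejo-drill/forgejo-drill-backup.zip | check_dump_listing`, which returns non-zero and names each missing entry on stderr.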

Restore target

| Field | Value |
| --- | --- |
| Namespace | forgejo-restore-drill |
| Database | CNPG cluster forgejo-db-restore (isolated, 1 instance) |
| App data | PVC forgejo-restore-data (local-path, 10Gi) |
| Helm release | forgejo-restore (gitea-charts/gitea 12.5.0) |
| Orchestration | tools/forgejo-restore-drill.sh |

Restore path (Forgejo 11.0.3 has no forgejo restore CLI):

  1. Unzip dump into import pod staging area.
  2. Copy repos/ → /data/git/gitea-repositories/.
  3. Copy data/ → /data/ (packages, attachments, avatars).
  4. Import forgejo-db.sql via psql into forgejo-db-restore.
  5. Deploy isolated Helm release bound to restored PVC + restore DB host.
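
The five steps above can be sketched as shell commands. This is an illustration under stated assumptions, not the contents of tools/forgejo-restore-drill.sh. The import pod name `forgejo-import`, the staging directory `/staging`, the database name and user `forgejo`, and the values file `drill-values.yaml` are all hypothetical. It also assumes `unzip` and `psql` exist in the import pod and that database credentials are supplied separately (for example through `PGPASSWORD`). The host `forgejo-db-restore-rw` follows the CNPG convention of exposing a `<cluster>-rw` service.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical pod, staging path, DB name/user and values file.
NS=forgejo-restore-drill
POD=forgejo-import             # hypothetical import pod name
STAGE=/staging                 # hypothetical staging directory in the pod
DB_HOST=forgejo-db-restore-rw  # CNPG read-write service of the restore cluster

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restore_steps() {
  local zip="$1"
  # 1. Unzip the dump into the import pod staging area.
  run kubectl -n "$NS" cp "$zip" "$POD:$STAGE/dump.zip"
  run kubectl -n "$NS" exec "$POD" -- unzip -q "$STAGE/dump.zip" -d "$STAGE"
  # 2. Copy repos/ into the repository root.
  run kubectl -n "$NS" exec "$POD" -- cp -a "$STAGE/repos/." /data/git/gitea-repositories/
  # 3. Copy data/ (packages, attachments, avatars) into /data/.
  run kubectl -n "$NS" exec "$POD" -- cp -a "$STAGE/data/." /data/
  # 4. Import the SQL dump; stop at the first error instead of continuing.
  run kubectl -n "$NS" exec "$POD" -- psql -h "$DB_HOST" -U forgejo -d forgejo \
    -v ON_ERROR_STOP=1 -f "$STAGE/forgejo-db.sql"
  # 5. Deploy the isolated Helm release bound to the restored PVC and DB host.
  run helm upgrade --install forgejo-restore gitea-charts/gitea \
    --version 12.5.0 -n "$NS" -f drill-values.yaml
}
```

Calling `restore_steps /tmp/forgejo-drill/forgejo-drill-backup.zip` prints the planned commands; setting `DRY_RUN=0` would execute them. Printing by default keeps an accidental run from touching a cluster.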

Post-restore checks (2026-07-04)

Port-forward: svc/forgejo-restore-gitea-http → 127.0.0.1:13000

| Check | Result |
| --- | --- |
| GET / health | HTTP 200 |
| GET /api/v1/repos/coulomb/glas-harness | full_name=coulomb/glas-harness, default_branch=main |
| GET /api/v1/repos/coulomb/key-cape | full_name=coulomb/key-cape, default_branch=main |
| GET /api/v1/orgs/coulomb/repos | 3 repos: forgejo-actions-probe, glas-harness, key-cape |
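
Checks of this kind could be scripted roughly as follows. This is a hedged sketch, not the drill script's actual code: the function names are hypothetical, the remote service port 3000 in the comment is an assumption, and the endpoints are assumed to be readable without an API token. A plain `grep` stands in for a JSON parser to keep the example dependency-free; a tool such as `jq` would be more robust.

```shell
#!/usr/bin/env bash
# Sketch of the post-restore checks against the port-forwarded service.
# Assumes a port-forward is already running, for example:
#   kubectl -n forgejo-restore-drill port-forward \
#     svc/forgejo-restore-gitea-http 13000:3000   # remote port 3000 is assumed
BASE="${BASE:-http://127.0.0.1:13000}"

# Succeeds when a GET on the given path returns HTTP 200.
check_status_200() {
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' "$BASE$1")"
  [ "$code" = "200" ]
}

# Succeeds when the repo API response reports the expected full_name.
check_repo() {
  local full_name="$1"
  curl -fsS "$BASE/api/v1/repos/$full_name" |
    tr -d ' \n' |
    grep -qF "\"full_name\":\"$full_name\""
}
```

For example, `check_status_200 / && check_repo coulomb/glas-harness && check_repo coulomb/key-cape` exits non-zero as soon as one check fails.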

Script exit marker: restore-drill-complete

RPO / RTO (drill scope)

| Metric | Observed / assumed |
| --- | --- |
| RPO (manual dump) | Point-in-time of forgejo dump execution; no scheduled backup yet |
| RTO (isolated restore) | ~35 minutes for CNPG ready + import + Helm deploy on railiance01 |
| Production impact | None: read-only dump from running pod; separate namespace |

Gaps (not closed by this drill)

  • Scheduled backups: CNPG Backup CRs and off-cluster target not configured (kubectl cnpg plugin absent on workstation).
  • Encryption at rest: dump stored locally on workstation for drill only; no approved backup target wired.
  • Automation: forgejo dump is manual; T04/T09 still need cron/operator schedule and retention policy (T02 decision).
  • Re-run hygiene: concurrent or repeat runs require DRILL_CLEAN=1 to wipe forgejo-restore-drill before import (SQL import is not idempotent).
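
The re-run hygiene point can be illustrated with a small guard. This is a hypothetical sketch of how a `DRILL_CLEAN` check might look, not code copied from tools/forgejo-restore-drill.sh; the function name `prepare_namespace` is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical guard: refuse to import into an existing drill namespace
# unless the caller explicitly opts in with DRILL_CLEAN=1.
NS=forgejo-restore-drill

prepare_namespace() {
  if kubectl get namespace "$NS" >/dev/null 2>&1; then
    if [ "${DRILL_CLEAN:-0}" != "1" ]; then
      echo "namespace $NS exists; re-run with DRILL_CLEAN=1 to wipe it" >&2
      return 1
    fi
    # Wipe leftovers so the non-idempotent SQL import starts from empty.
    kubectl delete namespace "$NS" --wait=true
  fi
  kubectl create namespace "$NS"
}
```

Failing closed when the namespace already exists avoids importing forgejo-db.sql a second time into a populated database.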

Cleanup

After evidence capture, delete the drill namespace:

kubectl delete namespace forgejo-restore-drill --wait=true

Production Forgejo (forgejo namespace) and Gitea remain unchanged.

References

  • infra/forgejo-restore-drill/forgejo-db-restore-cluster.yaml
  • infra/forgejo-restore-drill/restore-job.yaml
  • tools/forgejo-restore-drill.sh
  • workplans/RAIL-HO-WP-0005-forgejo-production-migration.md (T09)