New workplan to migrate from gitea to forgejo
483
workplans/RAIL-HO-WP-0005-forgejo-production-migration.md
Normal file
@@ -0,0 +1,483 @@
---
id: RAIL-HO-WP-0005
type: workplan
title: "Forgejo Production Migration on railiance01"
domain: railiance
repo: railiance-infra
status: active
owner: railiance
topic_slug: railiance
created: "2026-05-03"
updated: "2026-05-03"
state_hub_workstream_id: "84e17675-0d15-4268-a8bd-540124d37018"
---

# Forgejo Production Migration on railiance01

## Goal

Establish Forgejo as the production-grade source forge and package base for
Railiance, then migrate all repositories and workflows currently relying on
Gitea to the new Forgejo installation.

Forgejo will become the heart of Railiance infrastructure. The work must be
fully automated, backup-backed, recovery-drilled, and suitable for long-lived
operation on railiance01 before any production cutover happens.

## Placement in the Railiance Tooling Set

This workplan lives in `railiance-infra` because it is the cross-layer
production infrastructure coordination plan and belongs next to
`RAIL-HO-WP-0004-production-readiness.md`.

Implementation must respect the OAS repo boundaries:

| Concern | Repo | Layer |
|---------|------|-------|
| Server prerequisites, inventory, OS packages, SSH/system users | `railiance-infra` | S1 |
| k3s runtime prerequisites, namespaces, ingress class, cluster backup hooks | `railiance-cluster` | S2 |
| PostgreSQL, object storage, backup targets, registry storage dependencies | `railiance-platform` | S3 |
| Forgejo Actions runner templates, CI conventions, migration automation | `railiance-enablement` | S4 |
| Forgejo Helm release, app config, mail config, package registry, app backups | `railiance-apps` | S5 |

This file is the umbrella plan. If an implementation step requires files in a
different repo, that repo should receive its own workplan or task before the
change is made there.

## Key Decisions to Confirm

1. Public/private hostname for Forgejo and whether Gitea remains reachable
   during the transition.
2. Mail delivery path for password reset and account recovery
   (SMTP relay, sender domain, SPF/DKIM/DMARC expectations).
3. Package registry scope: container images only at first, or also generic,
   npm, PyPI, Go, Maven, and Helm packages.
4. Actions runner model: in-cluster ephemeral runners, long-lived runner pod,
   or isolated host runner.
5. Backup destination and retention target for database, repositories,
   attachments, LFS, Actions artifacts/logs, and package data.
6. Cutover mode: freeze-and-migrate all repos in one window, or staged
   project-by-project transition.

## Safety Contract

- Gitea remains the production source of truth until Forgejo restore and
  migration drills pass.
- No repository is deleted from Gitea during this workplan.
- A fresh Gitea backup must be taken before every migration drill and before
  final cutover.
- Forgejo backups must be restored into an isolated namespace before accepting
  production use.
- Password reset and email recovery must be verified with a real controlled
  account before onboarding users.
- Forgejo Actions may not receive broad cluster credentials by default; runner
  permissions must be least-privilege and repo-scoped where practical.
- Secrets stay in SOPS/age or Kubernetes Secrets managed by the appropriate
  repo. No plaintext SMTP passwords, admin tokens, runner tokens, or registry
  credentials in Git.

## Probe Strategy

A `forgejo-railiance-probe` is reasonable and should be treated as a disposable
S5/S4 integration probe, not as the production install.

The probe should prove:

- Helm values and cnpg database wiring converge cleanly.
- Initial admin bootstrap is automated and repeatable.
- SMTP/password reset works end-to-end.
- Package registry endpoints work for the package types Railiance needs first.
- Forgejo Actions can run a minimal workflow and publish a test package.
- Backup and restore work in an isolated namespace.
- Migration from a sample Gitea repo preserves git history, issues, releases,
  wiki, and LFS or attachments where applicable.

The probe is destroyed or explicitly archived after production Forgejo is live.

## Target Architecture

```
operator / agents / developers
  -> private HTTPS endpoint
    -> railiance01 ingress
      -> forgejo Service in forgejo namespace
        -> Forgejo Deployment/StatefulSet
          -> forgejo-db CloudNative PG Cluster in databases namespace
          -> Valkey/cache if required
          -> persistent storage for repositories, attachments, LFS, packages
          -> Actions runner(s) with restricted execution scope
          -> backup jobs to the approved backup target
```

## Tasks

### T01 — Inventory current Gitea functionality and migration requirements

```task
id: RAIL-HO-WP-0005-T01
status: todo
priority: high
state_hub_task_id: "cf59d171-5629-45c9-9d44-8d6499827ffc"
```

Create a source-of-truth inventory of current Gitea usage.

Minimum inventory:

- All repositories in the `coulomb` organization.
- Registered vs unregistered State Hub repos.
- Users, organizations, teams, deploy keys, SSH keys, access tokens.
- Issues, labels, milestones, releases, wiki, packages, LFS, attachments.
- Existing webhook usage and automation assumptions.
- Current Gitea package registry status and the missing `[packages]` config
  that is blocking container image publication.

**Done when:** the inventory identifies every feature that must work in
Forgejo before cutover and classifies each migration item as automatic,
manual, unsupported, or explicitly out of scope.
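
The classification requirement can be checked mechanically once the inventory exists in a machine-readable form. The following is a minimal sketch, assuming the inventory is kept as a simple mapping from item name to class; the mapping shape, the `out_of_scope` spelling, and the sample entries are illustrative placeholders, not findings from the real inventory.

```python
from collections import Counter

# The four classes named in the "Done when" criterion above.
ALLOWED_CLASSES = {"automatic", "manual", "unsupported", "out_of_scope"}


def unclassified_items(inventory):
    """Return names of items whose classification is missing or not allowed."""
    return [name for name, cls in inventory.items() if cls not in ALLOWED_CLASSES]


def summarize(inventory):
    """Count the validly classified items per class."""
    return Counter(cls for cls in inventory.values() if cls in ALLOWED_CLASSES)


# Hypothetical sample data: the classes below are placeholders for illustration.
sample_inventory = {
    "git history": "automatic",
    "issues": "automatic",
    "deploy keys": "manual",
    "webhooks": None,  # not classified yet, so T01 is not done
}

print(unclassified_items(sample_inventory))
print(dict(summarize(sample_inventory)))
```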

---

### T02 — Resolve Forgejo production design decisions

```task
id: RAIL-HO-WP-0005-T02
status: todo
priority: high
needs_human: true
state_hub_task_id: "f88115bf-4f99-49ef-a415-0b23750141b3"
```

Decide the production choices listed in "Key Decisions to Confirm".

Expected output:

- A short decision record in this workplan or a dedicated ADR.
- Hostname and exposure model.
- SMTP provider and sender identity.
- Package registry scope.
- Actions runner isolation model.
- Backup target, retention, encryption, and restore cadence.
- Cutover strategy and rollback window.

**Done when:** implementation tasks are no longer blocked by open production
choices.

---

### T03 — Build forgejo-railiance-probe

```task
id: RAIL-HO-WP-0005-T03
status: todo
priority: high
state_hub_task_id: "b516018a-415e-4a58-8c62-07c14ece9353"
```

Create a disposable probe environment for Forgejo before touching production.

Expected repo ownership:

- `railiance-platform`: probe cnpg database and storage dependencies.
- `railiance-apps`: probe Forgejo Helm values and namespace.
- `railiance-enablement`: probe Actions runner template and workflows.

Probe acceptance:

- `make forgejo-probe-deploy` or equivalent converges from a clean cluster
  state.
- Admin bootstrap is automated.
- A test user can reset a password via email.
- A test repository can be created, cloned, pushed, and protected.
- A test package can be published and pulled.
- A test Forgejo Actions workflow runs successfully.
- A probe backup restores into an isolated namespace.

**Done when:** the probe demonstrates the whole lifecycle without manual
cluster surgery.

---

### T04 — Define Forgejo platform services

```task
id: RAIL-HO-WP-0005-T04
status: todo
priority: high
state_hub_task_id: "28b351fe-bfbe-4a8b-bbfa-1b148e69f8e0"
```

In `railiance-platform`, define production platform services for Forgejo.

Minimum scope:

- `forgejo-db` CloudNative PG cluster.
- Database credentials via SOPS-managed Secret or approved secret flow.
- Backup configuration for database base backups and WAL archiving.
- Object storage or persistent volume plan for repositories, attachments, LFS,
  packages, Actions artifacts, and logs.
- Restore runbook for database and blob/package data.

**Done when:** platform dependencies can be deployed and restored without the
Forgejo app running.

---

### T05 — Define production Forgejo application deployment

```task
id: RAIL-HO-WP-0005-T05
status: todo
priority: high
state_hub_task_id: "11540ba4-d31c-4f64-836b-c6de69107aa4"
```

In `railiance-apps`, create the production Forgejo deployment.

Minimum scope:

- Forgejo Helm release or manifests in the S5 boundary.
- App configuration for database, SSH, HTTPS, mailer, packages, LFS, and
  security settings.
- Initial admin/user bootstrap that is automated but does not commit secrets.
- Health/status targets in the Makefile.
- Migration-safe configuration for coexistence with Gitea during the cutover.

**Done when:** Forgejo runs on railiance01 against production platform
services and can serve login, git clone/push, package registry, and admin
operations.

---

### T06 — Implement usable email recovery cycle

```task
id: RAIL-HO-WP-0005-T06
status: todo
priority: high
needs_human: true
state_hub_task_id: "417faa4d-eab8-4247-9485-4f80e5d5b7ff"
```

Configure and test mail delivery for account recovery.

Minimum scope:

- SMTP credentials stored through the approved secret path.
- Sender address and domain alignment documented.
- Password reset email works for a controlled non-admin account.
- Account recovery runbook covers lost password, lost MFA, disabled account,
  and emergency admin access.
- Mail failure is observable through logs or a health check.

**Done when:** a user can complete password recovery without operator database
edits, and the operator has a documented emergency path.

---

### T07 — Enable and harden package registry base

```task
id: RAIL-HO-WP-0005-T07
status: todo
priority: high
state_hub_task_id: "9578f672-e2b8-43a3-8419-5f86f8871326"
```

Enable Forgejo packages for Railiance's near-term build and deployment needs.

Initial package types:

- Container registry for State Hub and future app images.
- Generic packages for release artifacts.
- Additional package types only after the inventory proves they are needed.

Acceptance:

- Authenticated push and pull work from operator workstation and railiance01.
- Container image pull works from k3s deployments.
- Retention and cleanup expectations are documented.
- Package data is included in backup and restore drills.

**Done when:** `state-hub` or a probe image can be published to Forgejo and
pulled by railiance01.

---

### T08 — Enable Forgejo Actions

```task
id: RAIL-HO-WP-0005-T08
status: todo
priority: high
state_hub_task_id: "f45f98c9-2f02-4224-bbfd-c2e1ec38581e"
```

Enable Forgejo Actions with a least-privilege runner model.

Minimum scope:

- Runner registration automated without committing runner tokens.
- Runner isolation model documented.
- Minimal workflows for lint/test/build on representative repositories.
- Workflow to build and publish a probe container image to Forgejo packages.
- Secret handling policy for Actions.
- Resource limits to avoid repeating previous single-node overload patterns.

**Done when:** a representative repository can run Forgejo Actions and publish
a test artifact without privileged cluster-wide credentials.

---

### T09 — Implement Forgejo backup and restore automation

```task
id: RAIL-HO-WP-0005-T09
status: todo
priority: high
state_hub_task_id: "25892007-36ca-4bd9-8adf-84d505465d7d"
```

Create backup automation for all Forgejo state.

Must cover:

- PostgreSQL database.
- Git repositories.
- Attachments.
- LFS.
- Packages.
- Avatars and app data.
- Actions logs/artifacts if retained.
- App configuration required for restore.

Acceptance:

- Scheduled backups run without manual intervention.
- Backups are encrypted or stored in an approved protected target.
- Restore into an isolated namespace is drilled and documented.
- RPO/RTO expectations are recorded.

**Done when:** a fresh backup restores to a working isolated Forgejo instance
with repository, package, and user recovery checks passing.
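
One way to make the coverage list and the RPO expectation observable is a small freshness check that flags any state component whose newest backup is missing or too old. This is only a sketch under stated assumptions: the component keys, the 24-hour RPO, and the timestamps are hypothetical, and how the timestamps are collected from the real backup target is left open.

```python
from datetime import datetime, timedelta, timezone

# State components taken from the "Must cover" list above.
# "actions_logs_artifacts" only applies if Actions data is retained.
COMPONENTS = [
    "database", "repositories", "attachments", "lfs",
    "packages", "avatars_app_data", "actions_logs_artifacts", "app_config",
]


def stale_components(last_backup_at, rpo, now):
    """Return components whose newest backup is missing or older than the RPO."""
    stale = []
    for name in COMPONENTS:
        taken = last_backup_at.get(name)
        if taken is None or now - taken > rpo:
            stale.append(name)
    return stale


# Hypothetical values for illustration only.
now = datetime(2026, 5, 3, 12, 0, tzinfo=timezone.utc)
rpo = timedelta(hours=24)
last_backup_at = {name: now - timedelta(hours=2) for name in COMPONENTS}
last_backup_at["packages"] = now - timedelta(hours=30)  # older than the RPO
del last_backup_at["lfs"]                               # never backed up

print(stale_components(last_backup_at, rpo, now))
```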

---

### T10 — Drill Gitea to Forgejo migration

```task
id: RAIL-HO-WP-0005-T10
status: todo
priority: high
state_hub_task_id: "6befde73-00bc-4643-be0b-a7ce7944e75f"
```

Run a non-production migration drill from Gitea to Forgejo.

Minimum checks:

- Git history and default branches preserved.
- Issues, labels, milestones, releases, wiki, and attachments handled per
  inventory classification.
- SSH/HTTPS clone and push paths work.
- Existing local remotes can be transformed predictably.
- State Hub registered repo remotes can be updated safely.
- Rollback plan is rehearsed.

**Done when:** a sample migration has a written result matrix and no unknown
critical migration gaps remain.
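
The "transformed predictably" check lends itself to a pure function that can be unit-tested before it touches any working copy. The sketch below assumes only the hostname changes and that organization and repository paths stay the same; both hostnames and the sample repository path are hypothetical placeholders, since the real Forgejo hostname is still an open decision.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "gitea.example.internal"    # hypothetical Gitea hostname
NEW_HOST = "forgejo.example.internal"  # hypothetical Forgejo hostname


def rewrite_remote(url, old_host=OLD_HOST, new_host=NEW_HOST):
    """Swap the host of a git remote URL; leave unrelated remotes untouched."""
    if "://" not in url:
        # scp-like SSH syntax: user@host:org/repo.git
        user_host, sep, path = url.partition(":")
        user, at, host = user_host.rpartition("@")
        if sep and host.lower() == old_host:
            return f"{user}{at}{new_host}:{path}"
        return url
    parts = urlsplit(url)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    host, colon, port = hostport.partition(":")
    if host.lower() != old_host:
        return url
    # Keep user info and port, replace only the host.
    return urlunsplit(parts._replace(netloc=f"{userinfo}{at}{new_host}{colon}{port}"))


print(rewrite_remote("git@gitea.example.internal:coulomb/example-repo.git"))
print(rewrite_remote("https://gitea.example.internal/coulomb/example-repo.git"))
```

Because the function returns unrelated URLs unchanged, running it twice gives the same result, which keeps a rollback to the old host a matter of swapping the two arguments.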

---

### T11 — Production cutover from Gitea to Forgejo

```task
id: RAIL-HO-WP-0005-T11
status: todo
priority: high
needs_human: true
state_hub_task_id: "b1b66687-ca33-4971-b312-743c8e059c5e"
```

Execute the production migration only after the probe, backup restore, package
registry, email recovery, and Actions gates pass.

Cutover sequence:

1. Announce freeze window.
2. Take final Gitea backup and verify it exists.
3. Freeze Gitea writes.
4. Migrate repositories and metadata to Forgejo.
5. Validate critical repositories and package pulls.
6. Update State Hub repo remotes and host paths as needed.
7. Update local and railiance01 remotes.
8. Keep Gitea read-only as rollback until the stabilization window passes.

**Done when:** all Railiance/Custodian repos use Forgejo as primary, Gitea is
read-only fallback, and rollback instructions are documented.

---

### T12 — Retire or archive legacy Gitea

```task
id: RAIL-HO-WP-0005-T12
status: todo
priority: medium
needs_human: true
state_hub_task_id: "a63147b0-31d5-4705-89ea-40c10faf779f"
```

Retire legacy Gitea only after a stabilization period and explicit approval.

Minimum scope:

- Confirm no active remotes, webhooks, packages, or dashboards depend on Gitea.
- Preserve final Gitea backup.
- Update runbooks and dashboards from Gitea to Forgejo.
- Remove or archive Gitea Helm release according to the rollback decision.
- Close stale State Hub references to `railiance-bootstrap` if confirmed as
  an alias rather than a real repo.

**Done when:** Forgejo is the only active source forge and package base, with
legacy Gitea either archived or intentionally retained as documented fallback.

## Phasing and Dependencies

```
T01 inventory ─┬─► T02 decisions ─┬─► T03 probe ─┬─► T04 platform
               │                  │              ├─► T05 app
               │                  │              ├─► T06 mail recovery
               │                  │              ├─► T07 packages
               │                  │              ├─► T08 actions
               │                  │              └─► T09 backups
               └────────────────────────────────────► T10 migration drill

T03-T10 all pass ─► T11 production cutover ─► T12 legacy Gitea retirement
```

Recommended first slice: T01, T02, T03. Do not start T11 until T06, T07, T08,
T09, and T10 are complete.
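
The same dependencies can be written down as data so that tooling can refuse to start a task whose prerequisites are open. This is a sketch of one reading of the diagram above, in which T10 hangs directly off T01 and T11 waits for all of T03 through T10; the `DEPENDS_ON` mapping and the helper names are illustrative, not an existing State Hub feature.

```python
from graphlib import TopologicalSorter

# task -> prerequisites, transcribed from the phasing diagram (one reading).
DEPENDS_ON = {
    "T01": set(),
    "T02": {"T01"},
    "T03": {"T02"},
    "T04": {"T03"},
    "T05": {"T03"},
    "T06": {"T03"},
    "T07": {"T03"},
    "T08": {"T03"},
    "T09": {"T03"},
    "T10": {"T01"},
    "T11": {"T03", "T04", "T05", "T06", "T07", "T08", "T09", "T10"},
    "T12": {"T11"},
}


def can_start(task, done):
    """True when every prerequisite of `task` is in the set of finished tasks."""
    return DEPENDS_ON[task] <= set(done)


def ready_tasks(done):
    """Tasks that are not finished yet and whose prerequisites are all finished."""
    done = set(done)
    return sorted(t for t in DEPENDS_ON if t not in done and can_start(t, done))


# One valid execution order; raises graphlib.CycleError if the plan has a cycle.
order = list(TopologicalSorter(DEPENDS_ON).static_order())

print(ready_tasks({"T01"}))
print(can_start("T11", {"T01", "T02", "T03"}))
```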

## railiance-bootstrap Note

State Hub currently registers both `railiance-bootstrap` and
`railiance-cluster`, but they point to the same local path
(`/home/worsch/railiance-cluster`) and the same git fingerprint. The
`railiance-bootstrap` entry has no remote URL. The earlier restructure workplan
(`RAIL-HO-WP-0003-T03`) says `railiance-bootstrap` was renamed to
`railiance-cluster`.

Working assumption: `railiance-bootstrap` is a stale logical alias or leftover
repo goal, not a separate Gitea repository. This workplan should not create a
new Forgejo repository named `railiance-bootstrap` unless a concrete remaining
purpose is identified.

## References

- `RAIL-HO-WP-0004-production-readiness.md`
- `RAIL-HO-WP-0003-5repo-stack-restructure.md`
- `CUST-WP-0014-repo-sync-automation.md`
- `CUST-WP-0021-multi-host-repo-paths.md`
- `ops/incidents/2026-03-25-gitea-pgpool-crashloop.md`
- `ops/incidents/2026-03-26-coulombcore-runaway-agent-overload.md`