Record ADR-004 in-cluster Forgejo runner decision for T04
Updates forgejo-production-decisions and CUST-WP-0054-T04 partial progress.
61
docs/forgejo-production-decisions.md
Normal file
@@ -0,0 +1,61 @@
# Forgejo Production Decisions (Wave 1)

Date: 2026-07-03
Workplans: `RAIL-HO-WP-0005-T02`, `CUST-WP-0054-T04`
Operator input: 2026-07-03

## Decision log

| # | Topic | Decision | Status | Evidence (2026-07-03) |
| --- | --- | --- | --- | --- |
| 1 | **Production hostname** | `forgejo.coulomb.social` | **decided** | DNS A → `92.205.62.239` (railiance01); HTTPS reaches Traefik on railiance01 |
| 2 | Exposure model | Private HTTPS via railiance01 Traefik ingress + cert-manager `letsencrypt-prod` | **decided** | Same pattern as Gitea (`manifests/forgejo-ingress.yaml` in `railiance-apps`) |
| 2b | Deployment pattern | `gitea-charts/gitea` **12.5.0** Helm + Forgejo image; CNPG `forgejo-db` in `railiance-platform`; Makefile in `railiance-apps` | **decided** | Chart 12.6+ requires Gitea 1.26 `config edit-ini` (incompatible with Forgejo 11); see `railiance-apps/docs/forgejo-on-railiance01.md` |
| 2c | Live deploy | Forgejo pod + ingress + TLS on railiance01 | **done** (2026-07-03) | `make forgejo-smoke` → HTTP 200 + OCI `/v2/` 401 challenge; cert `forgejo-tls` Ready |
| 3 | Gitea during transition | `gitea.coulomb.social` on coulombcore remains canonical **until** Forgejo restore/migration drills pass; then read-only mirror | unchanged | Per `RAIL-HO-WP-0005` safety contract |
| 4 | SMTP / password reset | TBD | open | — |
| 5 | Package registry scope | TBD (container images first assumed) | open | — |
| 6 | Actions runner model | **In-cluster** on railiance01: `forgejo-runner` Deployment + DinD (`railiance01-build-01`) | **decided** | `railiance-infra/docs/adr/ADR-004-forgejo-in-cluster-actions-runner.md`; manifests in `railiance-apps/manifests/forgejo-runner.yaml` |
| 7 | Backup target + retention | TBD | open | — |
| 8 | Cutover mode | TBD (staged per-repo vs freeze-all) | open | — |

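Decision 2b pins the chart at 12.5.0 because 12.6+ is incompatible with Forgejo 11. A deploy script could enforce that ceiling before it ever calls Helm. The sketch below is one possible guard, assuming a `sort` that supports `-V` (GNU coreutils). The names `chart_is_compatible` and `CHART_CEILING` are hypothetical and are not taken from the `railiance-apps` Makefile.

```bash
# Hypothetical guard: succeed only for chart versions strictly below the ceiling.
# CHART_CEILING is the first chart release known to be incompatible (decision 2b).
CHART_CEILING="12.6.0"

chart_is_compatible() {
  local version="$1"
  # The ceiling itself is already incompatible.
  [ "$version" = "$CHART_CEILING" ] && return 1
  # sort -V orders version strings numerically per component,
  # so the first line of the sorted pair is the lower version.
  local lowest
  lowest="$(printf '%s\n%s\n' "$version" "$CHART_CEILING" | sort -V | head -n 1)"
  [ "$lowest" = "$version" ]
}
```

A caller might run `chart_is_compatible "$CHART_VERSION" || exit 1` before the Helm step, so an accidental chart bump fails fast instead of breaking the running pod.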
## Hostname decision detail

**Chosen hostname:** `https://forgejo.coulomb.social`

| Field | Value |
| --- | --- |
| DNS | `forgejo.coulomb.social` → `92.205.62.239` (railiance01) |
| Edge | railiance01 k3s Traefik (`kube-system/traefik` LoadBalancer) |
| Target machine | railiance01 (production home per `CUST-WP-0054`) |
| Canonical git remote (post-cutover) | `https://forgejo.coulomb.social/coulomb/<repo>.git` |
| OCI registry (post-cutover) | `forgejo.coulomb.social/coulomb/<image>` |

### Live probe (2026-07-03, post-deploy)

```bash
getent hosts forgejo.coulomb.social # 92.205.62.239
curl -fsS -o /dev/null -w '%{http_code}\n' https://forgejo.coulomb.social/ # 200
curl -sS -o /dev/null -D - https://forgejo.coulomb.social/v2/ | grep -i docker-distribution # registry/2.0
KUBECONFIG=~/.kube/config-hosteurope kubectl get pods,ingress,certificate -n forgejo
```

Forgejo is serving HTTPS with a valid Let's Encrypt certificate. Gitea on coulombcore
remains canonical for git remotes until migration drills pass.

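The decision log records `make forgejo-smoke` as passing on HTTP 200 for the web UI plus a 401 challenge on the OCI `/v2/` endpoint. The Makefile target itself is not shown here, so the following is only a sketch of what such a check could look like. The function names `expect_status`, `http_status`, and `smoke` are hypothetical.

```bash
# Hypothetical helper: compare an observed HTTP status with the expected one.
expect_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    printf 'ok   %s: %s\n' "$label" "$actual"
  else
    printf 'FAIL %s: expected %s, got %s\n' "$label" "$expected" "$actual" >&2
    return 1
  fi
}

# Print only the HTTP status code for a URL (curl prints 000 on connection failure).
http_status() {
  curl -sS -o /dev/null -w '%{http_code}' "$1"
}

# Run both probes; the base URL defaults to the chosen production hostname.
smoke() {
  local base="${1:-https://forgejo.coulomb.social}"
  expect_status "web UI" 200 "$(http_status "$base/")" &&
    expect_status "OCI /v2/ challenge" 401 "$(http_status "$base/v2/")"
}
```

Calling `smoke` with no arguments probes the production hostname. Note that no `-f` flag is used, because the 401 on `/v2/` is the expected unauthenticated registry response rather than a failure.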
### Implications for CUST-WP-0054

- Wave 1 can proceed with a fixed hostname for overlays, ingress manifests, and
  CI `IMAGE_REPOSITORY` variables.
- State Hub / sweep checkouts on railiance01 (T05) should clone from
  `forgejo.coulomb.social` once cutover completes.
- Remaining T02 items (SMTP, backup, cutover mode) still block
  production cutover and `RAIL-HO-WP-0005-T11`.

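With the hostname fixed, the post-cutover git remote and image repository follow mechanically from the patterns in the hostname table. The helpers below are a small sketch of that derivation. The function names are hypothetical, `state-hub` is used only because it is one of the images named in `CUST-WP-0054-T04`, and the `refs/tags/` prefix assumes the CI system exposes a git-style tag ref.

```bash
FORGEJO_HOST="forgejo.coulomb.social"
FORGEJO_ORG="coulomb"

# https://forgejo.coulomb.social/coulomb/<repo>.git
git_remote_url() {
  printf 'https://%s/%s/%s.git\n' "$FORGEJO_HOST" "$FORGEJO_ORG" "$1"
}

# forgejo.coulomb.social/coulomb/<image>, suitable for an IMAGE_REPOSITORY variable
image_repository() {
  printf '%s/%s/%s\n' "$FORGEJO_HOST" "$FORGEJO_ORG" "$1"
}

# Turn a tag ref such as refs/tags/v1.2.3 into <repository>:v1.2.3
image_ref_for_tag() {
  local image="$1" ref="$2"
  printf '%s:%s\n' "$(image_repository "$image")" "${ref#refs/tags/}"
}
```

Once cutover completes, a checkout could be repointed with `git remote set-url origin "$(git_remote_url state-hub)"`. Until then, Gitea on coulombcore stays the canonical remote.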
## Open decisions (need operator input)

1. SMTP provider, sender address, and SPF/DKIM alignment for `@coulomb.social`
2. Package types beyond OCI at launch (npm, PyPI, Helm, …)
3. Actions runner: decided as in-cluster on railiance01 (decision 6, ADR-004); no operator input pending
4. Backup destination and restore cadence
5. Cutover: staged project-by-project vs single freeze window
@@ -107,7 +107,7 @@ production dependency (likely identity/OpenBao) has moved.
 
 ```task
 id: CUST-WP-0054-T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "67b91b18-9ad0-4917-990a-056a7007a2d4"
 ```
@@ -124,7 +124,7 @@ host and target host. Done when every row has a target and a migration owner
 
 ```task
 id: CUST-WP-0054-T02
-status: todo
+status: done
 priority: high
 state_hub_task_id: "4f2ae1f1-f9ad-44bb-bae7-151030634f56"
 ```
@@ -148,7 +148,7 @@ emission working (partial T10 rehearsal).
 
 ```task
 id: CUST-WP-0054-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "70a25fbd-71d7-4d74-a04b-30e775984feb"
 ```
@@ -165,7 +165,7 @@ authenticates through them).
 
 ```task
 id: CUST-WP-0054-T04
-status: todo
+status: progress
 priority: high
 state_hub_task_id: "79b9ee4d-f792-434c-a2ea-2fe216a948ca"
 ```
@@ -174,8 +174,11 @@ Execute/absorb `RAIL-HO-WP-0005`: Forgejo production on railiance01 becomes
 the canonical remote for all repos; coulombcore Gitea becomes a read-only
 mirror until decommission. Stand up Actions runners so container images
 (state-hub, core-hub, issue-core, activity-core) build and push in CI from
-tags — the workstation stops being the build/publish host. Done when a
-release ships with the workstation off.
+tags — the workstation stops being the build/publish host.
+
+**Partial (2026-07-03):** ADR-004 in-cluster runner (`railiance01-build-01` +
+DinD) replaces interim coulombcore host runner. Remaining: image-build workflow
+on runner, repo migration, release with workstation off.
 
 ## Task: State Hub production home on railiance01
 