diff --git a/deploy/railiance/RUNBOOK.md b/deploy/railiance/RUNBOOK.md
index 9b6fe41..1e59c1e 100644
--- a/deploy/railiance/RUNBOOK.md
+++ b/deploy/railiance/RUNBOOK.md
@@ -1,8 +1,11 @@
-# inter-hub on Railiance01 — Runbook
+# inter-hub Production Deploy Runbook
 
 ## Architecture
 
-- **Cluster:** Railiance01 (K3s, 92.205.62.239)
+- **Deployment cluster:** COULOMBCORE K3s (`92.205.130.254`) as observed from
+  the haskelseed runner kube context on 2026-06-14.
+- **Stale public DNS host:** `hub.coulomb.social` still resolved to
+  `92.205.62.239` on 2026-06-14, which served the older API surface.
 - **Namespace:** `inter-hub`
 - **Image registry:** `gitea.coulomb.social/coulomb/inter-hub:`
 - **Database:** CloudNativePG cluster `net-kingdom-pg` in `databases` namespace
@@ -14,6 +17,34 @@
   `railiance-apps/charts/inter-hub` with values from
   `railiance-apps/helm/inter-hub-values.yaml`
 
+## Public DNS Gate
+
+The app deployment can be healthy while public smoke tests still fail if DNS
+points `hub.coulomb.social` at the stale host. On 2026-06-14:
+
+- Kubernetes reported image `gitea.coulomb.social/coulomb/inter-hub:6455902`
+  ready in namespace `inter-hub` on node `92.205.130.254`.
+- An in-cluster probe to `http://inter-hub:8000/api/v2/hubs` returned `401`.
+- Forcing public TLS to the cluster ingress also returned `401`:
+  `curl --resolve hub.coulomb.social:443:92.205.130.254 https://hub.coulomb.social/api/v2/hubs`.
+- Normal DNS resolved `hub.coulomb.social` to `92.205.62.239`, where
+  `/api/v2/hubs` returned `404` and OpenAPI lacked the bootstrap paths.
+
+Before treating a deploy as failed, compare DNS and forced-ingress probes:
+
+```bash
+getent ahosts hub.coulomb.social
+curl -s -o /dev/null -w "%{http_code}" https://hub.coulomb.social/api/v2/hubs
+curl --resolve hub.coulomb.social:443:92.205.130.254 \
+  -s -o /dev/null -w "%{http_code}" \
+  https://hub.coulomb.social/api/v2/hubs
+```
+
+The public bootstrap gate passes when the DNS A record for
+`hub.coulomb.social` points at the active ingress IP (`92.205.130.254`) or the
+workflow kubeconfig is intentionally rotated to deploy to the cluster behind the
+current DNS target.
+
 ## Deployment
 
 Normal deployment is handled by Gitea Actions on push to `main`:
@@ -179,9 +210,11 @@ To rotate the database password:
 ## Smoke Test
 
 ```bash
+getent ahosts hub.coulomb.social # expected: 92.205.130.254
 curl -fsS https://hub.coulomb.social/ | grep "inter-hub"
 curl -fsS https://hub.coulomb.social/api/v2/openapi.json >/dev/null
 curl -s -o /dev/null -w "%{http_code}" https://hub.coulomb.social/api/v2/widgets | grep 401
+curl -s -o /dev/null -w "%{http_code}" https://hub.coulomb.social/api/v2/hubs | grep 401
 ```
 
 ## Database Connection Check
diff --git a/workplans/IHUB-WP-0018-railiance01-deployment.md b/workplans/IHUB-WP-0018-railiance01-deployment.md
index db77d91..2547901 100644
--- a/workplans/IHUB-WP-0018-railiance01-deployment.md
+++ b/workplans/IHUB-WP-0018-railiance01-deployment.md
@@ -399,6 +399,14 @@ expected unauthenticated `401` and OpenAPI exposes `/hubs`, directly protects
 the ops-hub bootstrap gate instead of only checking the landing page and generic
 widget auth gate.
 
+**Authenticated inspection note (2026-06-14):** The stored local Tea token is
+stale for `https://gitea.coulomb.social`, but runner-side inspection succeeded.
+`make runner-status` in `railiance-forge` showed `act_runner` registered to
+`https://gitea.coulomb.social`, started under OpenRC, and carrying the expected
+`self-hosted`/`haskelseed` labels. The runner log shows task `19` for
+`coulomb/inter-hub` starting at `2026-06-14T19:59:19+02:00`, matching the
+`6455902` deploy trigger.
+
 ### R8 — Staged deployment and smoke test
 
 ```task
@@ -435,6 +443,18 @@ Follow the Railiance staged promotion lifecycle:
 `/` returns 200 and contains `inter-hub`, `/api/v2/openapi.json` returns 200,
 and unauthenticated `/api/v2/widgets` returns 401.
 
+**DNS gate finding (2026-06-14):** The deployment workflow did publish and
+deploy `gitea.coulomb.social/coulomb/inter-hub:6455902`; Kubernetes reports the
+`inter-hub` Deployment ready on the COULOMBCORE K3s node
+`92.205.130.254`. An in-cluster probe to
+`http://inter-hub:8000/api/v2/hubs` returned the expected unauthenticated
+`401`, and forcing public TLS to `92.205.130.254` also returned `401`. The
+public DNS record for `hub.coulomb.social`, however, resolves to
+`92.205.62.239`, where `/api/v2/hubs` still returns `404` and OpenAPI lacks the
+bootstrap paths. The remaining production gate is therefore DNS cutover (or an
+intentional kubeconfig rotation to the cluster behind `92.205.62.239`), not a
+runner, build, registry, Helm, or image-content issue.
+
 ### R9 — Document and register
 
 ```task