From 1ba64dd00f44472d50c01b7a18167af02db5288c Mon Sep 17 00:00:00 2001
From: tegwick
Date: Sun, 14 Jun 2026 21:47:03 +0200
Subject: [PATCH] docs(deploy): record production gate recovery

---
 deploy/railiance/RUNBOOK.md                   | 36 +++++++++++++++----
 .../IHUB-WP-0018-railiance01-deployment.md    | 22 ++++++++++++
 2 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/deploy/railiance/RUNBOOK.md b/deploy/railiance/RUNBOOK.md
index 1e59c1e..79e23c4 100644
--- a/deploy/railiance/RUNBOOK.md
+++ b/deploy/railiance/RUNBOOK.md
@@ -158,19 +158,43 @@ evidence.

 ## Database Migration

-IHP migrations can be run from the production image when needed. Because the
-image is Nix-built and may not contain a shell, first inspect the binary path:
+The current Nix production image is intentionally minimal: image metadata for
+`6455902` points at
+`/nix/store/-inter-hub/bin/RunProdServer`, and the package contains only
+`RunProdServer` and `RunJobs`. It has no shell and no packaged migration
+runner, so schema work is performed through the CloudNativePG pod.

+Check schema state:
 ```bash
-kubectl exec -n inter-hub deploy/inter-hub -- find /nix/store -path '*inter-hub*/bin/RunProdServer'
-kubectl exec -n inter-hub deploy/inter-hub -- /nix/store/-inter-hub/bin/RunProdServer migrate
+kubectl exec -n databases net-kingdom-pg-1 -- \
+  psql -d interhub -Atc "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public';"
 ```

-To check migration status:
+Initialize a blank production database from the canonical schema:
 ```bash
-kubectl exec -n databases net-kingdom-pg-1 -- psql -U postgres interhub -c "\dt"
+kubectl exec -i -n databases net-kingdom-pg-1 -- \
+  psql -d interhub -v ON_ERROR_STOP=1 -1 -f - < Application/Schema.sql
+
+kubectl exec -i -n databases net-kingdom-pg-1 -- \
+  psql -d interhub -v ON_ERROR_STOP=1 -1 -f - < Application/Migration/1744502400-seed-type-registries.sql
+
+kubectl exec -i -n databases net-kingdom-pg-1 -- psql -d interhub -v ON_ERROR_STOP=1 -1 -f - <<'SQL'
+GRANT USAGE ON SCHEMA public TO interhub;
+GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO interhub;
+GRANT USAGE, SELECT, UPDATE ON ALL SEQUENCES IN SCHEMA public TO interhub;
+GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO interhub;
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO interhub;
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO interhub;
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT EXECUTE ON FUNCTIONS TO interhub;
+SQL
+
+kubectl rollout restart deployment/inter-hub -n inter-hub
+kubectl rollout status deployment/inter-hub -n inter-hub
 ```

+Do not apply `1744416000-seed-admin-user.sql` unattended in production; it uses
+a documented default password intended for initial local deployment only.
+
 ## Logs

 ```bash
diff --git a/workplans/IHUB-WP-0018-railiance01-deployment.md b/workplans/IHUB-WP-0018-railiance01-deployment.md
index 2547901..be1be18 100644
--- a/workplans/IHUB-WP-0018-railiance01-deployment.md
+++ b/workplans/IHUB-WP-0018-railiance01-deployment.md
@@ -213,6 +213,16 @@ the Railiance PostgreSQL cluster: role `interhub`, database `interhub`, schema
 ownership, and privileges were created/updated. The running deployment now
 uses that database through the `inter-hub-env` Kubernetes Secret.

+**Production initialization note (2026-06-14):** After DNS/TLS and network
+access were restored, production OpenAPI still failed because the `interhub`
+database was blank (`public_table_count:0`). The IHP production image only
+contains `RunProdServer` and `RunJobs`, so there was no packaged migration
+runner to execute. Initialized the database through the CloudNativePG pod by
+loading `Application/Schema.sql` in one transaction, applying the idempotent
+type-registry seed migration `1744502400`, and granting app privileges on the
+new schema to the `interhub` role. The default admin seed with a known password
+was intentionally not applied to production.
+
 ### R5 — SOPS-encrypted secrets

 ```task
@@ -455,6 +465,18 @@ bootstrap paths. The remaining production gate is therefore DNS cutover (or an
 intentional kubeconfig rotation to the cluster behind `92.205.62.239`), not a
 runner, build, registry, Helm, or image-content issue.

+**Production gate completion note (2026-06-14):** DNS for
+`hub.coulomb.social` now resolves to `92.205.130.254`, cert-manager issued a
+Let's Encrypt certificate for the host, and the app deployment is serving image
+`gitea.coulomb.social/coulomb/inter-hub:6455902`. The final blockers were
+database ingress from `inter-hub` to `net-kingdom-pg` and the blank production
+schema. Added/applied the platform NetworkPolicy, initialized the `interhub`
+schema and framework type registries, granted privileges to the app role, and
+restarted the deployment. The ops-hub gate probe now passes:
+`/api/v2/hubs` returns the expected unauthenticated `401`,
+`/api/v2/openapi.json` returns `200`, and OpenAPI exposes `/hubs`,
+`/hub-capability-manifests`, `/api-consumers`, and `/policy-scopes`.
+
 ### R9 — Document and register

 ```task