---
id: IHUB-WP-0018
type: workplan
title: "Railiance01 Deployment — Production Operations Scaffold"
domain: inter_hub
repo: inter-hub
status: open
owner: custodian
topic_slug: inter_hub
created: "2026-04-29"
updated: "2026-04-29"
depends_on: IHUB-WP-0015
state_hub_workstream_id: "080d841a-3acd-4adf-b684-2d1890a5e986"
---

# IHUB-WP-0018 — Railiance01 Deployment: Production Operations Scaffold

## Goal

Deploy inter-hub to the Railiance01 Kubernetes cluster with fully automatic deployment, SOPS-encrypted secrets, Traefik ingress, PostgreSQL HA, and a Gitea Actions CI/CD pipeline. After this workplan, every push to `main` automatically builds an OCI container image on haskelseed, pushes it to the Railiance container registry, and deploys it, with automatic restart on node reboot guaranteed by K3s.

## Background

inter-hub v0.2.0-alpha.1 is running on haskelseed (Alpine) via RunDevServer and socat. That setup is a development convenience, not a production operations scaffold. The target is the Railiance01 K3s cluster, which has:

- K3s (single-node for now; ThreePhoenix HA cluster is in progress)
- Traefik ingress with TLS
- PostgreSQL HA (repmgr + pgpool) managed by railiance-platform
- SOPS/age secret management
- Gitea with built-in container registry (or separate registry service)
- Staged Promotion Lifecycle CLI (`railiance run / deploy / promote / rollback`)

**Key constraint:** This workplan depends on Railiance01 K3s being operational. Gate R3 verifies cluster readiness before any deployment work begins. If K3s or the container registry is not ready, this workplan blocks there and the cluster work must be completed first.

**IHP specifics:** IHP DevServer is a development server. For production we build the IHP binary via `nix build` (which produces a self-contained binary) and wrap it in a minimal OCI image using Nix's `dockerTools.buildLayeredImage`.
The app serves HTTP on port 8000; the socat workaround is not needed in Kubernetes since Traefik routes directly to the pod's port.

## Architecture

```
git push → Gitea Actions
  → SSH to haskelseed: nix build → docker load → docker push registry/inter-hub:$SHA
  → helm upgrade inter-hub railiance-apps/helm/inter-hub
      → Deployment (1 replica): inter-hub:$SHA + env from Secrets
      → Service (ClusterIP :8000)
      → Ingress (Traefik): hub.coulomb.social → Service
      → PersistentVolumeClaim: /app/static (generated CSS/JS)
      → PostgreSQL: database 'interhub' on railiance-platform HA cluster
```

## Tasks

### R1 — Add OCI image build to flake.nix

Add a `packages.docker` output to `flake.nix` using `pkgs.dockerTools.buildLayeredImage`. The image wraps the IHP production binary produced by `nix build .#default`.

```nix
packages.docker = pkgs.dockerTools.buildLayeredImage {
  name = "inter-hub";
  tag = "latest";
  contents = [ self.packages.${system}.default pkgs.cacert ];
  config = {
    Cmd = [ "/bin/inter-hub" ];
    ExposedPorts = { "8000/tcp" = {}; };
    Env = [ "PORT=8000" "IHP_ENV=Production" ];
  };
};
```

Test locally on haskelseed:

```bash
nix build .#docker
docker load < result
docker run --rm -p 8000:8000 -e DATABASE_URL=... -e IHP_SESSION_SECRET=... inter-hub:latest
```

**Note:** First build pulls the full Haskell binary closure (~2 GB); subsequent builds are incremental (layer caching). Build must run on haskelseed, the only machine with the Nix store populated for GHC 9.10.3.

### R2 — Verify container runs correctly

On haskelseed, run the container image against the existing `interhub` database.
Confirm:

- `curl http://localhost:8000/` returns 200 (LandingAction)
- `curl http://localhost:8000/api/v2/hubs` returns 401 (auth required)
- Static assets load (Tailwind CSS present in image)
- Container exits cleanly on SIGTERM

If Tailwind CSS output (`static/app.css`) is not bundled into the Nix binary closure, add a pre-build step: run tailwindcss and include `static/` in the image via `dockerTools.buildLayeredImage` `contents` or a NixOS module.

### R3 — Verify Railiance01 readiness (gate)

This is a dependency gate. Before proceeding, confirm:

```bash
# From CoulombCore (execution origin):
kubectl get nodes                               # must show Ready
kubectl get pods -n kube-system | grep traefik  # Traefik must be running
kubectl get pods -n railiance-platform          # PostgreSQL HA pods
```

Also confirm:

- Container registry is reachable from haskelseed (verify push access)
- Registry address (e.g., `registry.coulomb.social` or `gitea.coulomb.social`)
- SOPS/age key is present on CoulombCore at `~/.config/sops/age/keys.txt`

If any check fails, block here and open the relevant Railiance workstream. Do not proceed until all checks pass.

### R4 — Provision inter-hub database on railiance-platform

On the PostgreSQL HA cluster, create the inter-hub database and user:

```sql
CREATE USER interhub WITH PASSWORD '';
CREATE DATABASE interhub OWNER interhub;
GRANT ALL PRIVILEGES ON DATABASE interhub TO interhub;
```

Run schema migration (IHP migrations) as part of the first deployment via an init container or a manual `migrate` run inside the pod. Document the migration procedure in `deploy/railiance/RUNBOOK.md`.
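
A manual first-deployment migration of the kind described above could be scripted roughly as follows. This is a minimal sketch, not the project's actual procedure: `retry` and `run_migration` are hypothetical helper names, the `inter-hub` namespace and deployment name are taken from R7 and R8, and it assumes the image ships IHP's `migrate` binary on its `PATH` and that the pod already receives `DATABASE_URL` from its Secret. The retry loop is there only to tolerate pgpool not accepting connections on the first attempt.

```bash
#!/usr/bin/env bash
# Sketch only: retry helper plus a manual migration run for R4.

# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Assumption: the image contains a `migrate` binary on PATH and the pod
# has DATABASE_URL set; neither is confirmed by the workplan.
run_migration() {
  kubectl rollout status deployment/inter-hub -n inter-hub --timeout=120s &&
    retry 5 10 kubectl exec -n inter-hub deploy/inter-hub -- migrate
}

# Only act when explicitly asked, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
  run_migration
fi
```

Invoked with `--run` from CoulombCore, the script waits for the rollout and then attempts the migration up to five times, ten seconds apart. If an init container is chosen instead, only the `retry` helper would carry over.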
### R5 — SOPS-encrypted secrets

Create `deploy/railiance/secrets/inter-hub.env.sops.yaml` with:

```yaml
# sops encrypted; do not edit manually
DATABASE_URL: postgresql://interhub:@pgpool.railiance-platform.svc:5432/interhub
IHP_SESSION_SECRET: <64-char-hex>
IHP_BASEURL: https://hub.coulomb.social
```

Encrypt with the age key:

```bash
# Encrypt in place: redirecting sops output onto its own input file
# would truncate the file before sops reads it.
sops --encrypt --in-place \
  --age "$(grep public ~/.config/sops/age/keys.txt | awk '{print $4}')" \
  deploy/railiance/secrets/inter-hub.env.sops.yaml
```

Commit the encrypted file. The Gitea Actions workflow decrypts at deploy time using the age key from a Kubernetes Secret (bootstrapped once manually).

### R6 — Helm chart in railiance-apps

Create `helm/inter-hub/` in the `railiance-apps` repository following the Railiance app.toml contract. Minimal chart:

```
helm/inter-hub/
  Chart.yaml          name: inter-hub, version: 0.1.0
  values.yaml         image.tag, ingress.host, resources
  values.prod.yaml    replicas: 1, resources.requests.memory: 1Gi
  templates/
    deployment.yaml   envFrom: secretRef inter-hub-env
    service.yaml      ClusterIP :8000
    ingress.yaml      Traefik annotations, TLS
    secret.yaml       created by sops-operator or external-secrets
```

`app.toml` in the inter-hub repo root for railiance CLI integration:

```toml
[app]
name = "inter-hub"
slug = "inter-hub"
kind = "native"
registry = "registry.coulomb.social/coulomb/inter-hub"

[deploy]
chart = "railiance-apps/helm/inter-hub"
namespace = "inter-hub"
```

### R7 — Gitea Actions CI/CD pipeline

Create `.gitea/workflows/deploy.yaml` in the inter-hub repo:

```yaml
on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest   # or self-hosted if available
    env:
      # Expose the REGISTRY secret so $REGISTRY expands in the run steps.
      REGISTRY: ${{ secrets.REGISTRY }}
    steps:
      - uses: actions/checkout@v4
      # The SSH keys listed below must be installed on the runner before these steps.
      - name: Build OCI image on haskelseed
        run: |
          ssh haskelseed "cd /root/inter-hub && git pull && \
            nix build .#docker && \
            docker load < result && \
            docker tag inter-hub:latest $REGISTRY/inter-hub:${{ github.sha }} && \
            docker push $REGISTRY/inter-hub:${{ github.sha }}"
      - name: Deploy to Railiance01
        run: |
          ssh coulombcore "helm upgrade --install inter-hub \
            railiance-apps/helm/inter-hub \
            --namespace inter-hub --create-namespace \
            --set image.tag=${{ github.sha }} \
            -f railiance-apps/helm/inter-hub/values.prod.yaml"
```

Secrets in Gitea: `REGISTRY`, `SSH_KEY_HASKELSEED`, `SSH_KEY_COULOMBCORE`.

**Alternative if self-hosted runner is available on CoulombCore:** run the deploy step directly without the SSH hop to coulombcore.

### R8 — Staged deployment and smoke test

Follow the Railiance staged promotion lifecycle:

1. **Local verify** (done in R2: container runs correctly)
2. **Deploy to Railiance01:**
   ```bash
   railiance deploy inter-hub --tag
   ```
3. **Smoke test:**
   ```bash
   curl -s https://hub.coulomb.social/ | grep "Inter-Hub"   # Landing page
   curl -s https://hub.coulomb.social/capabilities          # Capabilities
   curl -H "Authorization: Bearer " \
     https://hub.coulomb.social/api/v2/hubs                 # API (200)
   curl https://hub.coulomb.social/api/v2/hubs              # Unauthenticated (401)
   ```
4. **Verify restart persistence:**
   ```bash
   kubectl rollout restart deployment/inter-hub -n inter-hub
   kubectl rollout status deployment/inter-hub -n inter-hub
   # Then re-run smoke test
   ```

### R9 — Document and register

- Write `deploy/railiance/RUNBOOK.md`: image build, migration procedure, secret rotation, rollback (`railiance rollback inter-hub`), log access (`kubectl logs -n inter-hub -l app=inter-hub --tail=100`)
- Add progress event to state hub
- Remove the haskelseed socat/OpenRC production role note from the quickstart; document haskelseed as the build machine only, not the production host

## Exit Criteria

- `https://hub.coulomb.social/` returns the Landing page (200, no auth)
- `/api/v2/hubs` returns 401 unauthenticated, 200 with valid API key
- All 12 IHF dashboards accessible after admin login
- `kubectl rollout restart` followed by smoke test passes (K3s restart persistence confirmed)
- Gitea Actions pipeline: push to `main` → image built → deployed → smoke test green within 15 minutes
- No
dependency on haskelseed being up for the app to *run* (only for builds)

## Open Questions / Pre-flight Checks

1. **K3s status**: ThreePhoenix HA cluster workstream is active but not complete. Confirm whether Railiance01 is a single-node cluster already accepting workloads or still being provisioned. Gate R3 is the go/no-go check.
2. **Container registry**: Is Gitea's built-in registry available on Railiance01, or is a separate registry service needed? If neither, add registry deployment to the scope.
3. **PostgreSQL HA status**: railiance-platform baseline workstream is active. Confirm whether the HA cluster (repmgr + pgpool) is operational before R4.
4. **Static asset bundling**: The Nix production binary may or may not include `static/app.css` (Tailwind output). Verify in R2 and adjust image build if needed.
5. **Anthropic API key**: Phase 5 AI-assisted distillation requires `IHP_ANTHROPIC_API_KEY`. Add to SOPS secrets if the feature is to be active on Railiance01.
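
The container registry question (item 2, and the registry check in R3) can be probed from haskelseed with a small script. The sketch below assumes the registry speaks the standard OCI/Docker registry HTTP API, whose `/v2/` base endpoint answers 200 for an open registry or 401 when credentials are required. `registry_status_ok` and `probe_registry` are hypothetical names, and the host passed in is whichever candidate address is being evaluated.

```bash
#!/usr/bin/env bash
# Sketch only: probe a container registry's /v2/ base endpoint.

# registry_status_ok HTTP_CODE
# 200 = open registry, 401 = registry is up but wants credentials.
registry_status_ok() {
  case "$1" in
    200|401) return 0 ;;
    *) return 1 ;;
  esac
}

# probe_registry HOST
# HOST is a candidate such as registry.coulomb.social or gitea.coulomb.social.
probe_registry() {
  local host="$1" code
  # curl prints 000 as the status code when the connection itself fails.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "https://$host/v2/" || true)"
  if registry_status_ok "$code"; then
    echo "registry $host reachable (HTTP $code)"
  else
    echo "registry $host NOT usable (HTTP $code)" >&2
    return 1
  fi
}

# Only act when explicitly asked, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
  probe_registry "${2:?usage: $0 --run REGISTRY_HOST}"
fi
```

A passing probe only shows that a registry API is listening. Push access, which R3 also asks for, still has to be verified separately with `docker login` and a test push.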