| id | type | title | domain | repo | status | owner | topic_slug | created | updated | depends_on | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IHUB-WP-0018 | workplan | Railiance01 Deployment — Production Operations Scaffold | inter_hub | inter-hub | active | custodian | inter_hub | 2026-04-29 | 2026-06-07 | IHUB-WP-0015 | 080d841a-3acd-4adf-b684-2d1890a5e986 |
IHUB-WP-0018 — Railiance01 Deployment: Production Operations Scaffold
Goal
Deploy inter-hub to the Railiance01 Kubernetes cluster with fully automatic
deployment, SOPS-encrypted secrets, Traefik ingress, PostgreSQL HA, and a
Gitea Actions CI/CD pipeline. After this workplan, every push to main
automatically builds an OCI container image on haskelseed, pushes it to the
Railiance container registry, and deploys it — with automatic restart on node
reboot guaranteed by K3s.
Background
inter-hub v0.2.0-alpha.1 is running on haskelseed (Alpine) via RunDevServer and socat. That setup is a development convenience, not a production operations scaffold. The target is the Railiance01 K3s cluster, which has:
- K3s (single-node for now; ThreePhoenix HA cluster is in progress)
- Traefik ingress with TLS
- PostgreSQL HA (repmgr + pgpool) managed by railiance-platform
- SOPS/age secret management
- Gitea with built-in container registry (or separate registry service)
- Staged Promotion Lifecycle CLI (railiance run / deploy / promote / rollback)
Key constraint: This workplan depends on Railiance01 K3s being operational. Gate R3 verifies cluster readiness before any deployment work begins — if K3s or the container registry is not ready, this workplan blocks there and the cluster work must be completed first.
IHP specifics: IHP DevServer is a development server. For production we
build the IHP binary via nix build (which produces a self-contained binary)
and wrap it in a minimal OCI image using Nix's dockerTools.buildImage. The
app serves HTTP on port 8000; the socat workaround is not needed in Kubernetes
since Traefik routes directly to the pod's port.
Architecture
git push → Gitea Actions
  → SSH to haskelseed: nix build → docker load → docker push registry/inter-hub:$SHA
  → helm upgrade inter-hub railiance-apps/helm/inter-hub
      → Deployment (1 replica): inter-hub:$SHA + env from Secrets
      → Service (ClusterIP :8000)
      → Ingress (Traefik): hub.coulomb.social → Service
      → PersistentVolumeClaim: /app/static (generated CSS/JS)
      → PostgreSQL: database 'interhub' on railiance-platform HA cluster
Close-out Audit - 2026-06-04
WSJF triage flagged this workplan as a close-out candidate because State Hub had no indexed task rows for it. The deployment work is not complete; this file now contains explicit task blocks so the hub can track the remaining Railiance01 deployment work instead of treating the workplan as empty.
Deployment Review - 2026-06-05
Review against the current repo and public Railiance endpoint shows the
deployment scaffold is partially implemented but the live deployment is behind
origin/main.
- origin/main is at a3d980c, which includes the completed ops-hub bootstrap API work from IHUB-WP-0019.
- https://hub.coulomb.social/ returns 200 and serves inter-hub.
- The public OpenAPI only lists the older v2 endpoints; it does not include /hubs, /hub-capability-manifests, /api-consumers, or /policy-scopes.
- Unauthenticated /api/v2/hubs returns 404 publicly, while current source should route it and return 401. This means ops-hub bootstrap cannot run against production until the current image is deployed.
- The registry endpoint returns the expected unauthenticated /v2/ 401 challenge, but this workspace does not have kubectl, so R3 cluster readiness cannot be fully verified from here.
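The 404-versus-401 distinction in this review can double as a cheap probe of which image is live. The following is a minimal sketch, assuming the HTTP status of an unauthenticated request to /api/v2/hubs is the only signal used; the function name classify_hubs_probe and its labels are hypothetical and do not exist in the repo.

```shell
# Hypothetical helper: interpret the HTTP status of an unauthenticated
# GET /api/v2/hubs, following the review findings.
classify_hubs_probe() {
  case "$1" in
    401) echo "current" ;;   # route exists and demands auth
    404) echo "stale" ;;     # route missing: an older image is still live
    *)   echo "unknown" ;;   # anything else needs manual inspection
  esac
}

# Usage against the public endpoint (requires network access):
#   status=$(curl -s -o /dev/null -w '%{http_code}' https://hub.coulomb.social/api/v2/hubs)
#   classify_hubs_probe "$status"
```

A result of "stale" only says the bootstrap routes are absent; it does not identify which commit is actually deployed.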
Tasks
R1 - Add OCI image build to flake.nix
id: IHUB-WP-0018-T01
status: done
priority: high
state_hub_task_id: "27420bd7-0f70-4793-8805-393d8d5cacfd"
Add a packages.docker output to flake.nix using pkgs.dockerTools.buildLayeredImage.
The image wraps the IHP production binary produced by nix build .#default.
packages.docker = pkgs.dockerTools.buildLayeredImage {
  name = "inter-hub";
  tag = "latest";
  contents = [ self.packages.${system}.default pkgs.cacert ];
  config = {
    Cmd = [ "/bin/inter-hub" ];
    ExposedPorts = { "8000/tcp" = {}; };
    Env = [
      "PORT=8000"
      "IHP_ENV=Production"
    ];
  };
};
Test locally on haskelseed:
nix build .#docker
docker load < result
docker run --rm -p 8000:8000 -e DATABASE_URL=... -e IHP_SESSION_SECRET=... inter-hub:latest
Note: First build pulls the full Haskell binary closure (~2 GB); subsequent builds are incremental (layer caching). Build must run on haskelseed - the only machine with the Nix store populated for GHC 9.10.3.
Implementation note (2026-06-05): flake.nix exposes packages.docker = config.packages.unoptimized-docker-image, the IHP-provided production OCI
image used by the Railiance runbook. The original buildLayeredImage sketch is
superseded by that IHP image path.
R2 — Verify container runs correctly
id: IHUB-WP-0018-T02
status: todo
priority: high
state_hub_task_id: "5ab45e4e-16bc-4feb-8b1b-e8eeb05bf39a"
On haskelseed, run the container image against the existing interhub database.
Confirm:
- curl http://localhost:8000/ returns 200 (LandingAction)
- curl http://localhost:8000/api/v2/hubs returns 401 (auth required)
- Static assets load (Tailwind CSS present in image)
- Container exits cleanly on SIGTERM
If Tailwind CSS output (static/app.css) is not bundled into the Nix binary
closure, add a pre-build step: run tailwindcss and include static/ in the
image via dockerTools.buildLayeredImage contents or a NixOS module.
R3 — Verify Railiance01 readiness (gate)
id: IHUB-WP-0018-T03
status: blocked
priority: high
state_hub_task_id: "79b5cf2c-3a5b-4b4b-8f84-f635cb6891c1"
This is a dependency gate. Before proceeding, confirm:
# From CoulombCore (execution origin):
kubectl get nodes # must show Ready
kubectl get pods -n kube-system | grep traefik # Traefik must be running
kubectl get pods -n railiance-platform # PostgreSQL HA pods
Also confirm:
- Container registry is reachable from haskelseed (verify push access)
- Registry address (e.g., registry.coulomb.social or gitea.coulomb.social)
- SOPS/age key is present on CoulombCore at ~/.config/sops/age/keys.txt
If any check fails, block here and open the relevant Railiance workstream. Do not proceed until all checks pass.
Review note (2026-06-05): Public smoke probes show
https://hub.coulomb.social/ returning 200 and the Gitea registry /v2/
endpoint returning the expected unauthenticated 401 challenge. Full R3 remains
blocked from this workspace because kubectl is not available here, and the
live app is not serving the current origin/main v2 bootstrap routes.
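Because R3 is an all-or-nothing gate, the checks lend themselves to a small runner that reports each result and fails if any check fails. This is a sketch only: run_gate and the "label::command" convention are hypothetical, and the example checks in the usage comment are assumptions about how the R3 conditions might be tested.

```shell
# Hypothetical gate runner: each argument is "label::command".
# Prints PASS/FAIL per check and returns non-zero if any check fails.
run_gate() {
  failed=0
  for entry in "$@"; do
    label="${entry%%::*}"   # text before the first "::"
    cmd="${entry#*::}"      # text after the first "::"
    if sh -c "$cmd" >/dev/null 2>&1; then
      echo "PASS $label"
    else
      echo "FAIL $label"
      failed=1
    fi
  done
  return "$failed"
}

# Usage on CoulombCore (needs kubectl and cluster access):
#   run_gate \
#     "node ready::kubectl get nodes | grep -qw Ready" \
#     "traefik running::kubectl get pods -n kube-system | grep traefik | grep -q Running" \
#     "age key present::test -s ~/.config/sops/age/keys.txt" \
#     || echo "R3 blocked: open the relevant Railiance workstream"
```

Running every check before returning, rather than stopping at the first failure, shows in one pass which Railiance workstreams need attention.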
R4 — Provision inter-hub database on railiance-platform
id: IHUB-WP-0018-T04
status: blocked
priority: high
state_hub_task_id: "c937cf36-3850-4ab3-aa83-2d846e1a378e"
On the PostgreSQL HA cluster, create the inter-hub database and user:
CREATE USER interhub WITH PASSWORD '<generated>';
CREATE DATABASE interhub OWNER interhub;
GRANT ALL PRIVILEGES ON DATABASE interhub TO interhub;
Run schema migration (IHP migrations) as part of the first deployment via an
init container or a manual migrate run inside the pod. Document the
migration procedure in deploy/railiance/RUNBOOK.md.
R5 — SOPS-encrypted secrets
id: IHUB-WP-0018-T05
status: blocked
priority: high
state_hub_task_id: "926f82d1-15cd-425d-8a41-3d6b51c07f0b"
Create deploy/railiance/secrets/inter-hub.env.sops.yaml with:
# sops encrypted — do not edit manually
DATABASE_URL: postgresql://interhub:<pass>@pgpool.railiance-platform.svc:5432/interhub
IHP_SESSION_SECRET: <64-char-hex>
IHP_BASEURL: https://hub.coulomb.social
Encrypt in place with the age public key. Redirecting the sops output onto its own input file would truncate the file before sops reads it, so use --in-place instead:
sops --encrypt --in-place \
  --age "$(grep 'public key' ~/.config/sops/age/keys.txt | awk '{print $4}')" \
  deploy/railiance/secrets/inter-hub.env.sops.yaml
Commit the encrypted file. The Gitea Actions workflow decrypts at deploy time using the age key from a Kubernetes Secret (bootstrapped once manually).
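The IHP_SESSION_SECRET placeholder calls for a 64-character hex value. One way to produce such a value is sketched below, assuming any random 64-character hex string is acceptable to the application; the function name generate_session_secret is hypothetical, and IHP may ship its own generator that should be preferred if available.

```shell
# Hypothetical helper: print 32 random bytes as 64 lowercase hex characters.
generate_session_secret() {
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Usage: paste the value into the plaintext file before encrypting it.
#   generate_session_secret
```

Generate the value on the machine that holds the age key so the plaintext never leaves it unencrypted.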
R6 — Helm chart in railiance-apps
id: IHUB-WP-0018-T06
status: in_progress
priority: high
state_hub_task_id: "4c4acc98-5773-4289-ad57-03f3fd5c381c"
Create helm/inter-hub/ in the railiance-apps repository following the
Railiance app.toml contract. Minimal chart:
helm/inter-hub/
  Chart.yaml          name: inter-hub, version: 0.1.0
  values.yaml         image.tag, ingress.host, resources
  values.prod.yaml    replicas: 1, resources.requests.memory: 1Gi
  templates/
    deployment.yaml   envFrom: secretRef inter-hub-env
    service.yaml      ClusterIP :8000
    ingress.yaml      Traefik annotations, TLS
    secret.yaml       created by sops-operator or external-secrets
app.toml in the inter-hub repo root for railiance CLI integration:
[app]
name = "inter-hub"
slug = "inter-hub"
kind = "native"
registry = "registry.coulomb.social/coulomb/inter-hub"
[deploy]
chart = "railiance-apps/helm/inter-hub"
namespace = "inter-hub"
Implementation note (2026-06-05): A Helm chart exists in
deploy/helm/inter-hub/ with Deployment, Service, Ingress, and values for the
current Gitea registry and hub.coulomb.social. Remaining gaps: no repo-root
app.toml, no committed SOPS secret manifest, and no separate
railiance-apps/helm/inter-hub handoff in this repo.
R7 — Gitea Actions CI/CD pipeline
id: IHUB-WP-0018-T07
status: blocked
priority: medium
state_hub_task_id: "ec25c67c-3cb0-4534-9fb0-9bd6578a2def"
Create .gitea/workflows/deploy.yaml in the inter-hub repo:
on:
  push:
    branches: [main]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest # or self-hosted if available
    env:
      # Expanded on the runner before the SSH command is sent
      REGISTRY: ${{ secrets.REGISTRY }}
    steps:
      - uses: actions/checkout@v4
      - name: Build OCI image on haskelseed
        run: |
          ssh haskelseed "cd /root/inter-hub && git pull && \
            nix build .#docker && \
            docker load < result && \
            docker tag inter-hub:latest $REGISTRY/inter-hub:${{ github.sha }} && \
            docker push $REGISTRY/inter-hub:${{ github.sha }}"
      - name: Deploy to Railiance01
        run: |
          ssh coulombcore "helm upgrade --install inter-hub \
            railiance-apps/helm/inter-hub \
            --namespace inter-hub --create-namespace \
            --set image.tag=${{ github.sha }} \
            -f railiance-apps/helm/inter-hub/values.prod.yaml"
Secrets in Gitea: REGISTRY, SSH_KEY_HASKELSEED, SSH_KEY_COULOMBCORE.
Alternative if self-hosted runner is available on CoulombCore: run the deploy step directly without the SSH hop to coulombcore.
Implementation note (2026-06-05): .gitea/workflows/deploy.yaml exists and
builds .#docker on a self-hosted haskelseed runner, pushes to
92.205.130.254:32166/coulomb/inter-hub, deploys with Helm, and smoke-tests
the public endpoint. Remote main is already current, but production is still
serving an older API surface, so the workflow needs an attended rerun/inspection
or a new deployment trigger.
Runner substrate finding (2026-06-07): Pushed commits fa96fb8 and
7cc3173 to trigger the workflow, but public /api/v2/hubs remained 404
while / stayed 200, indicating the current image was not deployed. Repo
search shows railiance-forge owns Actions runner substrate, but its
2026-06-05 migration plan explicitly lists "No Actions runner deployment" as a
non-goal and no runner manifest/script/workplan exists there yet. haskelseed
itself is reachable on SSH and historical port 8080, but this workspace cannot
authenticate non-interactively. Treat R7 as blocked on a forge-owned runner
prerequisite rather than continuing to push commits as deployment probes.
R8 — Staged deployment and smoke test
id: IHUB-WP-0018-T08
status: blocked
priority: high
state_hub_task_id: "2b02ae5c-47b9-4f09-88f0-a4af7900b38f"
Follow the Railiance staged promotion lifecycle:
- Local verify (performed in R2: container runs correctly)
- Deploy to Railiance01:
railiance deploy inter-hub --tag <sha>
- Smoke test:
curl -s https://hub.coulomb.social/ | grep "Inter-Hub"   # Landing page
curl -s https://hub.coulomb.social/capabilities          # Capabilities
curl -H "Authorization: Bearer <key>" \
  https://hub.coulomb.social/api/v2/hubs                 # API (200)
curl https://hub.coulomb.social/api/v2/hubs              # Unauthenticated (401)
- Verify restart persistence:
kubectl rollout restart deployment/inter-hub -n inter-hub
kubectl rollout status deployment/inter-hub -n inter-hub
# Then re-run smoke test
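The smoke test is re-run after the rollout restart, so it may help to turn the status expectations into a helper that fails loudly on a mismatch. The sketch below is an assumption-laden illustration: expect_status, fetch_status_curl, and the FETCH_STATUS override are hypothetical names, and only the status codes listed in this task are taken from the workplan.

```shell
# FETCH_STATUS names the command that prints an HTTP status code for the
# given curl arguments; override it to exercise the helper without network.
FETCH_STATUS="${FETCH_STATUS:-fetch_status_curl}"

fetch_status_curl() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Usage: expect_status <expected_code> <curl arguments...>
expect_status() {
  expected="$1"; shift
  actual="$("$FETCH_STATUS" "$@")"
  if [ "$actual" = "$expected" ]; then
    echo "ok $expected $*"
  else
    echo "FAIL expected $expected, got $actual: $*"
    return 1
  fi
}

# Usage (requires network access and, for the 200 case, a valid key):
#   expect_status 401 https://hub.coulomb.social/api/v2/hubs
#   expect_status 200 -H "Authorization: Bearer <key>" https://hub.coulomb.social/api/v2/hubs
```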
R9 — Document and register
id: IHUB-WP-0018-T09
status: in_progress
priority: medium
state_hub_task_id: "4d1e55c7-8dbb-480f-b07b-6c5e39a04218"
- Write deploy/railiance/RUNBOOK.md: image build, migration procedure, secret rotation, rollback (railiance rollback inter-hub), log access (kubectl logs -n inter-hub -l app=inter-hub --tail=100)
- Add progress event to state hub
- Remove haskelseed socat/OpenRC production role note from quickstart; document it as the build machine only, not the production host
Implementation note (2026-06-05): deploy/railiance/RUNBOOK.md exists and
documents architecture, image build/push, Helm deployment, logs, restart,
rollback, secret rotation, and smoke checks. The deployment record remains
incomplete until current main is running and the ops-hub bootstrap smoke test
passes against production.
Exit Criteria
- https://hub.coulomb.social/ returns the Landing page (200, no auth)
- /api/v2/hubs returns 401 unauthenticated, 200 with valid API key
- All 12 IHF dashboards accessible after admin login
- kubectl rollout restart followed by smoke test passes (K3s restart persistence confirmed)
- Gitea Actions pipeline: push to main → image built → deployed → smoke test green within 15 minutes
- No dependency on haskelseed being up for the app to run (only for builds)
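The 15-minute criterion implies polling the endpoint until it turns green or a deadline passes. A minimal sketch of such a wait loop follows; wait_until is a hypothetical helper, and the 900-second timeout and 15-second interval in the usage comment are illustrative choices rather than values from the pipeline.

```shell
# Hypothetical helper: retry a command until it succeeds or a deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command...>
wait_until() {
  timeout_s="$1"; interval_s="$2"; shift 2
  deadline=$(( $(date +%s) + timeout_s ))
  while ! "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout_s}s: $*" >&2
      return 1
    fi
    sleep "$interval_s"
  done
}

# Usage (requires network access): wait up to 15 minutes for the
# unauthenticated API route to answer 401.
#   wait_until 900 15 sh -c \
#     'test "$(curl -s -o /dev/null -w "%{http_code}" https://hub.coulomb.social/api/v2/hubs)" = 401'
```

The deadline is checked only after a failed attempt, so the command always runs at least once even with a zero timeout.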
Open Questions / Pre-flight Checks
- K3s status: ThreePhoenix HA cluster workstream is active but not complete. Confirm whether Railiance01 is a single-node cluster already accepting workloads or still being provisioned. Gate R3 is the go/no-go check.
- Container registry: Is Gitea's built-in registry available on Railiance01, or is a separate registry service needed? If neither, add registry deployment to the scope.
- PostgreSQL HA status: railiance-platform baseline workstream is active. Confirm whether the HA cluster (repmgr + pgpool) is operational before R4.
- Static asset bundling: The Nix production binary may or may not include static/app.css (Tailwind output). Verify in R2 and adjust image build if needed.
- Anthropic API key: Phase 5 AI-assisted distillation requires IHP_ANTHROPIC_API_KEY. Add to SOPS secrets if the feature is to be active on Railiance01.