---
id: IHUB-WP-0018
type: workplan
title: "Railiance01 Deployment — Production Operations Scaffold"
domain: inter_hub
repo: inter-hub
status: open
owner: custodian
topic_slug: inter_hub
created: "2026-04-29"
updated: "2026-04-29"
depends_on: IHUB-WP-0015
state_hub_workstream_id: "080d841a-3acd-4adf-b684-2d1890a5e986"
---

# IHUB-WP-0018 — Railiance01 Deployment: Production Operations Scaffold

## Goal

Deploy inter-hub to the Railiance01 Kubernetes cluster with fully automatic
deployment, SOPS-encrypted secrets, Traefik ingress, PostgreSQL HA, and a
Gitea Actions CI/CD pipeline. After this workplan, every push to `main`
automatically builds an OCI container image on haskelseed, pushes it to the
Railiance container registry, and deploys it, with automatic restart on node
reboot guaranteed by K3s.

## Background

inter-hub v0.2.0-alpha.1 is running on haskelseed (Alpine) via RunDevServer
and socat. That setup is a development convenience, not a production operations
scaffold. The target is the Railiance01 K3s cluster, which has:

- K3s (single-node for now; ThreePhoenix HA cluster is in progress)
- Traefik ingress with TLS
- PostgreSQL HA (repmgr + pgpool) managed by railiance-platform
- SOPS/age secret management
- Gitea with built-in container registry (or separate registry service)
- Staged Promotion Lifecycle CLI (`railiance run / deploy / promote / rollback`)

**Key constraint:** This workplan depends on Railiance01 K3s being operational.
Gate R3 verifies cluster readiness before any deployment work begins. If K3s
or the container registry is not ready, this workplan blocks there and the
cluster work must be completed first.

**IHP specifics:** IHP DevServer is a development server. For production we
build the IHP binary via `nix build` (which produces a self-contained binary)
and wrap it in a minimal OCI image using Nix's
`dockerTools.buildLayeredImage` (see R1). The app serves HTTP on port 8000;
the socat workaround is not needed in Kubernetes since Traefik routes directly
to the pod's port.

## Architecture

```
git push → Gitea Actions
  → SSH to haskelseed: nix build → docker load → docker push registry/inter-hub:$SHA
  → helm upgrade inter-hub railiance-apps/helm/inter-hub
      → Deployment (1 replica): inter-hub:$SHA + env from Secrets
      → Service (ClusterIP :8000)
      → Ingress (Traefik): hub.coulomb.social → Service
      → PersistentVolumeClaim: /app/static (generated CSS/JS)
      → PostgreSQL: database 'interhub' on railiance-platform HA cluster
```

## Tasks

### R1 — Add OCI image build to flake.nix

Add a `packages.docker` output to `flake.nix` using `pkgs.dockerTools.buildLayeredImage`.
The image wraps the IHP production binary produced by `nix build .#default`.

```nix
packages.docker = pkgs.dockerTools.buildLayeredImage {
  name = "inter-hub";
  tag = "latest";
  contents = [ self.packages.${system}.default pkgs.cacert ];
  config = {
    Cmd = [ "/bin/inter-hub" ];
    ExposedPorts = { "8000/tcp" = {}; };
    Env = [
      "PORT=8000"
      "IHP_ENV=Production"
    ];
  };
};
```

Test locally on haskelseed:

```bash
nix build .#docker
docker load < result
docker run --rm -p 8000:8000 -e DATABASE_URL=... -e IHP_SESSION_SECRET=... inter-hub:latest
```

**Note:** The first build pulls the full Haskell binary closure (~2 GB);
subsequent builds are incremental (layer caching). The build must run on
haskelseed, the only machine with the Nix store populated for GHC 9.10.3.

### R2 — Verify container runs correctly

On haskelseed, run the container image against the existing `interhub` database.
Confirm:

- `curl http://localhost:8000/` returns 200 (LandingAction)
- `curl http://localhost:8000/api/v2/hubs` returns 401 (auth required)
- Static assets load (Tailwind CSS present in image)
- Container exits cleanly on SIGTERM

If the Tailwind CSS output (`static/app.css`) is not bundled into the Nix
binary closure, add a pre-build step: run tailwindcss and include `static/` in
the image, either through the `contents` argument of
`dockerTools.buildLayeredImage` or through a NixOS module.

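The two status-code checks in R2 could be scripted along these lines. This is only a sketch: `expect_status` is a hypothetical helper name, not part of the repo, and it assumes `curl` is installed on haskelseed.

```bash
# Hypothetical helper (not in the repo): compare a URL's HTTP status code
# with the expected one and report the result.
expect_status() {
  local url="$1" expected="$2" actual
  # -s: no progress output, -o /dev/null: discard the body,
  # -w '%{http_code}': print only the status code (000 if the connection fails).
  actual="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
  if [ "$actual" = "$expected" ]; then
    echo "ok   $url -> $actual"
  else
    echo "FAIL $url -> $actual (expected $expected)" >&2
    return 1
  fi
}
```

With the container from R1 running, `expect_status http://localhost:8000/ 200` and `expect_status http://localhost:8000/api/v2/hubs 401` would cover the first two bullets; the static-asset and SIGTERM checks still need a manual look.
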
### R3 — Verify Railiance01 readiness (gate)

This is a dependency gate. Before proceeding, confirm:

```bash
# From CoulombCore (execution origin):
kubectl get nodes                                # must show Ready
kubectl get pods -n kube-system | grep traefik   # Traefik must be running
kubectl get pods -n railiance-platform           # PostgreSQL HA pods
```

Also confirm:

- Container registry is reachable from haskelseed (verify push access)
- Registry address is known (e.g., `registry.coulomb.social` or `gitea.coulomb.social`)
- SOPS/age key is present on CoulombCore at `~/.config/sops/age/keys.txt`

If any check fails, block here and open the relevant Railiance workstream.
Do not proceed until all checks pass.

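One way to make the gate a single go/no-go command is a small check runner that counts failures instead of stopping at the first one. The sketch below is an assumption about how this might look; `check`, `gate_result`, and the `failures` counter are hypothetical names.

```bash
# Hypothetical gate runner (names are illustrative, not from the repo).
failures=0

# check NAME COMMAND [ARGS...]: run the command quietly and record the outcome.
check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $name"
  else
    echo "FAIL  $name"
    failures=$((failures + 1))
  fi
}

# gate_result: succeed only if every check passed.
gate_result() {
  if [ "$failures" -eq 0 ]; then
    echo "Gate R3 passed"
  else
    echo "Gate R3 blocked: $failures check(s) failed"
    return 1
  fi
}

# Example wiring, to be run from CoulombCore:
#   check "nodes Ready"     kubectl wait --for=condition=Ready nodes --all --timeout=30s
#   check "age key present" test -s "$HOME/.config/sops/age/keys.txt"
#   gate_result
```

Collecting all failures before exiting shows in one run which Railiance workstreams need to be opened.
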
### R4 — Provision inter-hub database on railiance-platform

On the PostgreSQL HA cluster, create the inter-hub database and user:

```sql
CREATE USER interhub WITH PASSWORD '<generated>';
CREATE DATABASE interhub OWNER interhub;
GRANT ALL PRIVILEGES ON DATABASE interhub TO interhub;
```

Run schema migration (IHP migrations) as part of the first deployment via an
init container or a manual `migrate` run inside the pod. Document the
migration procedure in `deploy/railiance/RUNBOOK.md`.

### R5 — SOPS-encrypted secrets

Create `deploy/railiance/secrets/inter-hub.env.sops.yaml` with:

```yaml
# sops-encrypted; do not edit manually
DATABASE_URL: postgresql://interhub:<pass>@pgpool.railiance-platform.svc:5432/interhub
IHP_SESSION_SECRET: <64-char-hex>
IHP_BASEURL: https://hub.coulomb.social
```

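The `<pass>` and `<64-char-hex>` placeholders above need random values. A minimal sketch of one way to generate them with standard tools follows; `random_hex` is a hypothetical helper, and hex is chosen for the password on the assumption that avoiding URL-escaping inside `DATABASE_URL` is desirable.

```bash
# Hypothetical helper: print N random bytes as lowercase hex (2*N characters).
random_hex() {
  local bytes="$1"
  # od prints the bytes as space-separated hex pairs; tr strips spaces and newlines.
  head -c "$bytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

session_secret="$(random_hex 32)"   # 32 bytes -> 64 hex characters
db_password="$(random_hex 24)"      # hex only, so safe inside a connection URL

# Print only the lengths, never the secrets themselves.
echo "IHP_SESSION_SECRET length: ${#session_secret}"
echo "database password length:  ${#db_password}"
```

The generated values go into the plaintext YAML just before encryption and should not be written to shell history or logs.
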
Encrypt the file in place with the age key (redirecting the output onto the
input file would truncate it before sops reads it):

```bash
sops --encrypt --in-place \
  --age "$(grep 'public key' ~/.config/sops/age/keys.txt | awk '{print $4}')" \
  deploy/railiance/secrets/inter-hub.env.sops.yaml
```

Commit the encrypted file. The Gitea Actions workflow decrypts at deploy time
using the age key from a Kubernetes Secret (bootstrapped once manually).

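Because the plaintext and encrypted forms share one filename, a guard before `git commit` may help avoid committing the plaintext by accident. The sketch below is a heuristic, not an official sops check: it relies on sops adding a top-level `sops:` metadata key and wrapping values as `ENC[AES256_GCM,...]`, and `is_sops_encrypted` is a hypothetical name.

```bash
# Hypothetical pre-commit guard: succeed only if the file looks sops-encrypted.
is_sops_encrypted() {
  local file="$1"
  # Heuristic: an encrypted YAML file carries a top-level `sops:` metadata
  # block and at least one value wrapped as ENC[AES256_GCM,...].
  grep -q '^sops:' "$file" && grep -q 'ENC\[AES256_GCM,' "$file"
}
```

For example, `is_sops_encrypted deploy/railiance/secrets/inter-hub.env.sops.yaml || echo "refusing to commit plaintext secrets"` could run in a pre-commit hook.
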
### R6 — Helm chart in railiance-apps

Create `helm/inter-hub/` in the `railiance-apps` repository following the
Railiance app.toml contract. Minimal chart:

```
helm/inter-hub/
  Chart.yaml          name: inter-hub, version: 0.1.0
  values.yaml         image.tag, ingress.host, resources
  values.prod.yaml    replicas: 1, resources.requests.memory: 1Gi
  templates/
    deployment.yaml   envFrom: secretRef inter-hub-env
    service.yaml      ClusterIP :8000
    ingress.yaml      Traefik annotations, TLS
    secret.yaml       created by sops-operator or external-secrets
```

`app.toml` in the inter-hub repo root for railiance CLI integration:

```toml
[app]
name = "inter-hub"
slug = "inter-hub"
kind = "native"
registry = "registry.coulomb.social/coulomb/inter-hub"

[deploy]
chart = "railiance-apps/helm/inter-hub"
namespace = "inter-hub"
```

### R7 — Gitea Actions CI/CD pipeline

Create `.gitea/workflows/deploy.yaml` in the inter-hub repo:

```yaml
on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest  # or self-hosted if available
    env:
      REGISTRY: ${{ secrets.REGISTRY }}  # expanded on the runner, before ssh
    steps:
      - uses: actions/checkout@v4

      - name: Build OCI image on haskelseed
        run: |
          # Check out the exact pushed commit so the image matches its tag.
          ssh haskelseed "cd /root/inter-hub && git fetch origin && \
            git checkout --detach ${{ github.sha }} && \
            nix build .#docker && \
            docker load < result && \
            docker tag inter-hub:latest $REGISTRY/inter-hub:${{ github.sha }} && \
            docker push $REGISTRY/inter-hub:${{ github.sha }}"

      - name: Deploy to Railiance01
        run: |
          ssh coulombcore "helm upgrade --install inter-hub \
            railiance-apps/helm/inter-hub \
            --namespace inter-hub --create-namespace \
            --set image.tag=${{ github.sha }} \
            -f railiance-apps/helm/inter-hub/values.prod.yaml"
```

Secrets in Gitea: `REGISTRY`, `SSH_KEY_HASKELSEED`, `SSH_KEY_COULOMBCORE`.
The workflow above assumes the runner can already reach `haskelseed` and
`coulombcore` over SSH; the two SSH key secrets must be installed on the
runner in an earlier step or as part of runner setup.

**Alternative if a self-hosted runner is available on CoulombCore:** run the
deploy step directly without the SSH hop to coulombcore.

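Since the commit SHA is interpolated into remote shell commands, it may be worth validating it and composing the image reference in one place. The following is a sketch under that assumption; `image_ref` is a hypothetical helper and `registry.example` in the usage note is a placeholder, not the real registry address.

```bash
# Hypothetical helper: print "<registry>/inter-hub:<sha>" after checking that
# the SHA is exactly 40 lowercase hex characters.
image_ref() {
  local registry="$1" sha="$2"
  case "$sha" in
    "" | *[!0-9a-f]*)
      echo "invalid commit SHA: $sha" >&2
      return 1
      ;;
  esac
  if [ "${#sha}" -ne 40 ]; then
    echo "expected a 40-character SHA, got ${#sha}" >&2
    return 1
  fi
  # Strip one trailing slash from the registry so the result has no "//".
  printf '%s/inter-hub:%s\n' "${registry%/}" "$sha"
}
```

A workflow step could call `image_ref "registry.example/coulomb" "$SHA"` once and pass the result to both `docker tag` and `docker push`, so a malformed value fails before it reaches the remote shell.
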
### R8 — Staged deployment and smoke test

Follow the Railiance staged promotion lifecycle:

1. **Local verify** (done in R2: the container runs correctly)
2. **Deploy to Railiance01:**
   ```bash
   railiance deploy inter-hub --tag <sha>
   ```
3. **Smoke test:**
   ```bash
   curl -s https://hub.coulomb.social/ | grep "Inter-Hub"   # Landing page
   curl -s https://hub.coulomb.social/capabilities          # Capabilities
   curl -H "Authorization: Bearer <key>" \
     https://hub.coulomb.social/api/v2/hubs                 # API (200)
   curl https://hub.coulomb.social/api/v2/hubs              # Unauthenticated (401)
   ```
4. **Verify restart persistence:**
   ```bash
   kubectl rollout restart deployment/inter-hub -n inter-hub
   kubectl rollout status deployment/inter-hub -n inter-hub
   # Then re-run smoke test
   ```

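Right after a deploy or a rollout restart, the first smoke-test request may arrive before the new pod is serving. A small retry wrapper, sketched below with the hypothetical name `retry`, is one way to make the smoke test tolerant of that window.

```bash
# Hypothetical helper: retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs the command until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between attempts.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempt(s): $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 30 10 curl -fsS -o /dev/null https://hub.coulomb.social/` polls the landing page for up to about five minutes, relying on `curl -f` to treat HTTP errors as failures.
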
### R9 — Document and register

- Write `deploy/railiance/RUNBOOK.md`: image build, migration procedure,
  secret rotation, rollback (`railiance rollback inter-hub`), log access
  (`kubectl logs -n inter-hub -l app=inter-hub --tail=100`)
- Add progress event to state hub
- Remove the haskelseed socat/OpenRC production-role note from the quickstart;
  document haskelseed as the build machine only, not the production host

## Exit Criteria

- `https://hub.coulomb.social/` returns the Landing page (200, no auth)
- `/api/v2/hubs` returns 401 unauthenticated, 200 with valid API key
- All 12 IHF dashboards accessible after admin login
- `kubectl rollout restart` followed by smoke test passes (K3s restart
  persistence confirmed)
- Gitea Actions pipeline: push to `main` → image built → deployed → smoke
  test green within 15 minutes
- No dependency on haskelseed being up for the app to *run* (only for builds)

## Open Questions / Pre-flight Checks

1. **K3s status**: ThreePhoenix HA cluster workstream is active but not complete.
   Confirm whether Railiance01 is a single-node cluster already accepting
   workloads or still being provisioned. Gate R3 is the go/no-go check.

2. **Container registry**: Is Gitea's built-in registry available on Railiance01,
   or is a separate registry service needed? If neither, add registry deployment
   to the scope.

3. **PostgreSQL HA status**: railiance-platform baseline workstream is active.
   Confirm whether the HA cluster (repmgr + pgpool) is operational before R4.

4. **Static asset bundling**: The Nix production binary may or may not include
   `static/app.css` (Tailwind output). Verify in R2 and adjust image build
   if needed.

5. **Anthropic API key**: Phase 5 AI-assisted distillation requires
   `IHP_ANTHROPIC_API_KEY`. Add to SOPS secrets if the feature is to be
   active on Railiance01.