docs: add INTENT, DeploymentBlueprint, and pipeline workplan

INTENT.md captures why the probe exists (validate build-to-deploy before
inter-hub attempts production again) and its success criteria.

DeploymentBlueprint.md is a textual C4 deployment diagram covering all four
nodes: workstation, haskelseed, CoulombCore/Gitea, and Railiance01/k3s, plus
the full artifact flow and known infrastructure constraints.

IRP-WP-0001 is a 12-task workplan: flake bootstrap → minimal IHP scaffold →
schema → health endpoint → Hspec test → production build → push → Helm chart →
k3s registry config → deploy → smoke test → optional CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 16:19:46 +02:00
parent 037bc1c12f
commit 52f312bf8e
3 changed files with 634 additions and 0 deletions

DeploymentBlueprint.md Normal file

@@ -0,0 +1,146 @@
# Deployment Blueprint — ihp-railiance-probe
Textual equivalent of a C4 Deployment Diagram covering the full
build-to-production cycle. Read top-to-bottom as the artifact flows through
each node.
---
## Deployment Nodes
### Node: Developer Workstation
- **Host**: `localhost` (WSL2 on Windows, `linux/x86_64`)
- **Role**: source editing, flake authoring, orchestration
- **Key tools**: git, sshpass, skopeo (client), nix (local, for dry-runs only)
- **Constraints**: not used for Haskell compilation — insufficient RAM for GHC
on large module graphs
### Node: haskelseed (Build Server)
- **Host**: `192.168.178.135` (Alpine Linux VM, LAN-local to workstation)
- **SSH access**: `root` / `hcs26!x` (password) or via `ssh haskelseed` alias
- **Role**: sole Haskell compilation and `nix build` host
- **Resources**: 2 CPU, ~3.8 GiB RAM, 100 GB NVMe at `/nix` (`/dev/sdb1`)
- **Nix store**: `/nix/store` on dedicated volume; store paths survive reboots
- **Key constraint**: `libHSghc-9.10.3-5702.a` in the Nix store may be truncated
(287 MB vs full 289 MB); must be patched before production builds if affected
(see `inter-hub/HaskellVibePrimer.md` §Bug 2)
- **GHC version**: 9.10.3 (from IHP v1.5 flake)
- **GHCRTS**: `-A32m -M2g` (heap ceiling to prevent OOM)
### Node: CoulombCore VPS (Registry + Gitea)
- **Host**: `92.205.130.254`
- **SSH access**: `tegwick` via `id_ops` key (alias `coulombcore`)
- **Role**: Gitea source hosting + built-in OCI container registry
- **Gitea SSH**: port `30022`, alias `gitea-remote` in `~/.ssh/config`
- **Registry endpoint**: `92.205.130.254:32166` (HTTP, no TLS — internal use)
- **Image namespace**: `coulomb/ihp-railiance-probe`
- **Registry auth**: Gitea credentials (same user as Gitea login)
### Node: Railiance01 (Production Cluster)
- **Host**: `92.205.62.239`
- **SSH access**: `tegwick` via `id_custodian_agent` key (alias `railiance01`)
- **Role**: k3s Kubernetes cluster, deployment target
- **Namespace**: `coulomb` (shared with inter-hub)
- **Image pull**: cluster pulls from `92.205.130.254:32166` (LAN-adjacent VPS,
no auth needed if registry is public or cluster has credentials configured)
- **Ingress**: Traefik (k3s default); routes via IngressRoute or Ingress manifest
---
## Deployment Pipeline — Step by Step
```
[Workstation]
│ git push (SSH via gitea-remote)
[CoulombCore — Gitea]
│ (no CI yet; developer triggers manually)
[Workstation]
│ scp flake.nix + source → haskelseed
│ (or: git push + git pull on haskelseed)
[haskelseed — Build]
│ nix build .#docker
│ → evaluates flake.nix
│ → builds inter-hub-models (GHC, 477 modules) ← cached after first build
│ → builds inter-hub-lib (GHC, 199 modules)
│ → builds inter-hub-binaries
│ → assembles OCI tarball (result → /root/ihp-railiance-probe/result)
│ skopeo copy docker-archive:result
│ docker://92.205.130.254:32166/coulomb/ihp-railiance-probe:<SHA>
[CoulombCore — Registry]
│ image stored as coulomb/ihp-railiance-probe:<SHA>
[Railiance01 — Kubernetes / k3s]
│ helm upgrade --install ihp-railiance-probe ./chart
│ --set image.tag=<SHA>
│ --namespace coulomb
│ k3s pulls image from 92.205.130.254:32166
│ Deployment → ReplicaSet → Pod (RunProdServer binary)
│ Service (ClusterIP) → IngressRoute (Traefik)
[External / Browser]
GET https://probe.coulomb.example/
```
---
## Container: ihp-railiance-probe Application
- **Base**: IHP unoptimized Docker image (`config.packages.unoptimized-docker-image`)
- **Entry point**: `/bin/RunProdServer`
- **Exposed port**: `8000` (IHP default)
- **Environment variables** (injected via Kubernetes Secret / ConfigMap):
| Variable | Purpose |
|----------|---------|
| `IHP_SESSION_SECRET` | Session encryption key (32+ random bytes, base64) |
| `DATABASE_URL` | PostgreSQL connection string |
| `IHP_BASEURL` | External URL shown in links (e.g. `https://probe.coulomb.example`) |
- **PostgreSQL**: deployed as a separate pod (`bitnami/postgresql`) or uses the
shared CoulombCore postgres — TBD per cluster capacity
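
A pre-deploy check for the three variables in the table above could look like the following. This is only a sketch: `missing_env` is a hypothetical helper (not part of IHP, Helm, or kubectl), and it assumes the values are present in the calling shell's environment rather than read back from the Kubernetes Secret.

```bash
# Hypothetical helper: print the name of every listed variable that is
# unset or empty in the current shell.
missing_env() {
  for name in "$@"; do
    # Indirect lookup that also works in plain POSIX sh.
    eval "value=\${$name-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Example call with the variables from the table:
#   missing_env IHP_SESSION_SECRET DATABASE_URL IHP_BASEURL
```

An empty result means all three variables are set; any printed name should be fixed before creating the Secret.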
---
## Artifact Versioning
| Artifact | Identifier | Retention |
|----------|------------|-----------|
| Git commit | `<SHA>` (short, from `git rev-parse --short HEAD`) | permanent in Gitea |
| OCI image | `coulomb/ihp-railiance-probe:<SHA>` | keep last 5 tags |
| Helm release | `ihp-railiance-probe` in namespace `coulomb` | single release, upgraded in-place |
| Nix build result | `/root/ihp-railiance-probe/result` symlink on haskelseed | GC'd by `nix store gc` |
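
The naming and retention rules in this table can be sketched as two small shell helpers. Both function names are hypothetical, and the document does not say how the "keep last 5 tags" policy is enforced; the sketch assumes a list of tags is already available, ordered newest first, one per line.

```bash
REGISTRY="92.205.130.254:32166"
IMAGE="coulomb/ihp-railiance-probe"

# Build the full image reference for a given short commit SHA.
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY" "$IMAGE" "$1"
}

# Read tags from stdin (assumed newest first) and print the ones that fall
# outside the newest N (default 5), i.e. the candidates for deletion.
tags_to_prune() {
  keep="${1:-5}"
  tail -n "+$((keep + 1))"
}
```

For example, `image_ref "$(git rev-parse --short HEAD)"` yields the reference used for both the push and the Helm `image.tag` value.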
---
## Known Infrastructure Constraints
| Constraint | Impact | Mitigation |
|------------|--------|------------|
| haskelseed 2 CPU / 3.8 GB RAM | Full GHC build saturates RAM | `GHCRTS=-A32m -M2g`; `-j1` in flake |
| GHC 9.10.3 `.hi` overflow (>274 MB) | Crash after all modules compile | ActualTypes postUnpack overlay in flake.nix |
| GHC 9.10.3 `libHSghc.a` truncated | Crash at position 287,686,318 | Direct archive patch on haskelseed (one-time; check after flake lock update) |
| Registry on HTTP (no TLS) | k3s defaults to HTTPS for pulls | Configure k3s `registries.yaml` with mirror entry for `92.205.130.254:32166` |
| No CI runner yet | Manual build + push | Task T12 of the workplan (IRP-WP-0001) adds a Gitea Actions runner on haskelseed |
---
## Key File Locations
| File | Node | Path |
|------|------|------|
| Nix flake | Workstation + haskelseed | `flake.nix` |
| Helm chart | Workstation | `chart/` |
| GHC archive (may be truncated) | haskelseed | `/nix/store/ffg3yf2ypnbz3hc31y7nglrkihz0if01-ghc-9.10.3/lib/ghc-9.10.3/lib/x86_64-linux-ghc-9.10.3/ghc-9.10.3-5702/libHSghc-9.10.3-5702.a` |
| Build log | haskelseed | `/tmp/build<N>.log` |
| k3s registries config | Railiance01 | `/etc/rancher/k3s/registries.yaml` |

INTENT.md Normal file

@@ -0,0 +1,63 @@
# ihp-railiance-probe — Intent
## Purpose
`ihp-railiance-probe` is a minimal, test-first IHP application whose sole purpose
is to **validate the complete build-to-deployment pipeline** for IHP-on-Railiance
before any production workload runs through it.
It is a *probe* — not a product. Every decision favours diagnostic speed over
feature richness.
## Why it exists
Building `inter-hub` revealed two classes of expensive problems:
1. **Build-time surprises** — GHC 9.10.3 bugs (oversized `.hi` files, truncated
static archives) that only manifest in production (`nix build .#docker`), not
in the development GHCi loop. Debugging these inside a 477-module codebase
cost many build cycles and hours of iteration.
2. **Pipeline unknowns** — The full chain (haskelseed build → Gitea registry push
→ Railiance01 Kubernetes deployment) had never been exercised end-to-end in a
clean, reproducible way before inter-hub attempted production.
A minimal probe surfaces both problem classes on a trivially small codebase where
failures are cheap to diagnose and fix.
## Success criteria
The probe is *complete* when all of the following hold without manual intervention:
| # | Check | Verified by |
|---|-------|-------------|
| 1 | `nix build .#docker` succeeds on haskelseed | Zero-error build log |
| 2 | OCI image pushed to `92.205.130.254:32166/coulomb/ihp-railiance-probe:SHA` | `skopeo inspect` |
| 3 | Helm chart deploys pod to Railiance01 | `kubectl get pods` — Running |
| 4 | Health endpoint responds | `curl /healthz` → 200 |
| 5 | At least one Hspec integration test passes in CI | `test` command green |
## Design constraints
- **Minimal schema**: one table (`probes`) with `id`, `name`, `created_at`.
Enough to exercise the IHP code-gen path without generating hundreds of modules.
- **Test-first**: every controller action has a corresponding Hspec test before
it is implemented.
- **Diagnostic-first flake**: `flake.nix` starts from the hard-won inter-hub
overlay (ActualTypes.hi fix, haskelseed resource limits) so GHC 9.10.3 quirks
are handled from day one.
- **No application logic**: the probe is a canary. Application features belong
in inter-hub or its successors.
## Relationship to inter-hub
`ihp-railiance-probe` is upstream in confidence, not in code. A successful probe
means:
- The Nix/GHC/cabal production build is healthy for the current flake lock.
- The Gitea registry + Railiance01 deployment chain is operational.
- A new IHP project can be promoted to production without the build archaeology
that inter-hub required.
Once the pipeline is validated, the probe continues to serve as a **regression
canary**: whenever the flake lock is updated, rebuild the probe before touching inter-hub.


@@ -0,0 +1,425 @@
---
id: IRP-WP-0001
type: workplan
title: "ihp-railiance-probe — Full Pipeline Validation"
domain: coulomb
repo: ihp-railiance-probe
status: todo
owner: tegwick
created: "2026-05-02"
updated: "2026-05-02"
---
# ihp-railiance-probe — Full Pipeline Validation
## Goal
Stand up a minimal IHP application that successfully traverses the complete
build-to-production cycle: `nix build` on haskelseed → OCI push to Gitea
registry → Helm deploy to Railiance01 → live HTTP response. The probe carries
one Hspec integration test to prove the test-first loop is closed.
## Background
`inter-hub` exposed two GHC 9.10.3 production-build bugs and an unvalidated
deployment pipeline. This probe exercises the same stack on a trivially small
codebase (one schema table, one controller, one test) so failures are cheap
to diagnose. See `INTENT.md` and `DeploymentBlueprint.md` for full context.
**Key hard-won knowledge going in:**
- `flake.nix` must carry the ActualTypes.hs export-list rewrite overlay to
prevent `.hi` overflow (Bug 1).
- `libHSghc-9.10.3-5702.a` on haskelseed may need a one-time patch if the
full 289 MB archive isn't already in place (Bug 2); check before build.
- `GHCRTS=-A32m -M2g` and `-j1` are mandatory on the 2-CPU/3.8 GB host.
---
## Tasks
### T01 — Adopt flake.nix from inter-hub baseline
```task
id: IRP-WP-0001-T01
status: todo
priority: high
```
Copy `flake.nix` from `inter-hub` as the starting point and strip it down to
the probe's minimal package set:
1. Copy `inter-hub/flake.nix` → `ihp-railiance-probe/flake.nix`
2. Change `appName` to `"ihp-railiance-probe"`
3. Remove packages not needed: `http-conduit`, `aeson`, `string-conversions`,
`cryptohash-sha256`, `base16-bytestring`, `random-bytestring`, `yaml`,
`network-uri` (add back only as features require them)
4. Keep the inter-hub-models `configureFlags` and `postUnpack` overlay verbatim
— these fix GHC 9.10.3 Bug 1 and are needed regardless of module count
5. Remove the `inter-hub-lib` overlay (it was a workaround that was superseded;
confirm it is absent from inter-hub's current flake before copying)
6. Commit the flake
**Exit criteria:** `nix flake check` passes (or `nix flake check --no-build` if the full check is slow).
---
### T02 — Minimal IHP project scaffold
```task
id: IRP-WP-0001-T02
status: todo
priority: high
```
Bootstrap the IHP project skeleton inside the repo:
1. Verify Determinate Nix + `ihp-new` are available on the workstation
2. If the repo is empty (only README/LICENSE), run:
```bash
cd /home/worsch/ihp-railiance-probe
ihp-new . --name ihp-railiance-probe # or copy scaffold from inter-hub
```
Alternatively: copy the IHP scaffold from inter-hub and strip everything
down to bare bones (single-table schema, no domain modules).
3. Confirm `devenv up` starts: app on `:8000`, Postgres managed by Nix
4. Commit baseline scaffold
**Exit criteria:** `devenv up` succeeds; `http://localhost:8000` returns IHP
welcome page or a minimal home view.
---
### T03 — Minimal schema: `probes` table
```task
id: IRP-WP-0001-T03
status: todo
priority: high
```
Define one schema table in `Application/Schema.sql`:
```sql
CREATE TABLE probes (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    name TEXT NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL
);
```
Steps:
1. Add table definition to `Application/Schema.sql`
2. Run `migrate` inside `devenv shell`
3. Trigger IHP code generation (IHP IDE at `:8001` → Schema tab → regenerate,
or `build-generated-code` inside devenv)
4. Commit migration + generated code
**Exit criteria:** `probes` table exists; generated `Generated/Types.hs` and
`Generated/ActualTypes.hs` are present in `build/`; `devenv up` still compiles.
---
### T04 — Health endpoint controller
```task
id: IRP-WP-0001-T04
status: todo
priority: high
```
Add a minimal `/healthz` route that returns `200 OK` with body `"ok"`:
1. Add route in `Web/Routes.hs`:
```haskell
instance CanRoute HealthController where
    parseRoute' = do
        -- `string` and `endOfInput` are the attoparsec parsers IHP uses for custom routes
        string "/healthz"
        endOfInput
        pure HealthAction
```
2. Add `HealthController` to `Web/Types.hs`
3. Implement controller in `Web/Controller/Health.hs`:
```haskell
instance Controller HealthController where
    action HealthAction = renderPlain "ok"
```
4. Wire into `Web/FrontController.hs`
5. Verify `curl http://localhost:8000/healthz` → `ok`
6. Commit
**Exit criteria:** `/healthz` returns `200 ok` in devenv.
---
### T05 — First Hspec integration test
```task
id: IRP-WP-0001-T05
status: todo
priority: high
```
Write the test *before* adding any Probes CRUD (test-first proof):
1. Add `Test/ProbeControllerSpec.hs`:
```haskell
module Test.ProbeControllerSpec where

import Test.Hspec
import IHP.Hspec

spec :: Spec
spec = describe "ProbeController" $ do
    it "GET /probes returns 200" $ do
        response <- get "/probes"
        response `shouldRespondWith` 200
```
2. Wire into `Test/Main.hs`
3. Run `test` in devenv — test should **fail** (no `/probes` route yet)
4. Implement minimal `ProbesController` with `index` action returning an empty list
5. Run `test` again — should pass
6. Commit both the test and the controller together
**Exit criteria:** `test` exits 0; test report shows ProbeController spec green.
---
### T06 — Production build on haskelseed
```task
id: IRP-WP-0001-T06
status: todo
priority: high
```
First `nix build .#docker` on haskelseed for the probe:
**Pre-build checklist:**
```bash
# 1. On haskelseed: verify libHSghc-9.10.3-5702.a is complete (should be 289,295,782 bytes)
wc -c /nix/store/ffg3yf2ypnbz3hc31y7nglrkihz0if01-ghc-9.10.3/lib/ghc-9.10.3/lib/x86_64-linux-ghc-9.10.3/ghc-9.10.3-5702/libHSghc-9.10.3-5702.a
# If ~287 MB, apply archive patch before proceeding (see HaskellVibePrimer.md §Bug 2)
# 2. On the workstation, from the repo root: ensure source (flake.nix + tree) is on haskelseed
tar czf - . | ssh root@192.168.178.135 \
  'mkdir -p /root/ihp-railiance-probe && tar xzf - -C /root/ihp-railiance-probe'
# or: git push + git pull on haskelseed
```
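
The size check in step 1 can be turned into a pass/fail test. The following is a minimal sketch: `archive_is_complete` is a hypothetical helper, and the expected byte count is the figure quoted in the checklist above, which may change after a flake lock update.

```bash
# Expected size of the complete archive, taken from the checklist above.
EXPECTED_BYTES=289295782

# Succeeds (exit 0) when the file has exactly the expected size,
# fails with 1 on a size mismatch and with 2 when the file is missing.
archive_is_complete() {
  path="$1"
  expected="${2:-$EXPECTED_BYTES}"
  [ -f "$path" ] || return 2
  actual=$(wc -c < "$path")
  # $((...)) normalises any whitespace that wc may emit around the number.
  [ "$((actual))" -eq "$expected" ]
}
```

Called with the store path from step 1, a non-zero exit status signals that the archive patch from `HaskellVibePrimer.md` §Bug 2 is still needed.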
Build steps:
```bash
sshpass -p 'hcs26!x' ssh root@192.168.178.135 \
'cd /root/ihp-railiance-probe && nix build .#docker --log-format raw \
> /tmp/probe-build01.log 2>&1 &'
```
Monitor with `tail -f /tmp/probe-build01.log`; expect 30-50 min on first build (no cache).
**Exit criteria:** `result` symlink present on haskelseed; `nix log` shows no errors.
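
Because the build runs in the background and writes to a log file, a quick scan of that log helps confirm the exit criteria. The helper below is a sketch under one assumption: that Nix failures appear as lines starting with `error:` when output is redirected to a file. `build_log_clean` is a hypothetical name.

```bash
# Print any Nix error lines (with line numbers) from a build log.
# Exit 0 when none are found, 1 when errors exist, 2 when the log is unreadable.
build_log_clean() {
  log="$1"
  [ -r "$log" ] || return 2
  if grep -n '^error:' "$log"; then
    return 1
  fi
  return 0
}
```

On haskelseed this would be run as `build_log_clean /tmp/probe-build01.log` once the `result` symlink appears or the build process has exited.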
---
### T07 — Push OCI image to Gitea registry
```task
id: IRP-WP-0001-T07
status: todo
priority: medium
```
Push the built image to the Gitea container registry:
```bash
# --dest-tls-verify=false: the registry is served over plain HTTP (no TLS)
sshpass -p 'hcs26!x' ssh root@192.168.178.135 \
  'cd /root/ihp-railiance-probe && \
   SHA=$(git rev-parse --short HEAD) && \
   skopeo copy --dest-tls-verify=false docker-archive:result \
     docker://92.205.130.254:32166/coulomb/ihp-railiance-probe:$SHA'
```
Verify:
```bash
skopeo inspect --tls-verify=false docker://92.205.130.254:32166/coulomb/ihp-railiance-probe:<SHA>
```
**Exit criteria:** `skopeo inspect` succeeds; image visible in Gitea Packages UI.
---
### T08 — Helm chart
```task
id: IRP-WP-0001-T08
status: todo
priority: medium
```
Create a minimal Helm chart in `chart/`:
```
chart/
  Chart.yaml          # name: ihp-railiance-probe, version: 0.1.0
  values.yaml         # image.repository, image.tag, env vars
  templates/
    deployment.yaml   # single replica, port 8000, envFrom secretRef
    service.yaml      # ClusterIP, port 80 → 8000
    ingress.yaml      # Traefik IngressRoute or standard Ingress
    secret.yaml       # IHP_SESSION_SECRET, DATABASE_URL, IHP_BASEURL
```
Key `deployment.yaml` notes:
- Image: `{{ .Values.image.repository }}:{{ .Values.image.tag }}`
- Repository default: `92.205.130.254:32166/coulomb/ihp-railiance-probe`
- `imagePullPolicy: Always`
- Resource limits: `memory: 256Mi`, `cpu: 200m` (probe is small)
- Liveness probe: `GET /healthz` after 30s initialDelay
Commit the chart.
**Exit criteria:** `helm lint chart/` passes.
---
### T09 — k3s registry configuration on Railiance01
```task
id: IRP-WP-0001-T09
status: todo
priority: medium
```
Configure k3s to pull from the HTTP (non-TLS) Gitea registry:
```bash
ssh railiance01
sudo cat /etc/rancher/k3s/registries.yaml
# If not present or missing the mirror entry, add:
```
```yaml
mirrors:
  "92.205.130.254:32166":
    endpoint:
      - "http://92.205.130.254:32166"
```
```bash
sudo systemctl restart k3s
```
Verify: `sudo k3s crictl pull 92.205.130.254:32166/coulomb/ihp-railiance-probe:<SHA>`
**Exit criteria:** image pulls successfully on Railiance01.
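
The "if not present or missing the mirror entry" check can be scripted. This sketch uses two hypothetical helpers: one renders the mirror stanza for a given registry host, the other tests whether a file already contains an entry for it. The check is a plain text match, not a YAML parse, so it is a rough guard only.

```bash
# Print a registries.yaml mirror stanza that points the given host at plain HTTP.
render_registries_yaml() {
  host="$1"
  cat <<EOF
mirrors:
  "$host":
    endpoint:
      - "http://$host"
EOF
}

# Succeed if the file ($1) already contains a quoted key for the host ($2).
has_mirror() {
  grep -qF "\"$2\":" "$1"
}
```

On Railiance01 one might run `has_mirror /etc/rancher/k3s/registries.yaml 92.205.130.254:32166` first and only edit the file (then restart k3s) when it fails.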
---
### T10 — Deploy to Railiance01
```task
id: IRP-WP-0001-T10
status: todo
priority: medium
```
Deploy the probe to the `coulomb` namespace:
```bash
# Create namespace if not present
kubectl --context railiance01 create namespace coulomb --dry-run=client -o yaml | kubectl apply -f -
# Create/update secret
kubectl --context railiance01 -n coulomb create secret generic ihp-railiance-probe-env \
--from-literal=IHP_SESSION_SECRET="$(openssl rand -base64 32)" \
--from-literal=DATABASE_URL="postgresql://..." \
--from-literal=IHP_BASEURL="https://probe.coulomb.example" \
--dry-run=client -o yaml | kubectl apply -f -
# Deploy
helm --kube-context railiance01 upgrade --install ihp-railiance-probe ./chart \
--namespace coulomb \
--set image.tag=<SHA>
```
**Exit criteria:**
```bash
kubectl -n coulomb get pods | grep ihp-railiance-probe # Running
kubectl -n coulomb logs deploy/ihp-railiance-probe | tail -5 # IHP startup
curl http://<cluster-ip>/healthz # ok
```
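
The first exit check relies on reading `kubectl get pods` output by eye. A scripted version could look like the sketch below; `pods_running` is a hypothetical helper and it assumes the default column layout (`NAME READY STATUS RESTARTS AGE`).

```bash
# Read `kubectl get pods` output on stdin. Succeed only if at least one pod
# whose name starts with the given prefix exists and every such pod is Running.
pods_running() {
  awk -v prefix="$1" '
    index($1, prefix) == 1 { seen++; if ($3 != "Running") bad++ }
    END { exit !(seen > 0 && bad == 0) }
  '
}
```

Usage: `kubectl -n coulomb get pods | pods_running ihp-railiance-probe`. Other workloads in the shared `coulomb` namespace are ignored by the prefix match.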
---
### T11 — End-to-end smoke test
```task
id: IRP-WP-0001-T11
status: todo
priority: medium
```
Verify the full pipeline produced a live application:
1. `GET /healthz` → `200 ok` from outside the cluster (via Ingress or NodePort)
2. `GET /probes` → `200` (empty list, no crash)
3. No panic/crash in pod logs within 60 seconds of startup
4. Document the verified SHA and timestamp in a `PIPELINE_LOG.md` entry:
```
| 2026-05-02 | <SHA> | Build: haskelseed | Push: 92.205.130.254:32166 | Deploy: Railiance01 | Smoke: PASS |
```
**Exit criteria:** Checks 1-3 pass (two HTTP checks plus clean pod logs); log entry committed.
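
To keep `PIPELINE_LOG.md` rows uniform, the entry from step 4 can be generated rather than typed. A minimal sketch, with `pipeline_log_entry` as a hypothetical helper that fixes the build, push, and deploy columns to the nodes named in this workplan:

```bash
# Print one PIPELINE_LOG.md table row.
# Arguments: date (YYYY-MM-DD), short commit SHA, smoke result (e.g. PASS or FAIL).
pipeline_log_entry() {
  printf '| %s | %s | Build: haskelseed | Push: 92.205.130.254:32166 | Deploy: Railiance01 | Smoke: %s |\n' \
    "$1" "$2" "$3"
}
```

For example, `pipeline_log_entry "$(date +%F)" "$(git rev-parse --short HEAD)" PASS >> PIPELINE_LOG.md` appends a row for the current commit.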
---
### T12 — Gitea Actions CI (optional, Phase 2)
```task
id: IRP-WP-0001-T12
status: todo
priority: low
```
Automate the build → push → deploy pipeline via Gitea Actions:
1. Register haskelseed as a Gitea Actions runner:
```bash
# On haskelseed:
act_runner register --instance http://92.205.130.254:32166 --token <runner-token> --name haskelseed
act_runner daemon &
```
2. Create `.gitea/workflows/build-and-deploy.yml`:
```yaml
on: [push]
jobs:
  build:
    runs-on: haskelseed
    steps:
      - uses: actions/checkout@v3
      - run: nix build .#docker --log-format raw
      - run: |
          SHA=$(git rev-parse --short HEAD)
          skopeo copy --dest-tls-verify=false docker-archive:result \
            docker://92.205.130.254:32166/coulomb/ihp-railiance-probe:$SHA
      - run: |
          SHA=$(git rev-parse --short HEAD)
          helm upgrade --install ihp-railiance-probe ./chart \
            --namespace coulomb --set image.tag=$SHA
```
3. Trigger a push; verify pipeline runs end-to-end
**Exit criteria:** CI pipeline runs without manual intervention on each push to `main`.
---
## Exit Criteria Summary
| Task | Check | Status |
|------|-------|--------|
| T01 | flake.nix with overlay from inter-hub | todo |
| T02 | `devenv up` → IHP welcome page | todo |
| T03 | `probes` table in DB; code-gen passes | todo |
| T04 | `/healthz` returns `200 ok` | todo |
| T05 | Hspec `test` exits 0 | todo |
| T06 | `nix build .#docker` on haskelseed succeeds | todo |
| T07 | Image visible in Gitea registry | todo |
| T08 | `helm lint chart/` passes | todo |
| T09 | k3s can pull from HTTP registry | todo |
| T10 | Pod Running on Railiance01 | todo |
| T11 | Smoke tests pass; log entry committed | todo |
| T12 | CI pipeline automated (optional) | todo |