First workplan for railiance-platform (S3). Separates platform services from the S2 cluster runtime layer per ADR-003:
- T01: standalone PostgreSQL HA Helm chart (platform namespace)
- T02: migrate Gitea to external DB, remove subchart coupling
- T03: relocate Gitea Helm values to railiance-apps (S5)
- T04: smoke + HA failover tests (D3 policy)
- T05: relocate railiance-backup tool from S2 to S3
- T06: standalone Valkey deployment (enables Zulip reuse)
Workstream: e4ec133c-7cb9-43c6-95f0-50d6591f13d7
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| id | type | title | domain | repo | status | owner | topic_slug | state_hub_workstream_id | created | updated |
|---|---|---|---|---|---|---|---|---|---|---|
| RAIL-PL-WP-0001 | workplan | S3 Platform Services Baseline | railiance | railiance-platform | active | railiance | railiance | e4ec133c-7cb9-43c6-95f0-50d6591f13d7 | 2026-03-11 | 2026-03-11 |
S3 Platform Services Baseline
Goal
Establish railiance-platform (S3) as a reproducible, OAS-compliant platform
layer. Currently, PostgreSQL HA and Valkey are deployed implicitly as subcharts
of the Gitea Helm release in S2 (railiance-cluster). This violates the OAS
boundary rule: S3 owns platform services; S2 owns only the cluster runtime.
This workplan makes S3 a proper, standalone layer that S5 applications can depend on.
Scope
| Concern | Current location | After this workplan |
|---|---|---|
| PostgreSQL HA (repmgr + pgpool) | Gitea subchart in S2 | Standalone Helm release in S3 |
| Valkey (Redis-compatible cache) | Gitea subchart in S2 | Standalone Helm release in S3 |
| Gitea Helm values | railiance-cluster/helm/ (S2) | railiance-apps/helm/ (S5) |
| railiance-backup tool | railiance-cluster/tools/cmd/ (S2) | railiance-platform/tools/cmd/ (S3) |
Pre-conditions
- railiance-cluster converged: k3s running, Helm available (make smoke passes)
- Active backup on Nextcloud before any migration step
- SSH tunnel active for State Hub MCP access
Boundary rule reminder (ADR-003)
S3 owns shared platform services. S5 owns application deployments. S2 must not manage database or cache services directly.
Tasks
T01 — Codify standalone PostgreSQL HA Helm chart
id: RAIL-PL-WP-0001-T01
state_hub_task_id: f5af95bf-3d2d-458a-b695-666d4dc2dc99
status: todo
priority: high
Write helm/postgresql-ha-values.sops.yaml using the Bitnami postgresql-ha
chart. Capture the values currently baked into the Gitea subchart, including
the pgpool-password fix from RAIL-BS-WP-0003:
# helm/postgresql-ha-values.sops.yaml (schema only — encrypt secrets with SOPS)
postgresql:
  replicaCount: 3
  password: ENC[...]
  postgresPassword: ENC[...]
  repmgrPassword: ENC[...]
pgpool:
  replicaCount: 1
  adminPassword: ENC[...]
  # pgpool-password must be set — see RAIL-BS-WP-0003
  pgpoolPassword: ENC[...]
persistence:
  enabled: true
  size: 10Gi
Add a make target:
pg-deploy: ## Deploy standalone PostgreSQL HA to cluster
	helm upgrade --install postgresql-ha bitnami/postgresql-ha \
	  -f helm/postgresql-ha-values.yaml --namespace platform --create-namespace

pg-status: ## Check PostgreSQL HA pod status
	kubectl get pods -n platform -l app.kubernetes.io/name=postgresql-ha
Add docs/postgresql-ha.md documenting:
- Chart version pinned
- Connection string pattern for apps
- How to create a new database for an app
- How to rotate passwords (SOPS re-encrypt → helm upgrade)
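As an illustration of the connection string pattern the doc could record, here is a minimal sketch. It assumes apps connect through the pgpool service address used in T02; the helper name `pg_conn_uri` and its argument order are hypothetical, not part of the repo.

```shell
#!/bin/sh
# Hypothetical helper: print a libpq-style connection URI for an app database.
# Host and port follow the pgpool service address referenced in T02.
PG_HOST="postgresql-ha-pgpool.platform.svc.cluster.local"
PG_PORT=5432

pg_conn_uri() {
    # $1 = database user, $2 = database name
    printf 'postgresql://%s@%s:%s/%s\n' "$1" "$PG_HOST" "$PG_PORT" "$2"
}

# Example: URI for the Gitea database created in T02
pg_conn_uri gitea gitea
```

The password is left out of the URI on purpose; it would normally come from the SOPS-managed secret, for example via the standard `PGPASSWORD` environment variable.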
Done when: make pg-deploy succeeds; three postgresql-ha pods + pgpool
Running in the platform namespace; make smoke still passes.
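The "Done when" condition can be checked mechanically. The sketch below counts Running pods from `kubectl get pods --no-headers` output, where STATUS is the third column. The function name `count_running` is hypothetical, and the pod name prefixes in the usage comment are assumptions based on the release name postgresql-ha.

```shell
#!/bin/sh
# count_running PREFIX
# Reads `kubectl get pods --no-headers` output on stdin and prints how many
# pods whose name starts with PREFIX have STATUS (column 3) equal to Running.
count_running() {
    awk -v prefix="$1" \
        'index($1, prefix) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (assumed pod name prefixes, not run here):
#   kubectl get pods -n platform --no-headers | count_running postgresql-ha-postgresql-
#   kubectl get pods -n platform --no-headers | count_running postgresql-ha-pgpool-
```

Under those assumptions, the first command should print 3 and the second 1 once T01 is complete.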
T02 — Migrate Gitea to use external PostgreSQL
id: RAIL-PL-WP-0001-T02
state_hub_task_id: c1073011-935a-4c1a-9a9f-dc4db1fc3e88
status: todo
priority: high
Pre-condition: T01 done and postgresql-ha healthy in platform namespace.
Steps:
- Backup first: make backup in railiance-cluster; verify upload to Nextcloud.
- Create a gitea database and user on the new standalone cluster. Pass one -c per statement: psql sends a multi-statement -c string as a single transaction, and CREATE DATABASE cannot run inside a transaction block.
    kubectl exec -n platform postgresql-ha-postgresql-0 -- \
      psql -U postgres \
        -c "CREATE DATABASE gitea;" \
        -c "CREATE USER gitea WITH PASSWORD '...';" \
        -c "GRANT ALL ON DATABASE gitea TO gitea;"
- Migrate data: pg_dump from old DB → pg_restore into new cluster.
- Update helm/gitea-values.sops.yaml to disable the subchart and point to the external DB:
    postgresql-ha:
      enabled: false
    externalDatabase:
      host: postgresql-ha-pgpool.platform.svc.cluster.local
      port: 5432
      database: gitea
      username: gitea
      password: ENC[...]
- helm upgrade gitea; verify Gitea operational.
Done when: Gitea login works; postgresql-ha subchart pods are gone;
all data intact.
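The data migration step is terse, so here is one possible shape for it, written as a dry run that only prints the pipeline. The old database's namespace, pod, user, and database name are not stated in this workplan and are passed in as arguments. The function name `print_migration_cmd` is hypothetical. Restoring as the gitea user with `--no-owner` is an assumption intended to leave the restored objects owned by gitea; authentication inside the containers is not handled here.

```shell
#!/bin/sh
# print_migration_cmd OLD_NS OLD_POD OLD_USER OLD_DB
# Prints (does not execute) a pg_dump | pg_restore pipeline.
# pg_dump -Fc writes a custom-format archive to stdout; pg_restore reads the
# archive from stdin when no file name is given.
print_migration_cmd() {
    old_ns=$1
    old_pod=$2
    old_user=$3
    old_db=$4
    printf 'kubectl exec -n %s %s -- pg_dump -U %s -Fc %s | ' \
        "$old_ns" "$old_pod" "$old_user" "$old_db"
    printf 'kubectl exec -i -n platform postgresql-ha-postgresql-0 -- '
    printf 'pg_restore -U gitea -d gitea --no-owner\n'
}

# Review the printed command before running it by hand, for example:
#   print_migration_cmd <old-namespace> <old-pod> <old-user> <old-db>
```

Printing rather than executing keeps the step reviewable, which fits the backup-first ordering of the steps above.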
T03 — Relocate Gitea Helm deployment to railiance-apps (S5)
id: RAIL-PL-WP-0001-T03
state_hub_task_id: a820cd02-0f30-4488-abf1-897120f1fbc1
status: todo
priority: medium
Pre-condition: T02 done.
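# In railiance-cluster (git mv cannot cross repository boundaries):
mv helm/gitea-values.sops.yaml ../railiance-apps/helm/
git rm helm/gitea-values.sops.yaml
git -C ../railiance-apps add helm/gitea-values.sops.yaml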
Add to railiance-apps/Makefile:
gitea-deploy: ## Deploy / upgrade Gitea
	helm upgrade --install gitea gitea-charts/gitea \
	  -f helm/gitea-values.yaml --namespace apps --create-namespace

gitea-status: ## Check Gitea pod status
	kubectl get pods -n apps -l app.kubernetes.io/name=gitea
Add tombstone in railiance-cluster/helm/MOVED.md:
gitea-values.sops.yaml moved to railiance-apps/helm/ (2026-03-11, RAIL-PL-WP-0001-T03)
Update railiance-cluster/tests/smoke_kube.sh and tests/test_ha_failover.sh
to reference the new namespace (apps) if Gitea moves namespaces.
Done when: gitea-values.sops.yaml is in railiance-apps/helm/; Gitea
still operational; tombstone in place.
T04 — Smoke + HA failover tests pass post-migration
id: RAIL-PL-WP-0001-T04
state_hub_task_id: 8df4774c-5251-4c85-be57-61b903be82ee
status: todo
priority: high
Per Decision D3: no HA deployment is complete until the failover test exits 0.
# From railiance-cluster:
make smoke # all assertions green
make test-ha-failover GITEA_URL=https://<gitea-hostname>
Expected: pgpool recovers cleanly after primary pod deletion; Gitea login remains available within the recovery window.
Done when: both scripts exit 0 against the migrated live cluster.
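The "recovery window" expectation amounts to polling a health check until it succeeds or a deadline passes. The sketch below shows that pattern in general form. It is not the content of tests/test_ha_failover.sh; the function name `wait_for` is hypothetical, and the 120-second window in the usage comment is a placeholder.

```shell
#!/bin/sh
# wait_for TIMEOUT_S INTERVAL_S CMD [ARGS...]
# Re-runs CMD every INTERVAL_S seconds until it succeeds.
# Returns 0 on the first success, 1 once TIMEOUT_S would be exceeded.
wait_for() {
    timeout_s=$1
    interval_s=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        elapsed=$((elapsed + interval_s))
        if [ "$elapsed" -gt "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
    done
    return 0
}

# Usage after deleting the primary pod (placeholder window of 120 s):
#   wait_for 120 5 curl -fsS -o /dev/null "$GITEA_URL"
```

A non-zero return maps directly onto Decision D3: the failover test fails if the service does not come back within the window.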
T05 — Relocate railiance-backup tool from S2 to S3
id: RAIL-PL-WP-0001-T05
state_hub_task_id: 231f6f8a-97a0-4aa0-8318-8e4361af67a3
status: todo
priority: medium
As flagged in RAIL-HO-WP-0003 T04: backup is a platform concern (S3), not a cluster runtime concern (S2).
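# git mv cannot cross repository boundaries: move the file, then stage each repo
mkdir -p ~/railiance-platform/tools/cmd
mv ~/railiance-cluster/tools/cmd/railiance-backup \
  ~/railiance-platform/tools/cmd/railiance-backup
git -C ~/railiance-cluster rm -r tools/cmd/railiance-backup
git -C ~/railiance-platform add tools/cmd/railiance-backup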
Update railiance-platform/Makefile:
backup: ## Backup platform services (PostgreSQL, Valkey) — age-encrypted
	sudo tools/cmd/railiance-backup
Add tombstone stub in railiance-cluster/tools/cmd/:
# railiance-backup — MOVED to railiance-platform/tools/cmd/ (RAIL-PL-WP-0001-T05)
Update railiance-cluster/Makefile backup target to delegate:
backup: ## Backup cluster runtime — delegates platform backup to railiance-platform
	@echo "Cluster backup (etcd + kubeconfig):"
	sudo tools/cmd/railiance-backup-s2
	@echo "Platform backup (PostgreSQL, Valkey): run 'make backup' in railiance-platform"
Done when: make backup in railiance-platform runs the platform backup;
railiance-cluster backup still covers etcd/kubeconfig; no duplication.
T06 — Codify Valkey as standalone S3 asset
id: RAIL-PL-WP-0001-T06
state_hub_task_id: 20899c81-2b24-4d70-ad02-f6a1383b6811
status: todo
priority: low
Valkey is currently deployed as a Gitea subchart. Once T02 removes the subchart bundle, Valkey must be deployed independently so Gitea and future apps (Zulip) can use it.
Write helm/valkey-values.sops.yaml:
# Bitnami Valkey chart
auth:
  enabled: true
  password: ENC[...]
replica:
  replicaCount: 1
  persistence:
    enabled: true
    size: 2Gi
Add make targets:
valkey-deploy: ## Deploy Valkey (Redis-compatible) to platform namespace
	helm upgrade --install valkey bitnami/valkey \
	  -f helm/valkey-values.yaml --namespace platform

valkey-status: ## Check Valkey pod status
	kubectl get pods -n platform -l app.kubernetes.io/name=valkey
Done when: make valkey-deploy succeeds; Valkey Running in platform
namespace; Gitea reconnected to new Valkey endpoint.
References
- OAS Standard: canon/standards/orthogonal-architecture_v1.0.md
- ADR-003 (boundary rule): railiance-infra/docs/adr/ADR-003-railiance-5repo-stack-architecture.md
- RAIL-BS-WP-0003 (pgpool fix): railiance-cluster/workplans/RAIL-BS-WP-0003-pgpool-ha-failover-fix.md
- RAIL-HO-WP-0003 T04 (relocation table): railiance-infra/workplans/RAIL-HO-WP-0003-5repo-stack-restructure.md
- Decision D3 (HA testing policy): railiance-cluster/DECISIONS.md
- State Hub workstream: e4ec133c-7cb9-43c6-95f0-50d6591f13d7