| id | type | title | domain | repo | status | owner | topic_slug | planning_priority | planning_order | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RAILIANCE-WP-0003 | workplan | Provision shared CNPG cluster apps-pg | railiance | railiance-platform | finished | codex | railiance | high | 3 | 2026-05-19 | 2026-05-19 | 665b3b9b-608a-4be4-84b6-dcb8261ff57b |
RAILIANCE-WP-0003 - Provision shared CNPG cluster apps-pg
Goal
Provision a new shared CloudNativePG cluster apps-pg in the
databases namespace that S5 application workloads can use to host
their own PostgreSQL databases — without each app forcing the creation
of a dedicated CNPG cluster.
This unblocks railiance-apps RAILIANCE-WP-0002 T04 (vergabe-teilnahme
needs a vergabe role + vergabe_db database) by establishing the
shared cluster and the governed onboarding contract future S5 apps adopt
by default.
Context
railiance-apps workplan RAILIANCE-WP-0002 (establish
vergabe-teilnahme on railiance01) found at T01 that the two existing
CNPG clusters in databases are app-dedicated:
| Cluster | PG | Owner app |
|---|---|---|
| gitea-db | 18 | gitea |
| net-kingdom-pg | 16 | net-kingdom |
Decision D-01 (resolved 2026-05-18, bernd) selected option D:
provision a new shared cluster apps-pg rather than create a third
dedicated cluster (option A) or retrofit an existing app cluster (B/C).
A coordination message was sent from railiance-apps to
railiance-platform requesting this work; this workplan is the
response.
Placement in the Railiance Tooling Set
S3 owns CNPG Cluster CRs (per ADR-003 and the pattern already
established by helm/gitea-db-cluster.yaml). CNPG 1.28 has standalone
Database CRs, but PostgreSQL role lifecycle is managed through the
target Cluster spec's .spec.managed.roles stanza or through a
controlled operator-run SQL workflow. The shared-cluster contract must
therefore make role onboarding explicit; S5 repos should not assume a
standalone CNPG Role CR exists.
| Concern | Owner repo | Scope |
|---|---|---|
| Cluster apps-pg CR, shared NetworkPolicies, bootstrap secret, baseline docs | railiance-platform | this workplan |
| Per-app database request and application DSN wiring | each S5 repo | not here |
| Per-app PostgreSQL role + credential provisioning | coordinated | documented here; platform-administered until OpenBao/dedicated automation exists |
| Per-app runtime Secret in the consumer namespace | each S5 repo | not here |
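As an illustration of the cluster-scoped role path described above, a platform-owned managed-role entry might look like the fragment below. This is a minimal sketch, assuming `.spec.managed.roles` is the chosen mechanism; the role name `example_app` and the Secret name `example-app-credentials` are hypothetical and do not come from this workplan.

```yaml
# Fragment of the apps-pg Cluster spec (sketch, not the committed manifest).
spec:
  managed:
    roles:
      - name: example_app            # hypothetical per-app role
        ensure: present              # create the role if it is missing
        login: true                  # the app connects with this role
        passwordSecret:
          name: example-app-credentials   # hypothetical Secret in the databases namespace
```

Because the stanza lives in the Cluster spec, adding a role this way is a change to `helm/apps-pg-cluster.yaml` in the platform repo, which is why role onboarding stays platform-administered.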
Current Evidence
- `kubectl get crd | grep cnpg` confirms CNPG 1.28.1 with the `databases.postgresql.cnpg.io` CRD, so databases can be represented declaratively.
- CNPG role management is cluster-scoped via `.spec.managed.roles`; no standalone CNPG `Role` CR is available for app repos to apply.
- Operator image: `ghcr.io/cloudnative-pg/cloudnative-pg:1.28.1` (`cnpg-system` namespace).
- The `databases` namespace has a default-deny-all NetworkPolicy; each CNPG cluster therefore needs its own NetworkPolicy triplet (egress-to-kube-api, ingress-from-cnpg-operator, ingress-from-app-ns). The pattern is visible in `helm/gitea-db-networkpolicies.yaml`.
- `helm/apps-pg-cluster.yaml`, `helm/apps-pg-networkpolicies.yaml`, `helm/apps-pg-secret.sops.yaml.template`, and `docs/apps-pg.md` are present in the repo.
- Coordination message id: `768c18f4-8785-4108-a900-fa117eb8778f` (state-hub thread).
Implementation Notes
Completed on 2026-05-19.
- CNPG operator is `ghcr.io/cloudnative-pg/cloudnative-pg:1.28.1`.
- `clusters.postgresql.cnpg.io` and `databases.postgresql.cnpg.io` CRDs are present; `roles.postgresql.cnpg.io` is not present, so role onboarding remains platform-administered through managed roles or a controlled SQL workflow.
- `local-path` is the default StorageClass. The single K3s node reports no memory, disk, or PID pressure; allocatable ephemeral storage is about 97.7 GB and memory is about 3.8 GiB. Existing CNPG PVC footprint before `apps-pg` was two 10Gi PVCs (`gitea-db-1`, `net-kingdom-pg-1`).
- `databases` exists with `default-deny-all`; `cnpg-system` has the required `kubernetes.io/metadata.name=cnpg-system` namespace label.
- The live CNPG CRD rejected `spec.postgresql.version`; the deployed `apps-pg` manifest therefore pins PostgreSQL 16 with `imageName: ghcr.io/cloudnative-pg/postgresql:16`.
- `apps-pg` is deployed in `databases`, reports `Cluster in healthy state`, and has primary `apps-pg-1`.
- Services `apps-pg-rw` and `apps-pg-ro` exist. With one instance, `apps-pg-ro` is present but has no replica endpoint until HA is added.
- A disposable namespace labeled `railiance.io/postgres-client=apps-pg` successfully connected to `apps-pg-rw.databases.svc.cluster.local:5432/apps_meta` as `apps_admin`; the temporary namespace and copied smoke-test secret were deleted immediately after the check.
Safety Contract
- Do not commit plaintext credentials. The bootstrap secret is created once, manually, with `kubectl create secret`; only a SOPS-encrypted template is committed, as `helm/apps-pg-secret.sops.yaml.template`.
- Do not expose `apps_admin` to S5 applications. It is a platform bootstrap/smoke-test role, not a runtime credential.
- Do not collocate non-app data (Gitea, net-kingdom) into `apps-pg`. This cluster is for S5 application DBs.
- Preserve the default-deny NetworkPolicy posture in `databases`; only allow ingress from namespaces that have a registered consumer.
- Do not advertise self-service role creation until the role provisioning mechanism is explicit. CNPG `Database` CRs still require their owner role to exist.
- Initial sizing is conservative (1 instance, 10Gi) to match the existing per-cluster footprint. Resize is a follow-up workplan.
- Cluster name `apps-pg` is locked once published: renaming changes every consumer DSN.
Target State
- `kubectl get cluster apps-pg -n databases` reports `Cluster in healthy state` with the primary `apps-pg-1`.
- `kubectl get svc apps-pg-rw apps-pg-ro -n databases` returns both Services.
- NetworkPolicies for `apps-pg` mirror the `gitea-db` triplet.
- The `make` targets `apps-pg-deploy`, `apps-pg-status`, `apps-pg-shell`, and `apps-pg-logs` exist and work.
- Bootstrap admin role (`apps_admin`) and meta database (`apps_meta`) exist for cluster health probes and to anchor the bootstrap; the cluster is otherwise empty of per-app data.
- Documentation explains how an S5 consumer registers a new database, including the current CNPG boundary: the `Database` CR is separate, but role lifecycle is cluster-scoped and therefore governed by the platform contract.
- `railiance-apps` is notified via the hub thread; their `RAILIANCE-WP-0002 T04` can proceed using the documented onboarding path.
Tasks
T01 — Inventory and capacity check
id: RAILIANCE-WP-0003-T01
status: done
priority: high
state_hub_task_id: "37843f2f-0022-4725-ab07-29f6ae4c1749"
Confirm the substrate before adding a new cluster.
Checks:
- CNPG operator version (≥ 1.28.x required for the `Database` CR consumer pattern).
- Role/database API boundary: `Database` CR is present; role lifecycle is `.spec.managed.roles` or controlled SQL, not a separate `Role` CR.
- Node-level disk space available for an additional 10Gi PVC (`local-path` storage class is the active default).
- Existing cluster footprint (`gitea-db`, `net-kingdom-pg`) and any current resource pressure.
- That the `databases` namespace already exists and has its default-deny NetworkPolicy in place.
- That the `cnpg-system` namespace label `kubernetes.io/metadata.name=cnpg-system` is set (required by the ingress-from-operator NetworkPolicy).
Done when: the implementation notes record CNPG version, available PVC capacity, the chosen role onboarding mechanism, and any pre-condition gaps.
T02 — Create bootstrap credential secret
id: RAILIANCE-WP-0003-T02
status: done
priority: high
state_hub_task_id: "b4777198-e42f-4ca1-b562-a595559fdf08"
Mint the one-time bootstrap secret that CNPG uses to create the initial
apps_admin role.
Steps:
APPS_PG_PW=$(openssl rand -base64 32)
kubectl create secret generic apps-pg-credentials \
--namespace databases \
--from-literal=username=apps_admin \
--from-literal=password="$APPS_PG_PW"
Then commit a SOPS-encrypted template:
- `helm/apps-pg-secret.sops.yaml.template`: encrypted form for declarative reapply; do not commit the plaintext password.
The bootstrap role is intentionally not a consumer role. Per-app runtime roles are created later through the onboarding mechanism documented in T06; until dedicated automation exists, that mechanism is platform-administered.
Done when: the secret exists in the cluster and an encrypted template is committed.
T03 — Add the CNPG Cluster manifest
id: RAILIANCE-WP-0003-T03
status: done
priority: high
state_hub_task_id: "0840583d-23b2-4b93-9002-7977e6896a12"
Add helm/apps-pg-cluster.yaml modeled on helm/gitea-db-cluster.yaml.
Do not add app-specific roles or databases to the baseline cluster
manifest unless T01 explicitly chooses a platform-owned managed-role
stanza as the interim onboarding path for the first consumer.
Shape:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: apps-pg
  namespace: databases
  labels:
    app.kubernetes.io/name: apps-pg
    app.kubernetes.io/component: database
    app.kubernetes.io/managed-by: manual
    railiance.io/layer: s3-platform
    railiance.io/role: shared-apps-database
spec:
  instances: 1  # bump when node RAM > 8GB
  imageName: ghcr.io/cloudnative-pg/postgresql:16
  storage:
    size: 10Gi
  bootstrap:
    initdb:
      database: apps_meta
      owner: apps_admin
      secret:
        name: apps-pg-credentials
Note: PG version is 16 (matches vergabe-teilnahme's minimum and the
existing net-kingdom-pg). Bumping to 17/18 is a separate decision.
Done when: the manifest is committed and `kubectl apply --dry-run=server`
validates it against the cluster.
T04 — Add NetworkPolicies for apps-pg
id: RAILIANCE-WP-0003-T04
status: done
priority: high
state_hub_task_id: "7237f0f2-28e6-4eee-981b-06d0115cb0d1"
Add helm/apps-pg-networkpolicies.yaml modeled on the gitea-db triplet
but parameterised for the apps consumer namespace pattern.
Three policies (all in `databases`, all selecting `cnpg.io/cluster: apps-pg`):
- `allow-egress-kube-api-apps-pg`: egress to TCP/6443.
- `allow-ingress-from-cnpg-operator-apps-pg`: ingress from `namespaceSelector kubernetes.io/metadata.name=cnpg-system` on TCP ports 5432 / 8000 / 9187.
- `allow-ingress-from-app-namespaces-apps-pg`: ingress on TCP/5432 from any namespace carrying the label `railiance.io/postgres-client=apps-pg`. (Each consuming app namespace adds this label; this avoids hard-coding a namespace list in the platform repo.)
The label-based selector is the meaningful difference from gitea-db,
which hard-codes default. The shared cluster cannot know its
consumer namespaces in advance, so it expects a positive opt-in label.
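The label-based opt-in can be sketched as the policy below. It is reconstructed from the description in this task and is not a copy of the committed `helm/apps-pg-networkpolicies.yaml`; field layout is an assumption, while the names, labels, and port come from the text above.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-from-app-namespaces-apps-pg
  namespace: databases
spec:
  # Applies only to the pods of the shared cluster.
  podSelector:
    matchLabels:
      cnpg.io/cluster: apps-pg
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Any namespace that has opted in by carrying the consumer label.
        - namespaceSelector:
            matchLabels:
              railiance.io/postgres-client: apps-pg
      ports:
        - protocol: TCP
          port: 5432
```

Selecting on a namespace label rather than a namespace name means new consumers are admitted by labeling their own namespace, with no edit to the platform repo.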
Done when: the policies are committed and applied; consumer namespaces
can connect after applying the railiance.io/postgres-client=apps-pg
label.
T05 — Makefile targets, deploy, verify
id: RAILIANCE-WP-0003-T05
status: done
priority: high
state_hub_task_id: "dc346e73-eadf-4eaa-8296-358df262f648"
Add targets that mirror the db-* (gitea-db) family:
apps-pg-deploy: ## Apply shared apps-pg CNPG Cluster + NetworkPolicies
	$(KUBECTL) apply -f helm/apps-pg-cluster.yaml
	$(KUBECTL) apply -f helm/apps-pg-networkpolicies.yaml
apps-pg-status: ## Show apps-pg CNPG cluster health
	$(KUBECTL) cnpg status apps-pg -n databases 2>/dev/null || \
	$(KUBECTL) get cluster apps-pg -n databases -o wide
apps-pg-shell: ## Open psql shell on apps-pg primary
	$(KUBECTL) cnpg psql apps-pg -n databases -- -U apps_admin apps_meta
apps-pg-logs: ## Tail apps-pg primary logs
	$(KUBECTL) logs -n databases -l cnpg.io/cluster=apps-pg -f --tail=50
Then deploy and wait for the cluster to converge:
make apps-pg-deploy
kubectl wait --for=condition=Ready cluster/apps-pg -n databases --timeout=5m
Smoke checks:
- `cnpg status` reports `Cluster in healthy state`.
- Services `apps-pg-rw` and `apps-pg-ro` exist.
- From a disposable pod in a temporary namespace labeled `railiance.io/postgres-client=apps-pg`, a platform-operated test connection to `apps-pg-rw.databases:5432/apps_meta` succeeds. Delete the temporary namespace and any copied test secret immediately after the check; do not place `apps_admin` in an application namespace.
Done when: the smoke checks pass.
T06 — Reply to railiance-apps, document the consumer contract
id: RAILIANCE-WP-0003-T06
status: done
priority: medium
state_hub_task_id: "8b78934d-0a3c-413c-a66f-295092282547"
Notify the requester and capture the pattern.
Steps:
- Reply to thread `768c18f4-8785-4108-a900-fa117eb8778f` through the State Hub `/messages/` REST API with this workplan's id and the cluster's connection details. Do not send bootstrap credentials.
- Add `docs/apps-pg.md` with:
  - Cluster identity and connection endpoints.
  - The per-app onboarding recipe: (a) request/approve a per-app role, (b) provision the backing role and credential through the chosen platform mechanism, (c) create the CNPG `Database` CR in the `databases` namespace with `spec.cluster.name: apps-pg` and `spec.owner` set to the approved role, (d) label the consumer namespace `railiance.io/postgres-client=apps-pg`, (e) publish or mirror the runtime Secret into the consumer namespace, and (f) wire the DSN into the application Helm values.
  - The CNPG 1.28 boundary: `Database` is standalone; role management is not a standalone `Role` CR and must follow the platform contract.
  - Backup posture (when the cluster is added to the existing platform backup process) and the resize / replicate roadmap.
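Steps (c) and (d) of the onboarding recipe could look roughly like the two manifests below. This is a hedged sketch: the database name `example_db`, the CR name `example-db`, the role `example_app`, and the namespace `example-app` are all hypothetical placeholders, and the owner role is assumed to have been provisioned already in step (b).

```yaml
# (c) Database CR in the databases namespace, owned by an already-approved role.
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: example-db            # hypothetical CR name
  namespace: databases
spec:
  name: example_db            # hypothetical PostgreSQL database name
  owner: example_app          # hypothetical role; must exist before this is applied
  cluster:
    name: apps-pg
---
# (d) Consumer namespace carrying the opt-in label expected by the NetworkPolicy.
apiVersion: v1
kind: Namespace
metadata:
  name: example-app           # hypothetical consumer namespace
  labels:
    railiance.io/postgres-client: apps-pg
```

The two manifests belong to different owners under this contract: the `Database` CR lives in `databases`, while the namespace label is applied by the consuming S5 repo.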
Done when: the message is replied to and docs/apps-pg.md is
committed.
Completion Criteria
This workplan is complete when:
- `apps-pg` reports healthy in the `databases` namespace.
- NetworkPolicies enforce the default-deny posture with label-based consumer opt-in.
- Makefile targets work end-to-end.
- `railiance-apps` `RAILIANCE-WP-0002 T04` is unblocked and explicitly acknowledged via the hub thread.
- `docs/apps-pg.md` explains the consumer onboarding contract, including the CNPG role/database boundary.
Notes
- This intentionally does not hard-code the `vergabe` role or `vergabe_db` into the shared cluster baseline. The consumer onboarding doc must describe the follow-up request/manifest needed for `railiance-apps` so the platform layer stays generic until an app explicitly registers.
- Backup inclusion of `apps-pg` is a follow-up. The existing `make backup` target only covers the legacy PostgreSQL-HA setup; CNPG backup configuration is its own workplan.
- A second replica (HA) and a connection pooler (PgBouncer / CNPG Pooler) are deferred. The cluster spec leaves room for both; enable them when node capacity allows.