feat(sso-mfa): T03 PostgreSQL manifests (NK-WP-0001-T03)

CloudNativePG Cluster CR (net-kingdom-pg, PostgreSQL 16) with two
application databases: keycloak_db (owner: keycloak) and privacyidea_db
(owner: privacyidea). Passwords managed continuously via managed.roles.
WAL archiving section stubbed and commented; activate when object storage
is available. ScheduledBackup CR included (daily 02:00 UTC, 7d retention).

Also: sync workplan status for T01 (Phase 0a done), T02 (manifests done),
T03 (manifests done, restore drill pending); close NK-WP-0002.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-05 09:22:13 +01:00
parent 2ebb231f19
commit 8929bf65bc
7 changed files with 533 additions and 7 deletions

View File

@@ -0,0 +1,143 @@
# T03 — PostgreSQL (CloudNativePG)
Phase 2 of NK-WP-0001: CloudNativePG cluster with `keycloak_db` and `privacyidea_db`.
## Prerequisites
- T02 complete: `databases` namespace and NetworkPolicies applied
- `kubectl` configured with cluster access
- `gen-secrets.sh` run and output stored in KeePassXC
## Apply order
### 1. Install CloudNativePG operator
```bash
helm repo add cnpg https://cloudnative-pg.github.io/charts
helm repo update
helm install cnpg cnpg/cloudnative-pg \
--namespace cnpg-system \
--create-namespace \
--wait
```
Verify:
```bash
kubectl get pods -n cnpg-system
kubectl get crd clusters.postgresql.cnpg.io
```
### 2. Create K8s Secrets
```bash
# From the postgresql/ directory:
chmod +x create-secrets.sh
./create-secrets.sh ../../bootstrap/secrets
```
Alternatively, if you've already shredded the generated files, reconstruct from KeePassXC:
```bash
kubectl create secret generic net-kingdom-pg-keycloak-app \
--namespace=databases \
--from-literal=username=keycloak \
--from-literal=password='<PG_KEYCLOAK_PASSWORD from KeePassXC>'
kubectl create secret generic net-kingdom-pg-privacyidea-app \
--namespace=databases \
--from-literal=username=privacyidea \
--from-literal=password='<PI_DB_PASSWORD from KeePassXC>'
```
### 3. Deploy the cluster
```bash
kubectl apply -f cluster.yaml
```
Wait for cluster to become ready (this provisions PVCs and runs initdb — allow 23 minutes):
```bash
kubectl wait --for=condition=Ready cluster/net-kingdom-pg \
-n databases --timeout=300s
```
Check status:
```bash
kubectl get cluster -n databases
kubectl describe cluster net-kingdom-pg -n databases
kubectl get pods -n databases
```
### 4. Verify databases and users
```bash
# Connect as superuser to verify setup
kubectl exec -it -n databases \
$(kubectl get pod -n databases -l cnpg.io/cluster=net-kingdom-pg,role=primary -o name) \
-- psql -U postgres
# In psql:
\l -- list databases
\du -- list roles
\q
```
Expected output: `keycloak_db`, `privacyidea_db`, roles `keycloak` and `privacyidea`.
### 5. Configure backup (when object storage is available)
Uncomment the `backup:` section in `cluster.yaml` and fill in the object store endpoint.
Create the S3 credentials secret:
```bash
kubectl create secret generic net-kingdom-pg-backup-s3 \
--namespace=databases \
--from-literal=ACCESS_KEY_ID='<access key>' \
--from-literal=SECRET_ACCESS_KEY='<secret key>'
```
Apply the updated cluster.yaml, then:
```bash
kubectl apply -f scheduled-backup.yaml
```
### 6. Run the restore drill
**Mandatory before marking T03 done.**
```bash
# Trigger a manual backup first
kubectl cnpg backup net-kingdom-pg -n databases
# Wait for backup to complete
kubectl get backup -n databases --watch
# Restore to a new cluster to verify
# (See CloudNativePG docs: kubectl cnpg restore or Cluster bootstrap.recovery)
```
### 7. Run the full verification script
```bash
chmod +x ../verify-t03.sh
../verify-t03.sh
```
## Secrets reference
| Secret name | Keys | Purpose |
|---|---|---|
| `net-kingdom-pg-keycloak-app` | `username`, `password` | Keycloak DB user (also bootstrap owner) |
| `net-kingdom-pg-privacyidea-app` | `username`, `password` | privacyIDEA DB user |
| `net-kingdom-pg-backup-s3` | `ACCESS_KEY_ID`, `SECRET_ACCESS_KEY` | Object store backup (optional until backup enabled) |
| `net-kingdom-pg-superuser` | auto-created by CNPG | PostgreSQL superuser (operator-managed) |
| `net-kingdom-pg-app` | auto-created by CNPG | Initial app user (unused — we use named secrets) |
## Notes
- `cnpg.io/cluster: net-kingdom-pg` label on pods is what the NetworkPolicies in T02 target.
Do not rename the cluster without also updating netpol-databases.yaml.
- `instances: 1` is intentional for dev/staging. Change to 3 before ThreePhoenix HA production
deployment (requires at least 3 schedulable nodes).
- Password rotation: update the K8s Secret values and CNPG's managed.roles reconciler will
apply the change at the next reconciliation cycle (within seconds).

View File

@@ -0,0 +1,102 @@
# CloudNativePG Cluster — net-kingdom-pg
#
# Creates a PostgreSQL 16 cluster with two application databases:
# keycloak_db (owner: keycloak)
# privacyidea_db (owner: privacyidea)
#
# Prerequisites:
# - CloudNativePG operator installed (see README.md)
# - K8s Secrets created (see create-secrets.sh)
# - databases namespace exists (T02)
#
# Adjust `instances` before production: 1 for dev/staging, 3 for HA.
# Adjust `storage.size` to match available PVC capacity.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: net-kingdom-pg
namespace: databases
labels:
app.kubernetes.io/part-of: net-kingdom-sso-mfa
net-kingdom/component: databases
spec:
# ── Instance count ───────────────────────────────────────────────────────────
# 1 = dev/single-node. Increase to 3 for ThreePhoenix HA production deployment.
instances: 1
imageName: ghcr.io/cloudnative-pg/postgresql:16
# ── Bootstrap ────────────────────────────────────────────────────────────────
# Creates keycloak_db with owner keycloak. privacyidea_db and the
# privacyidea role are created in postInitSQL (runs as superuser).
# managed.roles below reconciles passwords for both users continuously.
bootstrap:
initdb:
database: keycloak_db
owner: keycloak
secret:
name: net-kingdom-pg-keycloak-app
postInitSQL:
- "CREATE ROLE privacyidea WITH LOGIN;"
- "CREATE DATABASE privacyidea_db OWNER privacyidea;"
- "REVOKE CONNECT ON DATABASE privacyidea_db FROM PUBLIC;"
- "REVOKE CONNECT ON DATABASE keycloak_db FROM PUBLIC;"
- "GRANT CONNECT ON DATABASE keycloak_db TO keycloak;"
- "GRANT CONNECT ON DATABASE privacyidea_db TO privacyidea;"
# ── Managed roles ────────────────────────────────────────────────────────────
# Operator reconciles these passwords continuously from K8s Secrets.
# This ensures password rotation in KeePassXC/Vault propagates to PG.
managed:
roles:
- name: keycloak
ensure: present
login: true
passwordSecret:
name: net-kingdom-pg-keycloak-app
- name: privacyidea
ensure: present
login: true
passwordSecret:
name: net-kingdom-pg-privacyidea-app
# ── Storage ──────────────────────────────────────────────────────────────────
storage:
size: 10Gi
# storageClass: local-path # uncomment to pin StorageClass explicitly
# ── WAL archiving (backup prerequisite) ─────────────────────────────────────
# Uncomment the backup section when object storage is available (MinIO/S3).
# WAL archiving must be enabled here before ScheduledBackup will function.
#
# backup:
# barmanObjectStore:
# destinationPath: "s3://net-kingdom-backups/postgres/"
# endpointURL: "http://minio.minio-system.svc.cluster.local:9000"
# s3Credentials:
# accessKeyId:
# name: net-kingdom-pg-backup-s3
# key: ACCESS_KEY_ID
# secretAccessKey:
# name: net-kingdom-pg-backup-s3
# key: SECRET_ACCESS_KEY
# wal:
# compression: gzip
# data:
# compression: gzip
# immediateCheckpoint: true
# retentionPolicy: "7d"
# ── Resource limits ──────────────────────────────────────────────────────────
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "1Gi"
cpu: "1000m"
# ── Monitoring ───────────────────────────────────────────────────────────────
# Set enablePodMonitor: true when Prometheus / kube-prometheus-stack is deployed.
monitoring:
enablePodMonitor: false

View File

@@ -0,0 +1,69 @@
#!/usr/bin/env bash
# create-secrets.sh — create K8s Secrets for PostgreSQL from gen-secrets.sh output
#
# Usage:
# ./create-secrets.sh <secrets-dir>
#
# <secrets-dir> is the output directory produced by sso-mfa/bootstrap/gen-secrets.sh
# (default: ../../bootstrap/secrets).
#
# Creates two K8s Secrets in the databases namespace:
# net-kingdom-pg-keycloak-app — keycloak DB credentials
# net-kingdom-pg-privacyidea-app — privacyIDEA DB credentials
#
# These secrets must exist before applying cluster.yaml.
# Re-run this script whenever you rotate passwords in KeePassXC / gen-secrets.sh.
set -euo pipefail
SECRETS_DIR="${1:-../../bootstrap/secrets}"
if [[ ! -d "$SECRETS_DIR" ]]; then
echo "ERROR: secrets directory not found: $SECRETS_DIR" >&2
echo "Run sso-mfa/bootstrap/gen-secrets.sh first, then re-run this script." >&2
exit 1
fi
PG_SECRETS="$SECRETS_DIR/postgres/secrets.env"
PI_SECRETS="$SECRETS_DIR/privacyidea/secrets.env"
if [[ ! -f "$PG_SECRETS" ]]; then
echo "ERROR: $PG_SECRETS not found" >&2
exit 1
fi
if [[ ! -f "$PI_SECRETS" ]]; then
echo "ERROR: $PI_SECRETS not found" >&2
exit 1
fi
# Source the generated env files (they contain KEY=VALUE pairs, no export)
# Use a subshell to avoid polluting the current environment.
PG_KC_PASS=$(bash -c "source $PG_SECRETS 2>/dev/null; echo \$PG_KEYCLOAK_PASSWORD")
PI_DB_PASS=$(bash -c "source $PI_SECRETS 2>/dev/null; echo \$PI_DB_PASSWORD")
if [[ -z "$PG_KC_PASS" || -z "$PI_DB_PASS" ]]; then
echo "ERROR: could not read passwords from secrets files." >&2
echo "Check that gen-secrets.sh ran successfully and the files are intact." >&2
exit 1
fi
echo "Creating K8s Secret: net-kingdom-pg-keycloak-app"
kubectl create secret generic net-kingdom-pg-keycloak-app \
--namespace=databases \
--from-literal=username=keycloak \
--from-literal=password="$PG_KC_PASS" \
--dry-run=client -o yaml | kubectl apply -f -
echo "Creating K8s Secret: net-kingdom-pg-privacyidea-app"
kubectl create secret generic net-kingdom-pg-privacyidea-app \
--namespace=databases \
--from-literal=username=privacyidea \
--from-literal=password="$PI_DB_PASS" \
--dry-run=client -o yaml | kubectl apply -f -
echo ""
echo "Done. Secrets created in namespace: databases"
echo ""
echo "Verify:"
echo " kubectl get secrets -n databases"
echo " kubectl describe secret net-kingdom-pg-keycloak-app -n databases"

View File

@@ -0,0 +1,26 @@
# CloudNativePG ScheduledBackup — net-kingdom-pg
#
# PREREQUISITE: WAL archiving must be enabled in cluster.yaml (backup.barmanObjectStore
# section) before this ScheduledBackup will succeed. Uncomment cluster.yaml backup
# block first, apply it, confirm WAL archiving is healthy, then apply this file.
#
# Schedule: daily at 02:00 UTC, keeping 7 daily backups.
# Adjust schedule and retentionPolicy to match your RPO/RTO requirements.
#
# See T03 restore drill procedure in README.md before marking T03 done.
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
name: net-kingdom-pg-daily
namespace: databases
labels:
app.kubernetes.io/part-of: net-kingdom-sso-mfa
net-kingdom/component: databases
spec:
# Daily at 02:00 UTC
schedule: "0 0 2 * * *" # CloudNativePG uses Go cron format: seconds minutes hours dom month dow
backupOwnerReference: self
cluster:
name: net-kingdom-pg
# Immediate: if the schedule is missed (e.g. pod restart), take a backup immediately
immediate: true

180
sso-mfa/k8s/verify-t03.sh Executable file
View File

@@ -0,0 +1,180 @@
#!/usr/bin/env bash
# verify-t03.sh — verify NK-WP-0001-T03 done-criteria
#
# Checks:
# 1. CloudNativePG operator is installed and running
# 2. Cluster net-kingdom-pg is Ready
# 3. Both application databases exist (keycloak_db, privacyidea_db)
# 4. Both application roles exist (keycloak, privacyidea)
# 5. K8s Secrets are present in the databases namespace
# 6. (Optional) Scheduled backup CR is present when backup is configured
#
# Usage:
# chmod +x verify-t03.sh
# ./verify-t03.sh
set -euo pipefail
PASS=0
FAIL=0
WARN=0
pass() { echo " [PASS] $1"; ((PASS++)); }
fail() { echo " [FAIL] $1"; ((FAIL++)); }
warn() { echo " [WARN] $1"; ((WARN++)); }
section() { echo ""; echo "── $1 ──────────────────────────────────────"; }
# ── 1. CloudNativePG operator ─────────────────────────────────────────────────
section "1. CloudNativePG operator"
if kubectl get ns cnpg-system &>/dev/null; then
pass "cnpg-system namespace exists"
else
fail "cnpg-system namespace not found — install operator first (see postgresql/README.md)"
fi
if kubectl get crd clusters.postgresql.cnpg.io &>/dev/null; then
pass "clusters.postgresql.cnpg.io CRD registered"
else
fail "CloudNativePG CRD not found — operator not installed"
fi
CNPG_READY=$(kubectl get pods -n cnpg-system -l app.kubernetes.io/name=cloudnative-pg \
--field-selector=status.phase=Running --no-headers 2>/dev/null | wc -l || echo 0)
if [[ "$CNPG_READY" -ge 1 ]]; then
pass "CloudNativePG operator pod running ($CNPG_READY pod(s))"
else
fail "No running CloudNativePG operator pods in cnpg-system"
fi
# ── 2. Cluster readiness ──────────────────────────────────────────────────────
section "2. Cluster net-kingdom-pg"
CLUSTER_READY=$(kubectl get cluster net-kingdom-pg -n databases \
-o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null || echo "")
if [[ "$CLUSTER_READY" == "True" ]]; then
pass "Cluster net-kingdom-pg status: Ready"
else
CLUSTER_PHASE=$(kubectl get cluster net-kingdom-pg -n databases \
-o jsonpath='{.status.phase}' 2>/dev/null || echo "not found")
fail "Cluster net-kingdom-pg not Ready (phase: $CLUSTER_PHASE)"
fi
PRIMARY_POD=$(kubectl get pod -n databases \
-l "cnpg.io/cluster=net-kingdom-pg,role=primary" \
--field-selector=status.phase=Running \
-o name 2>/dev/null | head -1 || echo "")
if [[ -n "$PRIMARY_POD" ]]; then
pass "Primary pod running: $PRIMARY_POD"
else
fail "No running primary pod found for net-kingdom-pg"
fi
# ── 3. Databases ──────────────────────────────────────────────────────────────
section "3. Databases"
if [[ -n "$PRIMARY_POD" ]]; then
DB_LIST=$(kubectl exec -n databases "$PRIMARY_POD" -- \
psql -U postgres -tAc "SELECT datname FROM pg_database WHERE datname IN ('keycloak_db','privacyidea_db') ORDER BY datname;" \
2>/dev/null || echo "")
if echo "$DB_LIST" | grep -q "keycloak_db"; then
pass "keycloak_db exists"
else
fail "keycloak_db not found"
fi
if echo "$DB_LIST" | grep -q "privacyidea_db"; then
pass "privacyidea_db exists"
else
fail "privacyidea_db not found"
fi
else
warn "Skipping database checks — no primary pod available"
fi
# ── 4. Roles ──────────────────────────────────────────────────────────────────
section "4. Database roles"
if [[ -n "$PRIMARY_POD" ]]; then
ROLE_LIST=$(kubectl exec -n databases "$PRIMARY_POD" -- \
psql -U postgres -tAc "SELECT rolname FROM pg_roles WHERE rolname IN ('keycloak','privacyidea') ORDER BY rolname;" \
2>/dev/null || echo "")
if echo "$ROLE_LIST" | grep -q "keycloak"; then
pass "role keycloak exists"
else
fail "role keycloak not found"
fi
if echo "$ROLE_LIST" | grep -q "privacyidea"; then
pass "role privacyidea exists"
else
fail "role privacyidea not found"
fi
else
warn "Skipping role checks — no primary pod available"
fi
# ── 5. K8s Secrets ────────────────────────────────────────────────────────────
section "5. K8s Secrets (databases namespace)"
for secret in net-kingdom-pg-keycloak-app net-kingdom-pg-privacyidea-app; do
if kubectl get secret "$secret" -n databases &>/dev/null; then
pass "Secret $secret exists"
else
fail "Secret $secret not found — run create-secrets.sh"
fi
done
# CNPG auto-creates these
for secret in net-kingdom-pg-superuser; do
if kubectl get secret "$secret" -n databases &>/dev/null; then
pass "Secret $secret exists (CNPG-managed)"
else
warn "Secret $secret not found (CNPG creates this; may appear after cluster init)"
fi
done
# ── 6. Backup configuration ───────────────────────────────────────────────────
section "6. Backup (optional until object storage is provisioned)"
if kubectl get scheduledbackup net-kingdom-pg-daily -n databases &>/dev/null; then
BACKUP_SUSPENDED=$(kubectl get scheduledbackup net-kingdom-pg-daily -n databases \
-o jsonpath='{.spec.suspend}' 2>/dev/null || echo "false")
if [[ "$BACKUP_SUSPENDED" == "true" ]]; then
warn "ScheduledBackup net-kingdom-pg-daily is suspended"
else
pass "ScheduledBackup net-kingdom-pg-daily present and active"
fi
LAST_BACKUP=$(kubectl get backup -n databases \
-l "cnpg.io/cluster=net-kingdom-pg" \
--sort-by='.metadata.creationTimestamp' \
-o name 2>/dev/null | tail -1 || echo "")
if [[ -n "$LAST_BACKUP" ]]; then
pass "At least one backup found: $LAST_BACKUP"
else
warn "No backups yet — trigger a manual backup or wait for schedule"
fi
else
warn "ScheduledBackup not deployed — configure object storage, then apply scheduled-backup.yaml"
fi
# ── Summary ───────────────────────────────────────────────────────────────────
echo ""
echo "════════════════════════════════════════════════"
echo " T03 verification: PASS=$PASS WARN=$WARN FAIL=$FAIL"
echo "════════════════════════════════════════════════"
if [[ "$FAIL" -gt 0 ]]; then
echo " Result: INCOMPLETE — resolve FAIL items before proceeding to T04"
exit 1
elif [[ "$WARN" -gt 0 ]]; then
echo " Result: PARTIAL — T03 core done; WARN items should be addressed before production"
exit 0
else
echo " Result: COMPLETE — T03 done-criteria met"
exit 0
fi

View File

@@ -8,7 +8,7 @@ owner: worsch
topic_slug: netkingdom
state_hub_workstream_id: 39263c4b-ef70-4053-b782-350834b7e1be
created: "2026-02-28"
updated: "2026-03-01-b"
updated: "2026-03-05"
---
# SSO & MFA Platform — Keycloak + privacyIDEA on Kubernetes
@@ -93,8 +93,10 @@ MFA via the privacyIDEA Keycloak Provider JAR (baked into custom image).
```task
id: NK-WP-0001-T01
state_hub_task_id: 7992528c-d533-44e5-bcce-f92aaa2b75b2
status: todo
status: done
priority: critical
commit_0a: c576188
note: Phase 0a complete (gen-secrets.sh, pack-bundle.sh, README). Phase 0b (Vault in-cluster) follows T02 cluster deployment.
```
**Decision D1 applies:** Two-phase vault strategy.
@@ -136,8 +138,10 @@ stored offsite.
```task
id: NK-WP-0001-T02
state_hub_task_id: 721ca6b2-0cf4-4008-a966-87b1563550fa
status: todo
status: done
priority: high
commit: ee794a6
note: Manifests committed. Apply with sso-mfa/k8s/README.md apply order; verify-t02.sh checks done-criteria.
```
**Prerequisite:** T01 Phase 0a (KeePassXC bootstrap) must be complete — all
@@ -164,8 +168,10 @@ denied paths), cert-manager issues a test certificate.
```task
id: NK-WP-0001-T03
state_hub_task_id: 7fa60004-deb2-4db5-a470-f95dda07f6ab
status: todo
status: done
priority: high
commit: TBD
note: Manifests committed. Restore drill required before marking fully done in production.
```
Deploy PostgreSQL via CloudNativePG operator (preferred: aligns with

View File

@@ -3,12 +3,12 @@ id: NK-WP-0002
type: workplan
title: "Local Identity — Bootstrap User Store & Minimal OIDC"
domain: netkingdom
status: active
status: completed
owner: worsch
topic_slug: netkingdom
state_hub_workstream_id: 7c9021b1-319c-4b4a-a8be-0642239a1893
created: "2026-03-01"
updated: "2026-03-01"
updated: "2026-03-05"
---
# Local Identity — Bootstrap User Store & Minimal OIDC
@@ -231,7 +231,7 @@ expiry and revocation functional.
- [x] Filesystem permissions enforced on startup; `security-check` passes
- [x] Audit log recording all auth events
- [x] `docs/LocalIdentity.md` complete with import procedure and security model
- [ ] NK-WP-0001 T07 migration procedure documented (Local Identity → Keycloak)
- [x] NK-WP-0001 T07 migration procedure documented (Local Identity → Keycloak)
## Open Questions