---
id: NK-WP-0003
type: workplan
title: "KeyCape + privacyIDEA Stack — Cluster Deployment"
domain: netkingdom
repo: net-kingdom
status: active
owner: custodian
topic_slug: netkingdom
created: "2026-03-20"
updated: "2026-03-20"
state_hub_workstream_id: "f24cefd4-a09b-4fa1-9b25-94bf783b425e"
---

# KeyCape + privacyIDEA Stack — Cluster Deployment

## Goal

Deploy the full NetKingdom identity stack on the live k3s cluster without Keycloak. KeyCape (v0.1, complete) is the OIDC orchestration layer; it binds LLDAP (directory), Authelia (auth sessions), and privacyIDEA (MFA).

NK-WP-0001 was scoped around Keycloak and is deferred. This workplan covers everything needed to reach a production-ready identity plane.

## Pre-conditions

- [x] k3s cluster healthy — RAIL-BS-WP-0002 ✓
- [x] kubeconfig available at `~/.kube/config-hosteurope` — RAIL-BS-WP-0005 ✓
- [x] All manifests committed — net-kingdom `sso-mfa/k8s/` ✓
- [x] KeyCape v0.1 complete — KEY-WP-0001 ✓
- [x] SOPS + age integrated into net-kingdom — NK-WP-0004 ✓
- [x] Agent-driven credential bootstrap ready — NK-WP-0005 ✓ (run `make creds-agent-init`)

## Architecture

```
Internet → Traefik (k3s) → cert-manager TLS
  ├── auth.coulomb.social → Authelia
  ├── pink.coulomb.social → privacyIDEA portal
  └── id.coulomb.social   → KeyCape (OIDC)

KeyCape ──► Authelia    (session, password)
        ──► LLDAP       (directory, user lookup)
        ──► privacyIDEA (MFA challenges via trigger-admin token)

privacyIDEA ──► PostgreSQL (privacyidea_db via CloudNativePG)
LLDAP       ──► PostgreSQL (lldap_db via CloudNativePG)
Authelia    ──► PostgreSQL (authelia_db via CloudNativePG)
```

## Tasks

### T01 — Credential setup

```task
id: NK-WP-0003-T01
status: done
priority: high
state_hub_task_id: "6a22e17e-5854-4f8b-b419-9dc86d490357"
note: Superseded by NK-WP-0004 (credential foundation) and NK-WP-0005 (agent bootstrap). Run `make creds-agent-init` to execute fully automated bootstrap.
  The manual KeePassXC approach described here is retired — see canon/standards/credential-management_v0.2.md for the current model.
```

~~Net-kingdom currently uses a manual KeePassXC + age-bundle approach~~

Completed via NK-WP-0004 + NK-WP-0005. The credential foundation is in place:

- SOPS + age integrated — `~/.config/sops/age/keys.txt`, `.sops.yaml`, git hook
- Agent bootstrap: `make creds-agent-init` runs the full flow autonomously
- Credential standard: `canon/standards/credential-management_v0.2.md`

To bootstrap credentials before T02–T09, run:

```bash
make creds-agent-init
```

This generates all secrets, encrypts to `secrets.enc/`, injects into the cluster, and delivers the emergency bundle. No KeePassXC steps required.

### T02 — Apply cluster foundations

```task
id: NK-WP-0003-T02
status: done
priority: high
state_hub_task_id: "a14e3a6b-18ee-4172-8a47-bd531f21e55a"
note: Verified 2026-03-21 — all namespaces, NetworkPolicies, cert-manager, and ClusterIssuers already applied (35h+ ago). verify-t02.sh 22/22 passed. Fixed stale keycloak→keycape check in verify script.
```

Apply the K8s infrastructure foundations. All manifests already committed.

```bash
export KUBECONFIG=~/.kube/config-hosteurope
kubectl apply -f sso-mfa/k8s/namespaces/
kubectl apply -f sso-mfa/k8s/network-policies/
kubectl apply -f sso-mfa/k8s/cert-manager/
```

Verify: `bash sso-mfa/k8s/verify-t02.sh`

Expected: namespaces `sso`, `mfa`, `databases` exist; NetworkPolicies applied; cert-manager pods Running.

### T03 — Deploy PostgreSQL (CloudNativePG)

```task
id: NK-WP-0003-T03
status: done
priority: high
state_hub_task_id: "19e375d0-66bd-4cf0-9c2d-59d5c0d5989e"
note: Verified 2026-03-21 — CNPG cluster net-kingdom-pg healthy (1/1 Ready), privacyidea_db exists. LLDAP and Authelia use SQLite (PVC), no additional PG databases needed. verify-t03.sh: 8 PASS, 2 WARN (superuser secret + backup — both expected at this stage).
```

Deploy the shared database cluster. The plan called for three databases:

- `privacyidea_db` — privacyIDEA
- `lldap_db` — LLDAP
- `authelia_db` — Authelia

As verified on 2026-03-21 (see the task note), only `privacyidea_db` was needed in PostgreSQL; LLDAP and Authelia run on SQLite backed by a PVC.

```bash
kubectl apply -f sso-mfa/k8s/postgres/
```

Wait for the cluster to report `Ready`, then verify: `bash sso-mfa/k8s/verify-t03.sh`

**Note**: Do not proceed to T04 until the CloudNativePG cluster is fully healthy. Migration jobs will fail on a partially-started cluster.

### T04 — Deploy privacyIDEA

```task
id: NK-WP-0003-T04
status: done
priority: high
state_hub_task_id: "9c9c1ec9-0cf5-4546-a83e-d74dbf3b27af"
note: Completed 2026-03-21 via make creds-agent-init (NK-WP-0005). Pod Running (ghcr.io/gpappsoft/privacyidea-docker:3.12.2, port 8080). enckey + audit keys extracted to K8s Secrets privacyidea-enckey/auditkeys. pi-admin and trigger-admin created. keycape-pi-token Secret in sso namespace. Remaining: TLS cert for pink.coulomb.social (ACME solver pods visible — T02 cert-manager needed). trigger-admin policy must be set manually via WebUI once pink.coulomb.social resolves.
```

Completed via `make creds-agent-init`. Steps 1–4 were all automated by the agent bootstrap.

**Image fixes applied (2026-03-21):**

- `privacyidea/otpserver:3.12.2` → `ghcr.io/gpappsoft/privacyidea-docker:3.12.2` (port 8080)
- `PRIVACYIDEA_CONFIGFILE`, `PI_ADDRESS`, `PI_PORT` env vars added
- Readiness probe changed to `tcpSocket` (`/token/` returns 401 for unauthenticated GET)

**Remaining manual steps:** Once `pink.coulomb.social` resolves to the cluster IP and the TLS cert is issued:

1. Log in to https://pink.coulomb.social as `pi-admin`
2. Enroll MFA for `pi-admin` (TOTP)
3. Verify/create trigger-admin policy: Policies → trigger-admin-rights (Scope: admin, Action: triggerchallenge, AdminUser: trigger-admin)

### T05 — Deploy LLDAP

```task
id: NK-WP-0003-T05
status: done
priority: high
state_hub_task_id: "82fc90f7-8eb4-4718-b02a-dfd5fa39e5bc"
note: Deployed 2026-03-21. securityContext fix: removed runAsNonRoot/runAsUser (lldap image initialises as root).
  Pod 1/1 Running. Groups net-kingdom-users + net-kingdom-admins created via API (plaintext secrets dir cleaned up by agent; used K8s secret directly). ACME solver running for lldap.coulomb.social.
```

Deploy LLDAP into the `sso` namespace.

```bash
cd sso-mfa/k8s/lldap
bash create-secrets.sh
kubectl apply -f deployment.yaml
kubectl apply -f ingress.yaml
kubectl apply -f middleware.yaml
bash bootstrap-users.sh   # creates base OU structure + initial admin user
```

Verify that the pod is Running and that an LDAP bind works on `lldap.coulomb.social`.

### T06 — Deploy Authelia

```task
id: NK-WP-0003-T06
status: done
priority: high
state_hub_task_id: "3a28ff10-fbfa-443b-a64d-bbfe6153c544"
note: Deployed 2026-03-21. Two config fixes: (1) users_filter changed uid→{username_attribute}={input}; (2) OIDC client secret moved from unsupported env var to inline bcrypt hash in configmap (4.38 does not support CLIENTS_0_SECRET_FILE indexed env vars). Pod 1/1 Running, "Startup complete". Remaining deprecation warnings are auto-mapped and non-fatal.
```

Deploy Authelia into the `sso` namespace.

```bash
cd sso-mfa/k8s/authelia
bash create-secrets.sh
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f ingress.yaml
```

Verify: `bash sso-mfa/k8s/verify-t05.sh` (covers LLDAP + Authelia together)

### T07 — Deploy KeyCape

```task
id: NK-WP-0003-T07
status: done
priority: high
state_hub_task_id: "496a97c9-3e2a-486e-ba62-18449868c6cf"
note: Completed 2026-03-22. KEY-WP-0002 delivered image to Gitea OCI registry (92.205.130.254:32166/coulomb/key-cape:latest).
  Three issues fixed:
  1. deployment.yaml image ref updated to Gitea registry (correct namespace: coulomb)
  2. k3s hosts.toml fixed: server endpoint must be http:// for plain-HTTP Gitea NodePort (k3s generated https:// by default → "http: server gave HTTP response to HTTPS client")
  3. keycape-config clients: [] → added demo-app client (required for startup + T08 tests)
  Pod 1/1 Running; /healthz OK; OIDC discovery live.
  Note: hosts.toml at /var/lib/rancher/k3s/agent/etc/containerd/certs.d/92.205.130.254:32166/ is generated from /etc/rancher/k3s/registries.yaml — will revert on k3s restart. Permanent fix: registries.yaml mirror config generates HTTPS server by default; need to manually maintain hosts.toml or find k3s config that forces HTTP server.
```

Deploy KeyCape into the `sso` namespace.

```bash
cd sso-mfa/k8s/keycape
bash create-secrets.sh    # includes privacyIDEA trigger-admin token
bash create-pi-token.sh   # registers KeyCape as a privacyIDEA application
kubectl apply -f deployment.yaml
kubectl apply -f ingress.yaml
kubectl apply -f middleware.yaml
```

Verify: OIDC discovery endpoint reachable at `https://id.coulomb.social/.well-known/openid-configuration`

### T08 — End-to-end authentication test

```task
id: NK-WP-0003-T08
status: done
priority: high
state_hub_task_id: "0fba3392-c916-43fd-a2c1-24ce39481043"
note: Completed 2026-03-25. All 3 test packages pass (migration, negative, profile). Go 1.22.10 found at ~/go/bin/go. DNS resolves to 92.205.62.239 (all 4 subdomains).
  Tests run with: cd src && ~/go/bin/go test ./tests/... -v
  Results: ok keycape/tests/migration, ok keycape/tests/negative, ok keycape/tests/profile
  Note: tests use httptest.Server + mocks — no live cluster connection required.
```

Prove the full auth flow works:

1. OIDC discovery resolves at `id.coulomb.social`
2. Authelia password auth succeeds for a test user
3. privacyIDEA TOTP challenge issued and accepted
4. KeyCape issues a valid access token
5. Token introspection returns expected claims (sub, groups, email)

Use the KeyCape acceptance test suite. It runs from the `src` directory of the key-cape checkout (see the task note) and uses `httptest.Server` with mocks, so it validates the flow logic without contacting the live endpoints:

```bash
# Assumes key-cape is checked out next to net-kingdom
cd "$(git rev-parse --show-toplevel)/../key-cape/src"
go test ./tests/... -run TestProfileBaseline -v
```

### T08a — Create Cloudflare DNS A records

```task
id: NK-WP-0003-T08a
status: done
priority: high
state_hub_task_id: "c614f839-61c4-41f6-bfeb-b3f9525a7625"
note: DNS resolves 2026-03-25 — all 4 subdomains resolve to 92.205.62.239 via 8.8.8.8.
  (IP differs from workplan spec of 92.205.130.254 — cluster IP may have changed.)
```

Create four A records in Cloudflare DNS, **proxy disabled (DNS-only / orange cloud OFF)**, all pointing to `92.205.130.254`:

| Subdomain | Type | Value |
|-----------|------|-------|
| `kc.coulomb.social` | A | `92.205.130.254` |
| `auth.coulomb.social` | A | `92.205.130.254` |
| `pink.coulomb.social` | A | `92.205.130.254` |
| `lldap.coulomb.social` | A | `92.205.130.254` |

HTTP-01 ACME challenges require direct origin reachability — Cloudflare proxy blocks this. Once DNS propagates, cert-manager's three pending challenges will auto-resolve and TLS certs will be issued for all four ingresses.

Verify: `dig +short kc.coulomb.social @8.8.8.8` → `92.205.130.254`

As of 2026-03-25 the records resolve to `92.205.62.239` instead (see the task note), so treat the address above as the originally planned value.

### T08b — Install Go on CoulombCore

```task
id: NK-WP-0003-T08b
status: done
priority: high
state_hub_task_id: "fdfe595a-f5a8-466a-82e9-7cc2ad8e5c3e"
note: Go 1.22.10 already installed at ~/go/bin/go. Tests run successfully against go 1.23 module.
```

Go is not installed on CoulombCore. It is required for the KeyCape acceptance test suite (T08).

```bash
wget https://go.dev/dl/go1.22.5.linux-amd64.tar.gz
sudo tar -C /usr/local -xzf go1.22.5.linux-amd64.tar.gz
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc
source ~/.bashrc
go version   # should print: go version go1.22.5 linux/amd64
```

Verify: `cd ~/key-cape/src && go test ./tests/... -run TestProfileBaseline -v`

### T09 — Backup, DR, and monitoring

```task
id: NK-WP-0003-T09
status: todo
priority: medium
state_hub_task_id: "a82751d8-4de8-4668-8568-8dc140a6322b"
```

Operational hardening:

1. Deploy backup CronJob for CloudNativePG → MinIO/S3
   ```bash
   kubectl apply -f sso-mfa/k8s/backup/
   ```
2. Execute DB restore drill (mandatory before production traffic): restore `privacyidea_db` from a backup into a test namespace, verify privacyIDEA starts cleanly with the restored data
3. Deploy break-glass admin access (disabled by default):
   ```bash
   bash sso-mfa/k8s/lldap/break-glass.sh setup
   ```
4.
   Verify Prometheus scraping for privacyIDEA and Authelia metrics
5. Confirm NetworkPolicies block all unexpected egress

Verify: `bash sso-mfa/k8s/verify-t08.sh` (if it exists) or the manual checklist from NK-WP-0001 T08 scope.

## Done criteria

- [x] Credentials: `bootstrap_complete: true` in `creds-state.yaml` (NK-WP-0005)
- [ ] All verify-t*.sh scripts exit 0
- [x] KeyCape acceptance test suite passes
- [ ] DB restore drill completed
- [ ] Emergency bundle delivered and stored in personal password manager
- [ ] Ops bundle stored offsite
- [ ] privacyIDEA enckey backed up as K8s Secret (`privacyidea-enckey`)
- [ ] Monitoring active (Prometheus scraping all three services)
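
The "All verify-t*.sh scripts exit 0" criterion can be checked in one pass. The following is a minimal sketch, assuming the verify scripts all sit in `sso-mfa/k8s/` and follow the `verify-t*.sh` naming used in the tasks above; the function name `run_verify_scripts` is hypothetical and not part of the repository.

```bash
#!/usr/bin/env bash
# Sketch: run every verify-t*.sh in a directory and report which ones fail.
# Assumption: the scripts live in one directory (sso-mfa/k8s/ by default)
# and signal success with exit code 0.

run_verify_scripts() {
  local dir="${1:-sso-mfa/k8s}"
  local failed=0 found=0 script

  for script in "$dir"/verify-t*.sh; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$script" ] || continue
    found=$((found + 1))
    if bash "$script"; then
      echo "PASS $(basename "$script")"
    else
      echo "FAIL $(basename "$script")"
      failed=$((failed + 1))
    fi
  done

  if [ "$found" -eq 0 ]; then
    echo "no verify-t*.sh scripts found in $dir" >&2
    return 2
  fi

  echo "$((found - failed))/$found verify scripts passed"
  # Return 0 only when every script passed.
  [ "$failed" -eq 0 ]
}
```

After sourcing the function, `run_verify_scripts sso-mfa/k8s` from the repository root returns 0 only when every script passes, so it can serve as a gate before ticking the criterion. It returns 2 when no scripts are found, which keeps an empty or mistyped directory from counting as a pass.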