generated from coulomb/repo-seed
Add CLAUDE.md, wiki protoplans, and NK-WP-0001 workplan
Initialises the net-kingdom project structure:

- README.md: updated title and description
- CLAUDE.md: project instructions and State Hub integration config
- wiki/: three reference docs (NetKingdom overview, ChatGPT and Grok
  protoplans for the SSO/MFA platform)
- workplans/NK-WP-0001-sso-mfa-platform.md: combined workplan (8 phases,
  8 tasks) synthesised from the two protoplans; registered in the
  Custodian State Hub (workstream 39263c4b)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
361
workplans/NK-WP-0001-sso-mfa-platform.md
Normal file
@@ -0,0 +1,361 @@
---
id: NK-WP-0001
type: workplan
title: "SSO & MFA Platform — Keycloak + privacyIDEA on Kubernetes"
domain: netkingdom
status: active
owner: worsch
topic_slug: netkingdom
state_hub_workstream_id: 39263c4b-ef70-4053-b782-350834b7e1be
created: "2026-02-28"
updated: "2026-02-28"
---

# SSO & MFA Platform — Keycloak + privacyIDEA on Kubernetes

## Summary

Deploy a hardened SSO and MFA platform on Kubernetes: Keycloak as the
OIDC/SAML identity provider, privacyIDEA as the MFA/token engine,
integrated via the privacyIDEA Keycloak Provider. This is the foundational
security layer for the net-kingdom DevSecOps platform.

## Context

Synthesised from two AI protoplans (wiki/WorkplanOneChatgpt.md and
wiki/WorkplanOneGrok.md). Both sources converge on the same architecture;
this plan picks the most concrete and production-aligned choices from each:

- **Single-credential bootstrap** (Grok) — one master secret unlocks the
  vault; all other credentials are vault-managed and never typed manually.
- **Phase structure** (ChatGPT) — eight sequential phases reducing blast
  radius at each step.
- **Tooling choices** (both) — Keycloak Operator or codecentric Helm,
  gpappsoft privacyIDEA Helm, CloudNativePG for PostgreSQL, cert-manager
  for TLS, Traefik as ingress (K3s native, aligned with Railiance).
- **Custom Keycloak image** (both) — JAR baked into image via `kc.sh build`
  rather than `kubectl cp`; clean GitOps pattern.

## Architecture

```
           Internet
               │  TLS (cert-manager / Let's Encrypt)
        ┌──────┴──────┐
        │   Traefik   │  (K3s native ingress)
        └──┬───────┬──┘
           │       │
      keycloak.…  pi.…  pi-account.…
           │       │        │
    ┌──────┘  ┌────┘        │
    ▼         ▼             │
[Keycloak]  [privacyIDEA]◄──┘  (self-service portal)
    │         │
    └────┬────┘
         ▼
   [PostgreSQL]  (CloudNativePG, namespace: databases)
         │
[Vault / K8s Secrets]  ← single credential unlocks
```

**Namespaces:** `sso` (Keycloak), `mfa` (privacyIDEA), `databases`

**Integration:** Keycloak runs the browser login flow; privacyIDEA provides
MFA via the privacyIDEA Keycloak Provider JAR (baked into custom image).

## Dependencies

- Depends on: `railiance/three-phoenix-ha-cluster` — full production
  deployment targets the ThreePhoenix K3s HA cluster. Development/staging
  can proceed on a single-node k3s instance.
- Depends on: `railiance/phase-0-operational-baseline` — cert-manager, TLS,
  backup strategy must be operational before going live.

## Tasks

### T01 — Phase 0: Vault & secret bootstrap (single-credential principle)

```task
id: NK-WP-0001-T01
state_hub_task_id: 7992528c-d533-44e5-bcce-f92aaa2b75b2
status: todo
priority: critical
```

Create the vault (KeePassXC .kdbx or self-hosted Bitwarden; HashiCorp Vault
for later production hardening). Generate and store all secrets inside the
vault — never typed again:

- privacyIDEA: `SECRET_KEY` (64+ chars), `PI_PEPPER` (32+ chars),
  `PI_ENCFILE` content (`pi-manage create_enckey`).
- PostgreSQL: root + `keycloak` + `privacyidea` user passwords.
- Keycloak: admin bootstrap secret + DB password.
- TLS: ACME account key (if not delegated fully to cert-manager).
- Break-glass: admin credentials + offline recovery OTP seed.

Export an age-encrypted ops bundle (encrypted tar of all secret YAML
manifests). Enable K8s encryption-at-rest. Confirm secret injection
strategy: External Secrets Operator + Vault backend, or sops/age for GitOps.
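
If the sops/age route is chosen, a repository-level `.sops.yaml` can pin which files and which fields get encrypted. The following is a minimal sketch rather than a settled decision: the `secrets/` directory is a hypothetical layout, and the age recipient is a placeholder for the public key whose private half lives in the vault.

```yaml
# .sops.yaml (sketch; path and recipient are assumptions, not from the workplan)
creation_rules:
  # Apply to every manifest under the hypothetical secrets/ directory.
  - path_regex: ^secrets/.*\.yaml$
    # Encrypt only the Secret payload so metadata stays diffable in git.
    encrypted_regex: ^(data|stringData)$
    # Placeholder for the age public key; the private key stays in the vault.
    age: <age-public-key>
```

With a rule like this in place, `sops --encrypt --in-place <file>` would encrypt only the `data` and `stringData` fields of a matching manifest.
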

**Done when:** vault created, all secrets generated, encrypted ops bundle
exported and stored offsite. Secret injection strategy decided.

---

### T02 — Phase 1: K8s foundations (namespaces, NetworkPolicies, cert-manager)

```task
id: NK-WP-0001-T02
state_hub_task_id: 721ca6b2-0cf4-4008-a966-87b1563550fa
status: todo
priority: high
```

Create namespaces: `sso`, `mfa`, `databases`. Verify cert-manager is
installed and functional on the K3s cluster (Traefik ingress). Define and
apply NetworkPolicies to prevent lateral movement:

- Only ingress controller reaches Keycloak/privacyIDEA service ports.
- Only Keycloak pods call the privacyIDEA API.
- Only app pods/ingress reach Keycloak.
- DB pods reachable only from `sso` and `mfa` namespaces.
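
Two of the rules above could be expressed roughly as follows. This is a sketch under assumptions: the pod labels and the privacyIDEA port are hypothetical and depend on the charts actually deployed, while the `kubernetes.io/metadata.name` namespace label is set automatically by current Kubernetes versions.

```yaml
# Sketch only: pod labels and the privacyIDEA port are assumptions.
# 1) Default-deny all ingress in the mfa namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: mfa
spec:
  podSelector: {}            # selects every pod in the namespace
  policyTypes:
    - Ingress
---
# 2) Allow only Keycloak pods from the sso namespace to reach privacyIDEA.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-keycloak-to-privacyidea
  namespace: mfa
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: privacyidea    # assumed chart label
  policyTypes:
    - Ingress
  ingress:
    - from:
        # namespaceSelector and podSelector in one element: both must match.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: sso
          podSelector:
            matchLabels:
              app.kubernetes.io/name: keycloak   # assumed label
      ports:
        - protocol: TCP
          port: 8080         # assumed privacyIDEA container port
---
# 3) Allow PostgreSQL traffic into databases only from sso and mfa.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-sso-mfa-to-postgres
  namespace: databases
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: sso
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: mfa
      ports:
        - protocol: TCP
          port: 5432
```

The sketch is deliberately incomplete: it omits the rule that lets the Traefik ingress controller reach privacyIDEA, and a CloudNativePG deployment would additionally need rules for the operator and for replication between database instances.
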

Verify StorageClass for PVCs.
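
For the cert-manager check in this task, a throwaway `Certificate` is one way to confirm that issuance works end to end. The sketch below assumes a `ClusterIssuer` named `letsencrypt-staging` already exists; that name, the secret name, and the test hostname are all placeholders.

```yaml
# Sketch: issuer, secret name and hostname are placeholders.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-cert
  namespace: sso
spec:
  secretName: test-cert-tls          # cert-manager writes the key pair here
  dnsNames:
    - test.yourdomain.com
  issuerRef:
    name: letsencrypt-staging        # assumed ClusterIssuer
    kind: ClusterIssuer
```

Issuance has worked once `kubectl get certificate -n sso` reports the certificate as ready; the test resource can then be deleted.
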

**Done when:** namespaces exist, NetworkPolicies applied and tested (verify
denied paths), cert-manager issues a test certificate.

---

### T03 — Phase 2: PostgreSQL deployment (Keycloak + privacyIDEA DBs)

```task
id: NK-WP-0001-T03
state_hub_task_id: 7fa60004-deb2-4db5-a470-f95dda07f6ab
status: todo
priority: high
```

Deploy PostgreSQL via CloudNativePG operator (preferred: aligns with
ThreePhoenix HA posture) or Bitnami Helm chart as fallback. Create:

- Database `keycloak_db`, user `keycloak`
- Database `privacyidea_db`, user `privacyidea`

Store DB credentials as K8s Secrets (or ExternalSecrets from vault).
Configure automated DB backups to object storage (S3 or MinIO).
**Run a restore drill before proceeding** — a failed restore later is a
critical blocker.
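
On the CloudNativePG route, the Keycloak database and its scheduled backup might look like the sketch below. The resource names, storage size, bucket path, MinIO endpoint, and Secret names are all hypothetical, and whether the in-tree `barmanObjectStore` section or a backup plugin is appropriate depends on the operator version in use. A `privacyidea_db` cluster would follow the same pattern.

```yaml
# Sketch under assumptions: names, sizes, bucket and Secrets are hypothetical.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: keycloak-pg
  namespace: databases
spec:
  instances: 3                 # 1 is enough on a single-node k3s instance
  storage:
    size: 10Gi
  bootstrap:
    initdb:
      database: keycloak_db
      owner: keycloak
      secret:
        name: keycloak-db-credentials   # basic-auth Secret (username/password)
  backup:
    retentionPolicy: "30d"
    barmanObjectStore:
      destinationPath: s3://pg-backups/keycloak
      endpointURL: http://minio.example.internal:9000   # only needed for MinIO
      s3Credentials:
        accessKeyId:
          name: backup-s3-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-s3-credentials
          key: ACCESS_SECRET_KEY
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: keycloak-pg-daily
  namespace: databases
spec:
  schedule: "0 0 2 * * *"      # six fields, seconds first: daily at 02:00
  cluster:
    name: keycloak-pg
```

A restore drill would then bootstrap a second, throwaway `Cluster` from the same object store via `bootstrap.recovery` and check that the data is readable.
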

**Done when:** both DBs live, credentials in K8s Secrets, backup running,
restore drill passed.

---

### T04 — Phase 3: Deploy privacyIDEA (MFA core)

```task
id: NK-WP-0001-T04
state_hub_task_id: 6ad1296a-a488-4031-b665-f77030e971ed
status: todo
priority: high
```

Deploy privacyIDEA via `gpappsoft/privacyidea` Helm chart (Artifact Hub) or
custom manifests (Deployment + Service + Ingress + PVC + Secrets). Key
Helm values:

```yaml
database:
  password: <from-vault>
privacyidea:
  config:
    SECRET_KEY: <from-vault>
    PI_PEPPER: <from-vault>
  encfile:
    enabled: true
    existingSecret: privacyidea-secrets
    key: PI_ENCFILE
ingress:
  enabled: true
  hostname: pi.yourdomain.com
  tls: true
```

Create K8s Secrets: `privacyidea-config`, `privacyidea-enckey`,
`privacyidea-auditkeys`. Configure Ingress + TLS. Add rate-limiting and
WAF rules at Traefik level.
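
Rate-limiting at the Traefik level can be sketched as a `Middleware` resource. The limits below are illustrative numbers rather than tuned values, and the middleware name is hypothetical; older Traefik v2 releases use the `traefik.containo.us/v1alpha1` API group instead.

```yaml
# Sketch: limits are illustrative, not values from the workplan.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: pi-rate-limit          # hypothetical name
  namespace: mfa
spec:
  rateLimit:
    average: 20                # average requests per second allowed per source
    burst: 40                  # short bursts tolerated above the average
```

A standard Ingress would reference it through the annotation `traefik.ingress.kubernetes.io/router.middlewares: mfa-pi-rate-limit@kubernetescrd`.
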

**Bootstrap (single-credential moment):**
1. `kubectl exec` into pod, run `pi-manage admin add pi-admin` — password
   comes from vault (only time a password is typed).
2. Immediately enroll MFA for `pi-admin` (TOTP or hardware token).
3. Create `trigger-admin` with `triggerchallenge` right only.
4. Apply policies: WebUI restricted to VPN/office IPs; MFA required for
   all admin actions.

**Done when:** privacyIDEA reachable at pi.yourdomain.com with valid TLS,
pi-admin enrolled with MFA, trigger-admin created, rate-limiting active.

---

### T05 — Phase 4: Deploy Keycloak (SSO core)

```task
id: NK-WP-0001-T05
state_hub_task_id: b9f73aa6-9035-4643-9905-64e73a29b298
status: todo
priority: high
```

Build a **custom Keycloak image** that includes the privacyIDEA Provider JAR:

```dockerfile
FROM quay.io/keycloak/keycloak:<version>
COPY PrivacyIDEA-Provider.jar /opt/keycloak/providers/
RUN /opt/keycloak/bin/kc.sh build
```

Deploy via official Keycloak Operator (CRD-based) or codecentric KeycloakX
Helm chart. Configure:

- DB: `keycloak_db` (credentials from K8s Secret)
- Ingress + TLS: `keycloak.yourdomain.com` (Traefik + cert-manager)
- Hostname strictness + proxy mode (Traefik forward headers)
- Metrics/logging (Prometheus annotations)
- Admin bootstrap secret from vault
- Realm import strategy: GitOps-friendly (realm JSON in git or CR)
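
On the Operator route, several of these settings map onto fields of the `Keycloak` custom resource. The sketch below is an assumption-laden starting point, not the final configuration: the image registry path, the Secret name, and the database host (a hypothetical CloudNativePG read-write Service) are placeholders, and the `proxy.headers` field is only available in recent operator versions.

```yaml
# Sketch for the Operator route; image, Secret names and DB host are assumptions.
apiVersion: k8s.keycloak.org/v2alpha1
kind: Keycloak
metadata:
  name: keycloak
  namespace: sso
spec:
  instances: 2
  image: registry.example.com/netkingdom/keycloak-privacyidea:<version>
  db:
    vendor: postgres
    host: keycloak-pg-rw.databases.svc   # hypothetical read-write Service
    database: keycloak_db
    usernameSecret:
      name: keycloak-db-credentials
      key: username
    passwordSecret:
      name: keycloak-db-credentials
      key: password
  http:
    httpEnabled: true          # TLS terminates at Traefik in this sketch
  hostname:
    hostname: keycloak.yourdomain.com
  proxy:
    headers: xforwarded        # trust X-Forwarded-* headers set by Traefik
  ingress:
    enabled: false             # Ingress for Traefik is managed separately
```
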

**Done when:** Keycloak reachable with valid TLS, admin console accessible,
custom image with privacyIDEA JAR deployed and verified.

---

### T06 — Phase 5: Realm config & MFA authentication flow

```task
id: NK-WP-0001-T06
state_hub_task_id: 3b6379a4-a27b-4d25-82be-bc600879f036
status: todo
priority: medium
```

In Keycloak:

1. Create/configure realm; set identity source of truth (Keycloak internal
   users recommended for initial deployment; LDAP/AD or Entra as extension).
2. Create Authentication Flow "privacyIDEA Browser":
   - Add privacyIDEA execution step (REQUIRED)
   - Config: privacyIDEA URL = `https://pi.yourdomain.com`, service account
     = `trigger-admin` (secret from K8s Secret)
   - Optional: bypass group (break-glass) with strict restrictions + alerts
3. Set this flow as the default browser flow.
4. Require MFA step-up for admin console and sensitive OIDC clients.

Test:
- Normal user: password → MFA OTP → session established
- Admin console: MFA required
- Failure modes: wrong OTP, token missing, privacyIDEA unreachable
- Break-glass: bypass works, alert fires

**Done when:** end-to-end auth works for normal and admin paths, all failure
modes handled gracefully.

---

### T07 — Phase 6: User management, policies & self-service portal

```task
id: NK-WP-0001-T07
state_hub_task_id: c7cf902a-b480-4545-a536-293070945206
status: todo
priority: medium
```

Decide and implement identity source of truth (Keycloak internal →
privacyIDEA Keycloak resolver, or LDAP/AD shared). The privacyIDEA 3.12+
Keycloak user resolver simplifies alignment.

Define policies in privacyIDEA:
- Allowed token types: TOTP, hardware (YubiKey), passkey
- Enrollment rules (who can self-enroll, which token types)
- Admin rights separation: super-admin vs. helpdesk-admin

Enable self-service portal at `pi-account.yourdomain.com` for user token
enrollment/replacement.

Configure auditing and log shipping: privacyIDEA audit logs + Keycloak
events → centralized logging (ELK/Loki or equivalent). Token lifecycle
policies: enrollment, revocation, re-enrollment on device loss.

**Done when:** policies documented and applied, self-service portal live,
audit logs flowing.

---

### T08 — Phase 7: Backups, DR, break-glass & monitoring

```task
id: NK-WP-0001-T08
state_hub_task_id: 9cbd1d89-b5bf-491e-9d16-b1c7d57076fb
status: todo
priority: medium
```

**Backups:**
- DB backups: Keycloak + privacyIDEA (Velero or CloudNativePG scheduled
  backup to S3/MinIO). Test restore.
- privacyIDEA encryption/audit key Secrets: encrypted export, versioned.
- Keycloak realm exports: stored as JSON in git (GitOps-friendly).

**Disaster recovery drill** (mandatory before production):
1. Restore DB + keys into a fresh namespace.
2. Verify token validation still works — this catches key/secret mistakes.

**Break-glass procedure:**
- Disabled-by-default Keycloak admin path or group exemption.
- Break-glass credentials stored offline + vault. Alert (PagerDuty/webhook)
  on every use.

**Monitoring:**
- Prometheus scraping Keycloak + privacyIDEA metrics.
- Grafana dashboards: auth success/failure rates, MFA challenge latency,
  token count by type.
- Alert: privacyIDEA unreachable (blocks all logins).
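
The "privacyIDEA unreachable" alert could be sketched as a `PrometheusRule`, assuming the Prometheus Operator CRDs are installed and a scrape job named `privacyidea` exists. Both are assumptions; if privacyIDEA is only probed from outside, a blackbox-style `probe_success` expression would take the place of `up`.

```yaml
# Sketch: assumes Prometheus Operator and a scrape job named "privacyidea".
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: privacyidea-availability
  namespace: mfa
spec:
  groups:
    - name: privacyidea.rules
      rules:
        - alert: PrivacyIDEAUnreachable
          # Fires when the scrape target has been down for two minutes.
          expr: 'up{job="privacyidea"} == 0'
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "privacyIDEA is unreachable; MFA logins will fail."
```
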

**Final validation:**
- All external traffic: Ingress + HSTS + strict TLS.
- NetworkPolicies verified (no unintended open paths).
- End-to-end: app → Keycloak → privacyIDEA OTP → SSO session established.
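
For the HSTS item, one option is a Traefik headers `Middleware` attached to the Keycloak and privacyIDEA routers. The name and the one-year max-age below are assumptions for illustration.

```yaml
# Sketch: middleware name and max-age are illustrative choices.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: hsts-headers           # hypothetical name
  namespace: sso
spec:
  headers:
    stsSeconds: 31536000       # Strict-Transport-Security max-age of one year
    stsIncludeSubdomains: true
    forceSTSHeader: true       # add the header even when the connection is HTTP
```

A quick check is `curl -sI https://keycloak.yourdomain.com` and looking for the `Strict-Transport-Security` response header.
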

**Done when:** DR drill passed, monitoring live, break-glass procedure
documented and tested, HSTS and NetworkPolicies verified.

---

## Deliverables Checklist

- [ ] Vault created; all secrets generated and encrypted ops bundle exported
- [ ] `sso`, `mfa`, `databases` namespaces + NetworkPolicies deployed
- [ ] TLS everywhere via cert-manager (Traefik ingress)
- [ ] PostgreSQL live; both DBs created; backup + restore tested
- [ ] privacyIDEA running at `pi.yourdomain.com`; pi-admin MFA enrolled;
      trigger-admin created with least-privilege rights
- [ ] Keycloak running from custom image including privacyIDEA Provider JAR
- [ ] Keycloak "privacyIDEA Browser" flow enforced as default
- [ ] Realm exported to git; admin secret from vault
- [ ] Self-service portal live; token lifecycle policies defined
- [ ] DR drill passed; monitoring live; break-glass documented and tested

## Open Questions / Extension Points

- **Vault backend**: KeePassXC (simple) vs HashiCorp Vault in-cluster
  (rotation, audit trail). Start with KeePassXC; upgrade to Vault when
  ThreePhoenix cluster is stable.
- **Identity source of truth**: Keycloak-internal vs LDAP/AD/Entra.
  Decision needed before T07.
- **GitOps tooling**: ArgoCD or Flux for declarative Helm management?
  Aligns with Railiance staged-promotion-lifecycle workstream.
- **Cluster target**: Development on single-node k3s; production on
  ThreePhoenix (3-node HA). Workplan covers both; HA-specific steps noted
  where they diverge.