generated from coulomb/repo-seed
docs(workplan): update NK-WP-0001 with resolved decisions D1/D2/D3
- Add Decisions table summarising D1 (KeePassXC→Vault), D2 (Keycloak-internal hybrid + file-based bootstrap), D3 (plain Helm, AI-first philosophy)
- Split T01 into Phase 0a (pre-cluster KeePassXC) and Phase 0b (in-cluster Vault transition) per D1
- Update T05 to explicitly reference D3 (plain Helm first)
- Update T06 to state the D2 identity decision rather than re-opening it
- Update T07: remove "decide" language, implement decided approach, add D2 bootstrap user management scope note
- Update T08: add Vault unseal key backup to the backup list
- Replace Open Questions with remaining unresolved items (5 items)
- Add DECISIONS.md (decision log auto-generated by State Hub)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -8,7 +8,7 @@ owner: worsch
 topic_slug: netkingdom
 state_hub_workstream_id: 39263c4b-ef70-4053-b782-350834b7e1be
 created: "2026-02-28"
-updated: "2026-02-28"
+updated: "2026-03-01"
 ---

 # SSO & MFA Platform — Keycloak + privacyIDEA on Kubernetes
@@ -36,6 +36,17 @@ this plan picks the most concrete and production-aligned choices from each:
 - **Custom Keycloak image** (both) — JAR baked into image via `kc.sh build`
   rather than `kubectl cp`; clean GitOps pattern.

+## Decisions
+
+All three pending decisions from the first session have been resolved
+(2026-03-01, decided by Tegwick). Full rationale in `DECISIONS.md`.
+
+| ID | Decision | Outcome |
+|----|----------|---------|
+| D1 | Vault backend | **KeePassXC pre-cluster → HashiCorp Vault in-cluster.** Bootstrap on KeePassXC before a cluster is available; transition to Vault once K3s is operational. |
+| D2 | Identity source of truth | **Hybrid: Keycloak-internal + LDAP/Entra federation** for enterprise tier. Plus a **file-based bootstrap** user store for pre-Keycloak dev/test/sandbox systems. |
+| D3 | GitOps tooling | **Plain Helm to start, upgrade to Flux when warranted.** Development philosophy: AI-first (TDD, API-first/headless, MCP layer, CLI tooling; UI is low-priority and lives in separate repos). |
+
 ## Architecture

 ```
@@ -55,7 +66,8 @@ this plan picks the most concrete and production-aligned choices from each:
 ▼
 [PostgreSQL] (CloudNativePG, namespace: databases)
 │
-[Vault / K8s Secrets] ← single credential unlocks
+[HashiCorp Vault] ← single credential unlocks (in-cluster)
+[KeePassXC] ← pre-cluster bootstrap / dev/test/sandbox
 ```

 **Namespaces:** `sso` (Keycloak), `mfa` (privacyIDEA), `databases`
@@ -82,9 +94,13 @@ status: todo
 priority: critical
 ```

-Create the vault (KeePassXC .kdbx or self-hosted Bitwarden; HashiCorp Vault
-for later production hardening). Generate and store all secrets inside the
-vault — never typed again:
+**Decision D1 applies:** Two-phase vault strategy.
+
+**Phase 0a — Pre-cluster KeePassXC bootstrap (do this first, before K8s):**
+
+Create a KeePassXC `.kdbx` database as the initial secret store. Keep the
+KeePassXC master password in a personal password manager. Generate and store
+all bootstrap secrets inside KeePassXC:

 - privacyIDEA: `SECRET_KEY` (64+ chars), `PI_PEPPER` (32+ chars),
   `PI_ENCFILE` content (`pi-manage create_enckey`).
@@ -94,11 +110,20 @@ vault — never typed again:
 - Break-glass: admin credentials + offline recovery OTP seed.

 Export an age-encrypted ops bundle (encrypted tar of all secret YAML
-manifests). Enable K8s encryption-at-rest. Confirm secret injection
-strategy: External Secrets Operator + Vault backend, or sops/age for GitOps.
+manifests). Store offsite.

-**Done when:** vault created, all secrets generated, encrypted ops bundle
-exported and stored offsite. Secret injection strategy decided.
+**Phase 0b — HashiCorp Vault in-cluster (after T02, once K3s is running):**
+
+Deploy HashiCorp Vault in the cluster (Helm chart). Migrate secrets from
+KeePassXC into Vault. Enable K8s encryption-at-rest. Choose and implement
+secret injection strategy: External Secrets Operator + Vault backend, or
+Vault Agent Injector (ESO preferred for GitOps alignment). KeePassXC
+remains the source of truth for dev/test/sandbox systems that do not connect
+to the cluster Vault.
+
+**Done when:** KeePassXC created and all secrets generated (0a). Vault
+deployed in-cluster, secrets migrated, injection strategy operational (0b).
+Encrypted ops bundle exported and stored offsite.

 ---

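The Phase 0a secret generation step above could be sketched roughly as follows. This is a minimal illustration, not part of the workplan: it uses Python's standard `secrets` module to produce random values at the minimum lengths the plan states (`SECRET_KEY` 64+ chars, `PI_PEPPER` 32+ chars). The names `generate_bootstrap_secrets`, `random_secret`, and the alphanumeric alphabet are assumptions chosen for the example. `PI_ENCFILE` is left out on purpose, since the plan creates it with `pi-manage create_enckey`.

```python
import secrets
import string

# Assumed alphabet: letters and digits only, to avoid quoting issues in config files.
ALPHABET = string.ascii_letters + string.digits

# Minimum lengths taken from the workplan text (T01 Phase 0a).
MIN_LENGTHS = {"SECRET_KEY": 64, "PI_PEPPER": 32}


def random_secret(length: int) -> str:
    """Return a cryptographically random string of exactly `length` characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_bootstrap_secrets() -> dict[str, str]:
    """Generate one value per named secret at its minimum required length."""
    return {name: random_secret(n) for name, n in MIN_LENGTHS.items()}


if __name__ == "__main__":
    for name, value in generate_bootstrap_secrets().items():
        # Report only the length, never the value, to keep secrets out of terminal logs.
        print(f"{name}: generated {len(value)} characters")
```

In practice the generated values would be entered into the KeePassXC database rather than written to disk; how they get there (manual entry or a CLI) is not specified by the plan.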
@@ -142,7 +167,8 @@ ThreePhoenix HA posture) or Bitnami Helm chart as fallback. Create:
 - Database `keycloak_db`, user `keycloak`
 - Database `privacyidea_db`, user `privacyidea`

-Store DB credentials as K8s Secrets (or ExternalSecrets from vault).
+Store DB credentials as K8s Secrets injected from Vault (T01 Phase 0b must
+be complete, or use placeholder K8s Secrets until Vault is live).
 Configure automated DB backups to object storage (S3 or MinIO).
 **Run a restore drill before proceeding** — a failed restore later is a
 critical blocker.
@@ -216,10 +242,11 @@ COPY PrivacyIDEA-Provider.jar /opt/keycloak/providers/
 RUN /opt/keycloak/bin/kc.sh build
 ```

-Deploy via official Keycloak Operator (CRD-based) or codecentric KeycloakX
-Helm chart. Configure:
+Deploy via plain Helm chart (official Keycloak Operator CRD-based or
+codecentric KeycloakX Helm chart; **decision D3: plain Helm first, Flux
+later**). Configure:

-- DB: `keycloak_db` (credentials from K8s Secret)
+- DB: `keycloak_db` (credentials from Vault / K8s Secret)
 - Ingress + TLS: `keycloak.yourdomain.com` (Traefik + cert-manager)
 - Hostname strictness + proxy mode (Traefik forward headers)
 - Metrics/logging (Prometheus annotations)
@@ -242,8 +269,9 @@ priority: medium

 In Keycloak:

-1. Create/configure realm; set identity source of truth (Keycloak internal
-   users recommended for initial deployment; LDAP/AD or Entra as extension).
+1. Create/configure realm. **Decision D2 applies:** identity source of truth
+   is Keycloak-internal users. LDAP/AD and Entra federation is deferred to
+   the enterprise tier (not in scope for this workplan phase).
 2. Create Authentication Flow "privacyIDEA Browser":
    - Add privacyIDEA execution step (REQUIRED)
    - Config: privacyIDEA URL = `https://pi.yourdomain.com`, service account
@@ -272,9 +300,13 @@ status: todo
 priority: medium
 ```

-Decide and implement identity source of truth (Keycloak internal →
-privacyIDEA Keycloak resolver, or LDAP/AD shared). The privacyIDEA 3.12+
-Keycloak user resolver simplifies alignment.
+**Decision D2 applies:** identity source of truth is Keycloak-internal with
+the privacyIDEA Keycloak resolver. Implement (not decide):
+
+- Configure privacyIDEA 3.12+ Keycloak user resolver to align Keycloak
+  users with privacyIDEA token ownership.
+- LDAP/Entra federation: explicitly out of scope for this phase; tracked as
+  an enterprise-tier extension point.

 Define policies in privacyIDEA:
 - Allowed token types: TOTP, hardware (YubiKey), passkey
@@ -288,8 +320,17 @@ Configure auditing and log shipping: privacyIDEA audit logs + Keycloak
 events → centralized logging (ELK/Loki or equivalent). Token lifecycle
 policies: enrollment, revocation, re-enrollment on device loss.

+**Bootstrap user management (D2 extension — scope TBD):**
+D2 also specifies a file-based lightweight user store for pre-Keycloak
+systems (dev/test/sandbox that do not connect to the cluster). Users stored
+as files in a secure subdirectory of the Linux home directory; auto-generates
+two test users with `N` / `+testN` username and email suffixes. Test users
+must not spill over into other systems; a mapping mechanism from sandbox
+identities to production should be provided. This scope is not yet captured
+in a task — see Open Questions.
+
 **Done when:** policies documented and applied, self-service portal live,
-audit logs flowing.
+audit logs flowing, Keycloak resolver configured.

 ---

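One possible reading of the file-based bootstrap user store described above is sketched below. Everything concrete here is an assumption, because the plan marks this scope as TBD: the JSON file format, the field names, the `sandbox` marker, and the function names `derive_test_users` and `write_user_store` are all hypothetical. The sketch interprets "`N` / `+testN` suffixes" as appending `N` to the username and `+testN` to the email local part, and it restricts the directory and files to the owning user.

```python
import json
import os
import tempfile
from pathlib import Path


def derive_test_users(username: str, email: str, count: int = 2) -> list[dict]:
    """Derive test users: username gets suffix N, email local part gets +testN."""
    local, domain = email.split("@", 1)
    return [
        {
            "username": f"{username}{n}",
            "email": f"{local}+test{n}@{domain}",
            "sandbox": True,  # assumed marker so other systems can reject these users
        }
        for n in range(1, count + 1)
    ]


def write_user_store(store_dir: Path, users: list[dict]) -> list[Path]:
    """Write one JSON file per user into a directory accessible only by the owner."""
    store_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(store_dir, 0o700)  # enforce the mode even if the directory already existed
    paths = []
    for user in users:
        path = store_dir / f"{user['username']}.json"
        # Create with 0600 so a new file is never briefly group- or world-readable.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(user, fh, indent=2)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Demo in a temporary directory; a real store would live under the home directory.
    with tempfile.TemporaryDirectory() as tmp:
        users = derive_test_users("alice", "alice@example.org")
        for p in write_user_store(Path(tmp) / "bootstrap-users", users):
            print(p.name)
```

The mapping from sandbox identities to production identities that the plan calls for is not covered here, since its shape is still an open question.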
@@ -307,6 +348,7 @@ priority: medium
   backup to S3/MinIO). Test restore.
 - privacyIDEA encryption/audit key Secrets: encrypted export, versioned.
 - Keycloak realm exports: stored as JSON in git (GitOps-friendly).
+- Vault unseal keys and root token: offline copy in KeePassXC.

 **Disaster recovery drill** (mandatory before production):
 1. Restore DB + keys into a fresh namespace.
@@ -335,7 +377,9 @@ documented and tested, HSTS and NetworkPolicies verified.

 ## Deliverables Checklist

-- [ ] Vault created; all secrets generated and encrypted ops bundle exported
+- [ ] KeePassXC vault created; all secrets generated and encrypted ops bundle exported
+- [ ] HashiCorp Vault deployed in-cluster; secrets migrated from KeePassXC
+- [ ] Secret injection strategy chosen and operational (ESO + Vault or Vault Agent)
 - [ ] `sso`, `mfa`, `databases` namespaces + NetworkPolicies deployed
 - [ ] TLS everywhere via cert-manager (Traefik ingress)
 - [ ] PostgreSQL live; both DBs created; backup + restore tested
@@ -347,15 +391,37 @@ documented and tested, HSTS and NetworkPolicies verified.
 - [ ] Self-service portal live; token lifecycle policies defined
 - [ ] DR drill passed; monitoring live; break-glass documented and tested

-## Open Questions / Extension Points
+## Open Questions

-- **Vault backend**: KeePassXC (simple) vs HashiCorp Vault in-cluster
-  (rotation, audit trail). Start with KeePassXC; upgrade to Vault when
-  ThreePhoenix cluster is stable.
-- **Identity source of truth**: Keycloak-internal vs LDAP/AD/Entra.
-  Decision needed before T07.
-- **GitOps tooling**: ArgoCD or Flux for declarative Helm management?
-  Aligns with Railiance staged-promotion-lifecycle workstream.
-- **Cluster target**: Development on single-node k3s; production on
-  ThreePhoenix (3-node HA). Workplan covers both; HA-specific steps noted
-  where they diverge.
+The three original pending decisions (D1 vault backend, D2 identity source
+of truth, D3 GitOps tooling) have all been resolved. See `DECISIONS.md`.
+
+Remaining open items:
+
+1. **Secret injection strategy** — D1 resolves the vault backend (Vault
+   in-cluster) but the concrete injection mechanism is still open: External
+   Secrets Operator vs Vault Agent Injector. Should be decided and closed
+   in T01 Phase 0b.
+
+2. **File-based bootstrap user management (D2 extension)** — D2 specifies
+   a lightweight file-based user store for pre-Keycloak environments. This
+   is non-trivial scope (file format, test-user generation, isolation
+   controls, production-mapping mechanism) and is not captured in any
+   current task. Needs a decision: is this a task within this workplan, or
+   a separate workplan/repo?
+
+3. **AI-first / MCP layer (D3 extension)** — D3 establishes an AI-first
+   development philosophy (TDD, API-first/headless, MCP layer, CLI
+   tooling). This workplan currently covers only infrastructure deployment.
+   Should Keycloak/privacyIDEA operations (user management, policy CRUD,
+   token lifecycle) be wrapped in an MCP server or CLI? If so, this needs
+   a new task or workplan.
+
+4. **LDAP/Entra federation** — Explicitly deferred to the enterprise tier
+   (D2). Track as an extension point when the time comes.
+
+5. **Cluster target for dev/test** — D1 implies KeePassXC-based systems
+   run independently of the cluster. The plan assumes single-node k3s for
+   dev and ThreePhoenix for production. The sequencing between T01 Phase 0a
+   (pre-cluster) and Phase 0b (in-cluster) should be confirmed once the
+   Railiance cluster timeline is clearer.