net-kingdom/sso-mfa/WORKPLAN.md
Bernd Worsch 2bbe328aec docs(sso-mfa): record T04 blocker — wrong image reference (ImagePullBackOff)
privacyidea/privacyidea:3.12 does not exist on Docker Hub. Pod is deployed
but stuck. Correct image reference must be identified before proceeding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 17:16:35 +00:00

# SSO-MFA Platform — Stack Migration Workplan
# NK-WP-0001 — Keycloak → Authelia + LLDAP + KeyCape
**Updated:** 2026-03-20 (T04 BLOCKED: ImagePullBackOff; T05–T08 pending T04)
**Workstream:** sso-mfa-platform (39263c4b-ef70-4053-b782-350834b7e1be)
## Stack Decision
The previous Keycloak + privacyIDEA stack is replaced by the following (privacyIDEA is retained as the MFA engine):
- **LLDAP** — lightweight LDAP directory (user store)
- **Authelia** — authentication frontend (password auth + OIDC upstream)
- **KeyCape** — OIDC orchestration layer (auth code flow + MFA via privacyIDEA adapter)
- **privacyIDEA** — MFA engine (unchanged, still in `mfa` namespace)
Hostnames: kc.coulomb.social (KeyCape), auth.coulomb.social (Authelia), lldap.coulomb.social (LLDAP admin)
## Task Status
| Task | ID (hub) | Status | Notes |
|------|----------|--------|-------|
| T01 — Vault & secret bootstrap | 7992528c | done | |
| T02 — K8s foundations | 721ca6b2 | done | Manifests authored; pending live cluster |
| T03 — PostgreSQL | 7fa60004 | done | Manifests authored; pending live cluster |
| T04 — privacyIDEA | 6ad1296a | **BLOCKED** | Pod deployed, ImagePullBackOff — image privacyidea/privacyidea:3.12 does not exist; fix image ref first |
| T05 — SSO core (new stack) | b9f73aa6 | done | commit 0754dc3 |
| T06 — Realm config & MFA flow | 3b6379a4 | **in-progress** | See below |
| T07 — User mgmt & self-service | c7cf902a | **in-progress** | See below |
| T08 — Backups, DR, break-glass | 9cbd1d89 | **in-progress** | See below |
## T04 — privacyIDEA
### Deliverables (already authored)
- [x] `k8s/privacyidea/pvc.yaml` — privacyidea-data and privacyidea-logs PVCs
- [x] `k8s/privacyidea/configmap.yaml` — pi.cfg template (secrets injected at runtime)
- [x] `k8s/privacyidea/create-secrets.sh` — privacyidea-config Secret
- [x] `k8s/privacyidea/deployment.yaml` — Deployment + Service (port 8080)
- [x] `k8s/privacyidea/middleware.yaml` — rate-limit + admin IP allowlist (ipWhiteList, Traefik v2)
- [x] `k8s/privacyidea/ingress.yaml` — pink.coulomb.social + pink-account.coulomb.social
- [x] `k8s/privacyidea/enckey-bootstrap.sh` — extract enckey + audit keys post-start
- [x] `k8s/privacyidea/bootstrap-admin.sh` — create pi-admin + trigger-admin
- [x] `k8s/verify-t04.sh` — verify pod, service, middlewares, ingresses, TLS, secrets, PVCs
### BLOCKER — wrong image (2026-03-20)
- Pod `privacyidea-8b4b5f567-wf858` is deployed in `mfa` namespace but stuck in `ImagePullBackOff`
- Image `privacyidea/privacyidea:3.12` does not exist on Docker Hub
- **Intermediate step needed:** identify correct image reference, then patch `deployment.yaml`
- Candidates: `ghcr.io/privacyidea/privacyidea-apache2:<tag>` or similar
- Port may differ (manifest assumes 8080 — verify against actual image)
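
One way to narrow down the image reference is to probe each candidate against its registry before editing `deployment.yaml`. The sketch below assumes the `docker` CLI is available on the workstation; `image_exists` and `first_existing_image` are hypothetical helpers, not part of this repo, and the tag is deliberately left unfilled.

```shell
#!/bin/sh
# Sketch: find the first image reference that actually resolves in its registry.
# Assumes the docker CLI is installed; private registries also need `docker login`.

# Succeeds when the registry serves a manifest for the given reference.
image_exists() {
    docker manifest inspect "$1" >/dev/null 2>&1
}

# Prints the first resolvable reference among the arguments;
# returns non-zero if none of them resolves.
first_existing_image() {
    for ref in "$@"; do
        if image_exists "$ref"; then
            printf '%s\n' "$ref"
            return 0
        fi
    done
    return 1
}

# Example call (the tag is a placeholder that still has to be identified):
#   first_existing_image "ghcr.io/privacyidea/privacyidea-apache2:<tag>"
```

A resolvable reference only settles the pull error; the container port still has to be verified against the chosen image, as noted above.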
### Pending (needs live cluster)
- [ ] Fix image in `deployment.yaml` — confirm correct registry/tag
- [ ] `./create-secrets.sh` — create privacyidea-config Secret in mfa namespace (may already exist; check first)
- [ ] `kubectl apply -f pvc.yaml -f configmap.yaml -f middleware.yaml -f deployment.yaml -f ingress.yaml`
- [ ] Wait for pod Running/Ready (up to 3 min — DB migrations run on first boot)
- [ ] `./enckey-bootstrap.sh` — extract enckey+auditkeys, store in KeePassXC, create DR Secrets
- [ ] `./bootstrap-admin.sh` — create pi-admin and trigger-admin
- [ ] Log in to pink.coulomb.social, enroll TOTP for pi-admin, verify MFA challenge
- [ ] Run `../verify-t04.sh` — 0 FAILs
- [ ] Commit and mark T04 done
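
The apply-and-wait steps in this list could be scripted roughly as follows. This is a sketch under assumptions: it is run from `k8s/privacyidea/`, the Deployment is named `privacyidea` (inferred from the pod name, not confirmed), and the 180 s timeout mirrors the 3-minute allowance for first-boot migrations. `apply_and_wait` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: apply the T04 manifests and wait for the rollout to finish.
# Assumptions: current directory is k8s/privacyidea/, the Deployment is named
# "privacyidea" (inferred from the pod name) and lives in the "mfa" namespace.

apply_and_wait() {
    # kubectl takes one -f per file; bare extra file names are rejected.
    kubectl apply \
        -f pvc.yaml -f configmap.yaml -f middleware.yaml \
        -f deployment.yaml -f ingress.yaml || return 1
    # Block until the rollout completes or 3 minutes have elapsed.
    kubectl rollout status -n mfa deployment/privacyidea --timeout=180s
}
```

`rollout status` exits non-zero on timeout, so the function's exit code can gate the later bootstrap scripts.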
### Done-criteria for T04
- privacyIDEA pod Running+Ready in mfa namespace
- pink.coulomb.social and pink-account.coulomb.social reachable with valid TLS
- pi-admin and trigger-admin accounts exist
- pi-admin has enrolled a TOTP token and MFA challenge fires on login
- privacyidea-enckey and privacyidea-auditkeys Secrets exist (DR copies)
- verify-t04.sh: 0 FAILs
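
For orientation, a "0 FAILs" gate can be built from a small tally helper like the one below. This is an illustrative sketch, not the contents of `verify-t04.sh`; the Deployment name `privacyidea` is an assumption, while the Secret names and hostnames are taken from the criteria above.

```shell
#!/bin/sh
# Sketch of a PASS/FAIL tally; NOT the real verify-t04.sh.
FAILS=0

# check DESCRIPTION COMMAND [ARGS...]: runs the command and records the outcome.
check() {
    desc="$1"; shift
    if "$@" >/dev/null 2>&1; then
        printf 'PASS  %s\n' "$desc"
    else
        printf 'FAIL  %s\n' "$desc"
        FAILS=$((FAILS + 1))
    fi
}

# A few checks mirroring the done-criteria. The Deployment name is assumed.
run_t04_checks() {
    check "deployment rolled out" kubectl rollout status -n mfa deployment/privacyidea --timeout=10s
    check "enckey Secret"         kubectl get secret -n mfa privacyidea-enckey
    check "auditkeys Secret"      kubectl get secret -n mfa privacyidea-auditkeys
    # curl validates the certificate chain by default; -f fails on HTTP errors.
    check "TLS pink"              curl -fsS -o /dev/null https://pink.coulomb.social
    check "TLS pink-account"      curl -fsS -o /dev/null https://pink-account.coulomb.social
}
```

After `run_t04_checks`, a caller can exit with `[ "$FAILS" -eq 0 ]` to turn the tally into a pass/fail status.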
## T05 — SSO Core (new stack: LLDAP + Authelia + KeyCape)
### Done
- [x] LLDAP manifests: pvc.yaml, deployment.yaml, middleware.yaml, ingress.yaml, create-secrets.sh
- [x] Authelia manifests: pvc.yaml, configmap.yaml, deployment.yaml, ingress.yaml, create-secrets.sh
- [x] KeyCape manifests: deployment.yaml, middleware.yaml, ingress.yaml, create-secrets.sh
- [x] NetworkPolicy: netpol-sso.yaml updated for new components
- [x] Keycloak manifests staged for deletion
### Completed (this session)
- [x] keycape/create-pi-token.sh
- [x] lldap/README.md
- [x] authelia/README.md
- [x] keycape/README.md
- [x] Update CONFIG.md (fixed CP-NK-004, removed old CP-NK-005, added CP-NK-005 auth.*, CP-NK-006 lldap.*)
- [x] Update bootstrap/gen-secrets.sh (removed Keycloak, added LLDAP/Authelia/KeyCape sections)
- [x] Update k8s/README.md (network policy table)
- [x] Replace verify-t05.sh (Keycloak → LLDAP+Authelia+KeyCape checks)
- [x] Commit all changes — commit 0754dc3
- [x] Update state hub tasks — T05 marked done, milestone event logged
### Done-criteria for T05
- All manifests present and consistent
- gen-secrets.sh generates correct secrets for new stack
- verify-t05.sh checks all three components
- Committed to main
## T06 — Realm config & MFA flow (KeyCape → privacyIDEA)
### Deliverables
- [x] `k8s/privacyidea/bootstrap-realm.sh` — creates LLDAP resolver, "netkingdom" realm, enrollment + passthru policies
- [x] `k8s/verify-t06.sh` — verifies realm, resolver, KeyCape→PI token, connectivity
### In Progress (this session)
- [ ] Run `bootstrap-realm.sh` on live cluster (requires T04 applied)
- [ ] Run `keycape/create-pi-token.sh` then `keycape/create-secrets.sh` (inject real PI token)
- [ ] Restart KeyCape with updated keycape-config
- [ ] Enroll a TOTP token for pi-admin via pink-account.coulomb.social
- [ ] Test end-to-end login via kc.coulomb.social
- [ ] Run `verify-t06.sh` — all checks pass
- [ ] Commit and mark T06 done
### Done-criteria for T06
- privacyIDEA "netkingdom" realm exists with LLDAP resolver
- LDAP resolver resolves users from LLDAP
- keycape-pi-token contains a real (non-placeholder) JWT
- KeyCape→privacyIDEA token list API returns status=True
- At least one user has enrolled a TOTP token
- verify-t06.sh: 0 FAILs
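
The "real (non-placeholder) JWT" criterion can be approximated with a shape check: a JWT consists of three non-empty base64url segments separated by dots. The sketch below tests only that shape, not the signature or expiry. The Secret name comes from this workplan; the `sso` namespace and the `token` data key are assumptions to adjust, and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: distinguish a JWT-shaped value from a placeholder string.
# Checks shape only (three base64url segments); does not validate the token.
looks_like_jwt() {
    printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$'
}

# Hypothetical wiring: the namespace ("sso") and data key ("token") are
# assumptions; only the Secret name is taken from the workplan.
pi_token_is_real() {
    token=$(kubectl get secret keycape-pi-token -n sso \
        -o jsonpath='{.data.token}' | base64 -d) || return 1
    looks_like_jwt "$token"
}
```

A token that passes this check can still be rejected by privacyIDEA, so the token list API call remains the authoritative test.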
## T07 — User mgmt & self-service
### Deliverables
- [x] `k8s/lldap/bootstrap-users.sh` — creates net-kingdom-users and net-kingdom-admins groups in LLDAP via GraphQL API
- [x] `k8s/lldap/break-glass.sh` — creates the break-glass bypass account and assigns to net-kingdom-admins
- [x] `k8s/verify-t07.sh` — verifies groups, break-glass user, self-service portal, OIDC client registrations
### Pending (needs live cluster)
- [ ] Run `lldap/bootstrap-users.sh` to create groups
- [ ] Run `lldap/break-glass.sh` to create break-glass account
- [ ] Add first real user via LLDAP WebUI (lldap.coulomb.social)
- [ ] Register first OIDC client in `keycape/create-secrets.sh` (clients: block)
- [ ] User self-enrolls TOTP at pink-account.coulomb.social
- [ ] Run `verify-t07.sh` — 0 FAILs
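
For reference, group creation through LLDAP's GraphQL API looks roughly like the sketch below. It is not the contents of `bootstrap-users.sh`. The `/api/graphql` path and the `createGroup` mutation reflect LLDAP's API as I understand it and should be checked against the schema of the deployed version; `LLDAP_URL`, `LLDAP_TOKEN`, and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: create an LLDAP group via GraphQL. NOT the repo's bootstrap-users.sh.
# Assumptions: LLDAP_URL points at the LLDAP web endpoint, LLDAP_TOKEN holds an
# admin JWT from LLDAP's login endpoint, and group names need no JSON escaping.

# Builds the GraphQL request body for a createGroup mutation.
create_group_payload() {
    printf '{"query":"mutation($name: String!) { createGroup(name: $name) { id displayName } }","variables":{"name":"%s"}}\n' "$1"
}

# Posts the mutation; -f makes curl fail on HTTP-level errors.
create_group() {
    create_group_payload "$1" | curl -fsS \
        -H "Authorization: Bearer $LLDAP_TOKEN" \
        -H "Content-Type: application/json" \
        --data-binary @- "$LLDAP_URL/api/graphql"
}
```

GraphQL servers commonly report failures in an `errors` field with HTTP 200, so a caller should inspect the response body rather than rely on curl's exit code alone.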
### Done-criteria for T07
- Groups net-kingdom-users and net-kingdom-admins exist in LLDAP
- break-glass user exists and is in net-kingdom-admins
- At least one regular user exists
- At least one OIDC client registered in KeyCape
- verify-t07.sh: 0 FAILs
## T08 — Backups, DR, break-glass
### Deliverables
- [x] `k8s/backup/cronjob-sqlite-backups.yaml` — daily SQLite backup CronJobs for LLDAP, Authelia, privacyIDEA; RBAC for Authelia scale-down/up
- [x] `k8s/backup/DR-RUNBOOK.md` — full restore runbook: scenarios, restore order, node rebuild procedure, offsite export
- [x] `k8s/verify-t08.sh` — verifies CronJobs, RBAC, backup files on PVCs, DR runbook presence
### Pending (needs live cluster)
- [ ] Apply `backup/cronjob-sqlite-backups.yaml`
- [ ] Trigger each CronJob manually once to verify that it runs cleanly:
  `kubectl create job -n sso --from=cronjob/lldap-backup lldap-backup-test`
  `kubectl create job -n sso --from=cronjob/authelia-backup authelia-backup-test`
  `kubectl create job -n mfa --from=cronjob/privacyidea-backup pi-backup-test`
- [ ] Confirm backup files appear on PVCs
- [ ] Run offsite export: pull backup files, encrypt with age, store offsite
- [ ] Run `verify-t08.sh` — 0 FAILs
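
The manual trigger step can be combined with a wait so that a failed backup is noticed immediately. A minimal sketch follows, using the CronJob names and namespaces from the commands above; the 300 s timeout is an arbitrary assumption and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: trigger one test Job per backup CronJob and wait for completion.
# CronJob names and namespaces come from the workplan; the timeout is a guess.

# run_backup_test NAMESPACE CRONJOB JOBNAME
run_backup_test() {
    ns="$1"; cronjob="$2"; job="$3"
    kubectl create job -n "$ns" --from="cronjob/$cronjob" "$job" || return 1
    kubectl wait -n "$ns" --for=condition=complete "job/$job" --timeout=300s
}

# Runs the three test Jobs in sequence and stops at the first failure.
run_all_backup_tests() {
    run_backup_test sso lldap-backup lldap-backup-test &&
    run_backup_test sso authelia-backup authelia-backup-test &&
    run_backup_test mfa privacyidea-backup pi-backup-test
}
```

Note that `kubectl wait --for=condition=complete` keeps waiting until the timeout when a Job fails, so a timeout here usually means the Job logs need a look.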
### Done-criteria for T08
- All three backup CronJobs deployed and have ≥1 successful run
- Backup files confirmed on PVCs
- DR-RUNBOOK.md reviewed by operator
- Offsite ops bundle current (pack-bundle.sh run after all secrets finalised)
- verify-t08.sh: 0 FAILs
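
The offsite export of backup files (pull, encrypt with age, store offsite) could be sketched as below. This is not `pack-bundle.sh`. It assumes `tar` and `age` are installed and that `AGE_RECIPIENT` holds an age public key; `encrypt_backups` and the paths passed to it are hypothetical.

```shell
#!/bin/sh
# Sketch: bundle a directory of pulled backup files and encrypt it with age.
# Assumptions: tar and age are installed; AGE_RECIPIENT holds an age public
# key. This is NOT the repo's pack-bundle.sh.

# encrypt_backups SOURCE_DIR OUTPUT_FILE
encrypt_backups() {
    src_dir="$1"; out_file="$2"
    # tar streams a gzip archive to stdout; age encrypts stdin to the recipient.
    tar -czf - -C "$src_dir" . | age -r "$AGE_RECIPIENT" -o "$out_file"
}
```

The pipeline's exit status is that of `age`, so a tar failure can go unnoticed; decrypting the bundle once with the matching identity and listing the archive is a reasonable sanity check before the file leaves the cluster.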