NK-WP-0001-T04 (privacyIDEA, Keycloak path) -> cancelled, superseded by NK-WP-0003-T04 in the deployed KeyCape stack. T05-T08 (Keycloak SSO, realm/MFA flow, user mgmt, DR) -> cancelled and migrated to NK-WP-0011. NK-WP-0011 reframes the deferred Keycloak work as expanded-mode enterprise federation: Keycloak as an identity broker for Entra ID / AD / SAML that issues IAM Profile-conformant tokens, refined against the current stack (OpenBao runtime secrets, CloudNativePG, flex-auth/Topaz PDP, recursive platform/tenant model) rather than the original greenfield assumptions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id | depends_on | supersedes_tasks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NK-WP-0011 | workplan | Enterprise Federation & SAML — Expanded-Mode Keycloak Identity Broker | netkingdom | net-kingdom | proposed | worsch | netkingdom | 2026-05-20 | 2026-05-20 | TBD | | |
NK-WP-0011 — Enterprise Federation & SAML (Expanded-Mode Keycloak)
Extracted from NK-WP-0001 (T05–T08, the deferred Keycloak path) and refined against where net-kingdom actually stands today: a deployed KeyCape lightweight stack, an OpenBao runtime-secret authority, and a recursive platform/tenant authorization model. This is the expanded identity mode described in the architecture document (docs/platform-identity-security-architecture.md).
Goal
Stand up Keycloak as an identity broker that federates upstream enterprise identity providers (Microsoft Entra ID / Azure AD via OIDC, on-prem Active Directory via LDAP, and generic SAML 2.0 IdPs) and issues NetKingdom IAM Profile-conformant tokens downstream — without displacing flex-auth as the authorization decision point or breaking the recursive platform/tenant boundary.
This is the answer to the long-standing open question "when does the platform switch from key-cape lightweight mode to Keycloak expanded mode?" — expanded mode exists specifically to onboard identities that originate in an external enterprise IdP, which the lightweight Authelia + LLDAP stack cannot broker.
Why this is not just "resume NK-WP-0001"
NK-WP-0001 assumed a greenfield deployment: bootstrap Vault, build PostgreSQL, treat Keycloak as the internal user store. None of those assumptions hold now:
| NK-WP-0001 assumption | Current reality | Effect on this plan |
|---|---|---|
| HashiCorp Vault, bootstrapped from KeePassXC | OpenBao is the runtime secret authority (NK-WP-0006); SOPS/age + agent bootstrap exist (NK-WP-0004/0005) | Keycloak DB + admin secrets come from OpenBao via ESO; no new vault bootstrap |
| PostgreSQL built from scratch | CloudNativePG running on RAILIANCE01 (NK-WP-0003) | Add keycloak_db to the existing operator, reuse backup pattern |
| Keycloak is the internal source of truth (D2 hybrid) | KeyCape lightweight stack is the deployed IAM Profile issuer | Keycloak is a broker/federation front-end, not the primary user store |
| Authorization via Keycloak Authorization Services | flex-auth + Topaz is the canonical PDP (ADR-0006) | Keycloak AuthZ Services is at most an optional adapter, never canonical |
| Single-tenant Coulomb deployment | Recursive tenant:platform vs tenant:coulomb model (NK-WP-0006) | Realm-per-tenant; tenant admins must not receive platform-root |
| MFA solely via privacyIDEA provider JAR | privacyIDEA deployed and upstream IdPs carry their own MFA | MFA assurance source becomes a decision, not a default |
Architecture
           Enterprise IdPs (upstream)
  Entra ID (OIDC)   AD (LDAP)    SAML 2.0 IdP
         │              │              │
         └──────────────┼──────────────┘
                        ▼
                  [ Keycloak ]  expanded-mode broker
                        │  realm-per-tenant; IAM Profile issuer
                        │  secrets ← OpenBao (ESO)
                        │  MFA ← privacyIDEA *or* upstream assurance
                        ▼
    NetKingdom IAM Profile token (OIDC/PKCE)
                        │
                        ├──► applications (depend on the Profile, not the provider)
                        └──► flex-auth / Topaz ── authorization decision (PDP)
  coexists with: KeyCape lightweight issuer (id.coulomb.social)
Keycloak answers identity (who, how authenticated, coarse claims, assurance). It does not answer resource authorization — that stays in flex-auth (ADR-0006). It does not store runtime secrets — those stay in OpenBao.
Scope
In scope:
- decision record for expanded-mode adoption: trigger, federation topology (broker vs SAML SP), realm-per-tenant model, and coexistence with the KeyCape lightweight issuer
- custom Keycloak image (privacyIDEA provider JAR if MFA is delegated to privacyIDEA) and Helm deployment on RAILIANCE01
- upstream federation: Entra ID (OIDC), AD (LDAP), generic SAML 2.0 IdP
- claim mapping to the NetKingdom IAM Profile (issuer, audience, subject, groups, tenant, assurance evidence) and IAM Profile conformance checks
- MFA / assurance source decision and enforcement of step-up for privileged actions
- recursive tenancy: realm-per-tenant, platform-root guardrails, and the flex-auth/Topaz authorization boundary
- backups, DR, break-glass, monitoring, and audit shipping for the broker
Out of scope:
- replacing flex-auth/Topaz with Keycloak Authorization Services
- migrating the deployed lightweight stack off KeyCape (coexistence only)
- application-side OIDC client code (apps target the IAM Profile spec)
- deploying OpenBao itself (Railiance platform) — consumed, not built
- tenant-specific federation policy for tenants beyond tenant:platform and tenant:coulomb
Tasks
id: NK-WP-0011-T1
status: todo
priority: high
Decision record — expanded-mode adoption & federation topology. Write
an ADR (ADR-0009) capturing: the concrete trigger for switching a tenant
from lightweight to expanded mode; whether Keycloak acts as an OIDC
identity broker, a SAML service provider, or both; the realm-per-tenant
mapping onto tenant:platform / tenant:coulomb; how the Keycloak issuer
coexists with the KeyCape issuer (id.coulomb.social) so applications
still target one IAM Profile contract; and the canonical hostname/issuer
for the broker. Resolve or supersede D2 from NK-WP-0001.
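Once the ADR fixes the selection rule, per-tenant issuer selection could reduce to a small lookup like the sketch below. This is illustrative only: the mode labels, the `TenantIdentityConfig` type, and the broker hostname `broker.example.invalid` are placeholders (the canonical hostname is a T1 outcome), and the `https` scheme on id.coulomb.social is assumed. The `/realms/<realm>` suffix follows Keycloak's usual issuer URL layout.

```python
from dataclasses import dataclass
from typing import Optional

KEYCAPE_ISSUER = "https://id.coulomb.social"    # hostname from this plan; scheme assumed
BROKER_BASE = "https://broker.example.invalid"  # placeholder until T1 picks the hostname


@dataclass(frozen=True)
class TenantIdentityConfig:
    """Hypothetical per-tenant record produced by the T1 decision."""
    tenant: str                  # e.g. "tenant:coulomb"
    mode: str                    # "lightweight" or "expanded" (assumed labels)
    realm: Optional[str] = None  # Keycloak realm, only used in expanded mode


def issuer_for(cfg: TenantIdentityConfig) -> str:
    """Return the issuer URL an application should trust for this tenant."""
    if cfg.mode == "lightweight":
        return KEYCAPE_ISSUER
    if cfg.mode == "expanded":
        if not cfg.realm:
            raise ValueError(f"{cfg.tenant}: expanded mode requires a realm")
        return f"{BROKER_BASE}/realms/{cfg.realm}"
    raise ValueError(f"{cfg.tenant}: unknown identity mode {cfg.mode!r}")
```

Applications would still validate tokens against the one IAM Profile contract; only the trusted issuer differs per tenant.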
id: NK-WP-0011-T2
status: todo
priority: high
PostgreSQL keycloak_db on the existing operator. Add the keycloak_db
database and a matching role to the CloudNativePG instance from NK-WP-0003
(do not deploy a new database). Source credentials from OpenBao via ESO into
a K8s Secret. Confirm the existing backup schedule covers the new database
and run a restore drill for keycloak_db specifically.
id: NK-WP-0011-T3
status: todo
priority: high
Deploy expanded-mode Keycloak. Build a custom image
(kc.sh build, privacyIDEA provider JAR included only if T5 delegates MFA
to privacyIDEA). Deploy via plain Helm on RAILIANCE01 behind Traefik +
cert-manager at the issuer hostname from T1. Admin bootstrap secret and DB
secret come from OpenBao/ESO — never typed, never in git. Hostname
strictness + proxy headers configured for Traefik. Realm import is
GitOps-friendly (realm JSON/CR in git).
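Keeping realm JSON in git only works if no secret ever lands in it. A pre-commit or CI guard along the following lines could enforce that. This is a sketch under assumptions: the key names (`secret`, `clientSecret`, `bindCredential`) are the ones Keycloak realm exports commonly use for client, identity-provider, and LDAP credentials; the `**********` mask and `${VAR}` placeholder conventions should be verified against the Keycloak version actually deployed; `find_plaintext_secrets` is a hypothetical helper, not part of any existing tooling.

```python
MASK = "**********"  # value Keycloak exports typically write in place of a secret
SECRET_KEYS = {"secret", "clientSecret", "bindCredential"}  # assumed key names


def find_plaintext_secrets(node, path="$"):
    """Return JSON paths whose secret-like keys hold a literal value."""
    findings = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in SECRET_KEYS:
                values = value if isinstance(value, list) else [value]
                for v in values:
                    # Masked values and ${ENV} placeholders are acceptable in git.
                    if isinstance(v, str) and v and v != MASK and not v.startswith("${"):
                        findings.append(child)
                        break
            else:
                findings.extend(find_plaintext_secrets(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            findings.extend(find_plaintext_secrets(item, f"{path}[{index}]"))
    return findings
```

A CI step would load each realm file with `json.load` and fail the build when the returned list is non-empty.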
id: NK-WP-0011-T4
status: todo
priority: high
Upstream federation. Configure identity brokering for Entra ID (OIDC), on-prem AD (LDAP user federation), and a generic SAML 2.0 IdP. Map each source's claims/attributes into the NetKingdom IAM Profile shape: issuer, audience, subject, groups, tenant, and assurance evidence. Define the attribute/claim mappers and group→role mapping. Verify a federated login end-to-end for at least the Entra ID path.
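In Keycloak the mapping itself is configured through identity-provider and LDAP mappers, but the intended normalization can be written down as a small executable model for review and testing. The sketch below is an assumption throughout: the IAM Profile field names (`tenant`, `assurance`), the `source:subject` prefixing, and the default-deny handling of unmapped groups are illustrative choices, not taken from the IAM Profile spec. `oid`, `sAMAccountName`, and `memberOf` are the attributes Entra ID and Active Directory commonly expose; the SAML attribute names vary per IdP.

```python
from dataclasses import dataclass

# Which upstream attribute carries the subject and the groups, per source (assumed).
SOURCE_FIELDS = {
    "entra-oidc": {"subject": "oid", "groups": "groups"},
    "ad-ldap": {"subject": "sAMAccountName", "groups": "memberOf"},
    "saml": {"subject": "NameID", "groups": "groups"},
}


@dataclass(frozen=True)
class ProfileClaims:
    """Hypothetical IAM Profile claim set; field names are placeholders."""
    iss: str
    aud: str
    sub: str
    groups: tuple
    tenant: str
    assurance: dict


def map_to_profile(source, upstream, *, issuer, audience, tenant, group_map, assurance):
    """Normalize one upstream identity into the profile shape."""
    if source not in SOURCE_FIELDS:
        raise ValueError(f"unknown federation source {source!r}")
    fields = SOURCE_FIELDS[source]
    subject = upstream.get(fields["subject"])
    if not subject:
        raise ValueError(f"{source}: upstream identity has no {fields['subject']}")
    raw_groups = upstream.get(fields["groups"], [])
    if isinstance(raw_groups, str):  # LDAP may return a single value
        raw_groups = [raw_groups]
    # Default deny: groups without an explicit mapping are dropped.
    mapped = sorted({group_map[g] for g in raw_groups if g in group_map})
    return ProfileClaims(
        iss=issuer,
        aud=audience,
        sub=f"{source}:{subject}",  # prefix avoids collisions between sources
        groups=tuple(mapped),
        tenant=tenant,
        assurance=dict(assurance),
    )
```

Dropping unmapped groups keeps an upstream directory change from silently granting a downstream role.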
id: NK-WP-0011-T5
status: todo
priority: medium
MFA / assurance source. Decide and implement the assurance model: MFA enforced by privacyIDEA (via the Keycloak provider JAR + a "privacyIDEA Browser" flow, carried over from NK-WP-0001) versus trusting upstream IdP MFA (e.g. Entra Conditional Access) and reflecting it as assurance evidence in the token. Require step-up for admin console and platform-root-sensitive clients. Ensure assurance evidence is carried in the IAM Profile token so flex-auth can gate privileged actions on it.
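Whichever source wins, the step-up rule can be expressed as a pure function over the assurance evidence, which makes it easy to test before it is encoded in a Keycloak flow or a flex-auth policy. The following is a minimal sketch: the evidence keys (`mfa`, `source`, `auth_time`), the 15-minute freshness window, and the client id `platform-root-cli` are assumptions; `security-admin-console` is the client id Keycloak uses for its admin console.

```python
# Clients that always require fresh MFA (second entry is a hypothetical example).
PRIVILEGED_CLIENTS = frozenset({"security-admin-console", "platform-root-cli"})
TRUSTED_MFA_SOURCES = frozenset({"privacyidea", "upstream"})  # outcome of the T5 decision
MAX_MFA_AGE_SECONDS = 15 * 60  # assumed freshness window


def needs_step_up(client_id, assurance, now):
    """Return True when the session must re-authenticate with MFA.

    `assurance` is the evidence carried in the token; `now` and
    `assurance["auth_time"]` are Unix timestamps in seconds.
    """
    if client_id not in PRIVILEGED_CLIENTS:
        return False
    if not assurance.get("mfa"):
        return True
    if assurance.get("source") not in TRUSTED_MFA_SOURCES:
        return True
    auth_time = assurance.get("auth_time")
    if auth_time is None:
        return True  # no timestamp means freshness cannot be proven
    return now - auth_time > MAX_MFA_AGE_SECONDS
```

Missing or untrusted evidence fails closed, so a token from an upstream IdP that omits its MFA signal is treated the same as one without MFA.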
id: NK-WP-0011-T6
status: todo
priority: high
IAM Profile conformance & downstream coexistence. Run IAM Profile conformance checks against the Keycloak issuer (discovery document, PKCE, token/claim shape, JWKS, userinfo). Verify an application configured for the IAM Profile can authenticate against either the KeyCape or the Keycloak issuer per the T1 selection rule. Document per-tenant issuer selection.
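Part of the conformance run is a static check of the issuer's discovery document. A minimal sketch of that check follows; the field names are the standard OpenID Connect Discovery / OAuth authorization server metadata names, but which of them the IAM Profile actually makes mandatory is an assumption here, and `check_discovery` is a hypothetical helper rather than the profile's conformance tool.

```python
REQUIRED_FIELDS = (
    "issuer",
    "authorization_endpoint",
    "token_endpoint",
    "jwks_uri",
    "userinfo_endpoint",
)


def check_discovery(doc, expected_issuer):
    """Return a list of conformance problems; an empty list means the check passed."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not doc.get(name)]
    if doc.get("issuer") != expected_issuer:
        problems.append("issuer does not match the configured issuer")
    if "S256" not in doc.get("code_challenge_methods_supported", []):
        problems.append("PKCE S256 not advertised")
    if "code" not in doc.get("response_types_supported", []):
        problems.append("authorization code flow not advertised")
    return problems
```

The same function can be pointed at the document fetched from either issuer's `/.well-known/openid-configuration`, which gives a like-for-like comparison between KeyCape and Keycloak.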
id: NK-WP-0011-T7
status: todo
priority: high
Recursive tenancy & authorization boundary. Implement realm-per-tenant with platform-root guardrails: tenant admins manage only their realm and must not be able to alter IAM Profile semantics, the platform realm, federation trust, OpenBao platform mounts, or audit retention (per the flex-auth/Topaz implications in the architecture doc). Confirm flex-auth + Topaz remains the PDP; if a Keycloak Authorization Services adapter is used at all, document it as a delegated, non-canonical adapter.
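The guardrail amounts to a small truth table, written out below as a sketch so it can be reviewed and used as a test oracle. The role labels and resource identifiers are hypothetical names for the items listed in this task; in practice the rule would live in flex-auth/Topaz policy and Keycloak's admin permission setup, not in application code.

```python
# Resources only platform-root may change (names are illustrative labels).
PLATFORM_ROOT_ONLY = frozenset({
    "iam-profile-semantics",
    "platform-realm",
    "federation-trust",
    "openbao-platform-mounts",
    "audit-retention",
})


def admin_change_allowed(actor_role, actor_realm, target_realm, resource):
    """Decide whether an administrative change stays inside the tenant boundary."""
    if actor_role == "platform-root":
        return True
    if actor_role != "tenant-admin":
        return False  # unknown roles get nothing
    if resource in PLATFORM_ROOT_ONLY:
        return False  # never delegable to a tenant admin
    return actor_realm == target_realm  # tenant admins manage only their own realm
```

For example, `admin_change_allowed("tenant-admin", "coulomb", "coulomb", "federation-trust")` is `False` even though the realm matches, because federation trust is reserved for platform-root.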
id: NK-WP-0011-T8
status: todo
priority: medium
Backups, DR, break-glass, monitoring, audit. Realm exports to git; DB backup + restore drill (T2); break-glass admin path disabled-by-default with alerting on use; Prometheus/Grafana for auth success/failure, MFA latency, federation errors. Ship Keycloak events to the durable platform audit sink alongside flex-auth/Topaz/OpenBao records, with correlation ids — satisfying the "Audit sink" and "Break-glass" rows of the production-readiness checklist.
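To line Keycloak events up with flex-auth/Topaz/OpenBao records, each event has to be normalized and tagged with a correlation id before it reaches the sink. A sketch of that normalization is below. The input keys (`time` in epoch milliseconds, `type`, `realmId`, `clientId`, `userId`, `ipAddress`, `error`, `details`) follow Keycloak's event representation; the output record shape, the `break_glass` flag, and `to_audit_record` itself are assumptions, since the audit sink schema is not defined in this plan.

```python
import uuid
from datetime import datetime, timezone


def to_audit_record(event, correlation_id=None, break_glass_subjects=frozenset()):
    """Convert one Keycloak event dict into a hypothetical platform audit record."""
    failed = bool(event.get("error")) or event["type"].endswith("_ERROR")
    subject = event.get("userId")
    return {
        "source": "keycloak",
        # Reuse the caller's id when the request already carries one.
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "timestamp": datetime.fromtimestamp(
            event["time"] / 1000, tz=timezone.utc
        ).isoformat(),
        "event_type": event["type"],
        "realm": event.get("realmId"),
        "client": event.get("clientId"),
        "subject": subject,
        "ip": event.get("ipAddress"),
        "outcome": "failure" if failed else "success",
        # Any use of a break-glass account should raise an alert downstream.
        "break_glass": subject in break_glass_subjects,
        "details": dict(event.get("details") or {}),
    }
```

Passing the correlation id through from the ingress request, rather than generating one here, is what allows a single login to be traced across the broker, the PDP, and OpenBao.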
Acceptance Criteria
- An ADR records the expanded-mode trigger, federation topology, realm-per-tenant model, and KeyCape/Keycloak issuer coexistence.
- A federated user from at least one enterprise IdP (Entra ID) can log in and receive an IAM Profile-conformant token with tenant + assurance claims.
- Keycloak secrets originate from OpenBao; none are bootstrapped from KeePassXC or committed to git.
- flex-auth + Topaz remains the authorization decision point; Keycloak is not the canonical policy engine.
- Tenant admins cannot cross the platform-root boundary.
- Keycloak audit events land in the durable platform audit sink with correlation ids, and a DR/break-glass drill has passed.
Open Questions / Dependencies on Other Repos
- key-cape: does coexistence require KeyCape changes, or can both issuers serve the same IAM Profile unchanged? (EP-NK-001 federation extension point.)
- flex-auth: needs a confirmed claim/decision-envelope contract for tenant + assurance evidence sourced from a federated token.
- railiance-platform: OpenBao must expose a Keycloak auth role / ESO path before T3; unseal/break-glass story must be ready.
- IAM Profile spec: must be versioned and have an executable conformance check before T6 can pass (see "Missing" below).