# Platform Identity and Security Architecture

Status: draft architecture baseline for NetKingdom/Railiance/Coulomb

Date: 2026-05-17

## Purpose

This document captures the production-oriented identity, authorization, MFA, credential, and bootstrap architecture for the platform we are building. It deliberately treats Coulomb as the first internal tenant and reference workload, not as the platform itself.

The architecture must be recursive: the same platform that protects future tenants also protects the services and repositories used to build and operate the platform. That recursion is useful, but it is also where many security designs accidentally collapse into self-administering root power. This document exists to prevent that.

## Core Model

```text
Bootstrap plane
  establishes initial trust before normal platform services exist

Platform control plane
  operates identity, MFA, secrets, policy, audit, and authorization

Tenant planes
  run Coulomb and future customer/project/domain workloads
```

Coulomb is the first internal tenant. It is also the reference tenant that helps validate the platform. It must not become the platform root of trust merely because it is first.

## Planes

### Bootstrap Plane

The bootstrap plane exists before the full platform is alive. It owns the minimal authority needed to create and recover the control plane.

Responsibilities:

- host provisioning and hardening
- root age/SOPS material and emergency bundles
- initial cluster access
- initial identity service deployment
- initial secret injection
- break-glass recovery
- transition to managed runtime authority

Owned primarily by `railiance-infra`, `railiance-cluster`, and the credential bootstrap work in `net-kingdom`.

### Platform Control Plane

The platform control plane owns shared security services.

Responsibilities:

- NetKingdom IAM Profile
- lightweight identity mode through key-cape
- expanded identity mode through Keycloak
- MFA/token lifecycle through privacyIDEA where applicable
- canonical authorization through flex-auth
- delegated authorization runtime through Topaz first, with other PDPs as adapters
- audit and explanation records
- platform service secrets and rotation

Owned conceptually by `net-kingdom`; deployed through the Railiance stack.

### Tenant Plane

Tenant planes are where workloads live. Coulomb is tenant zero/reference tenant; later tenants may be projects, customers, domains, sandboxes, or isolated deployments.

Responsibilities:

- protected services and repositories
- tenant-owned resources
- tenant-specific groups, policies, and service accounts
- local enforcement of authorization decisions
- workload audit events and diagnostics

Tenant administrators may manage their tenant resources. They must not be able to alter platform root trust, global identity configuration, platform break-glass material, or the policy pipeline that governs the platform itself.

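The boundary between tenant administration and platform authority can be illustrated with a minimal sketch. Everything here is hypothetical: the `Plane`, `Resource`, and `Admin` types and the `may_manage` function are illustration names, not part of any existing repo. The sketch also assumes that bootstrap-plane resources are never reachable through this normal administrative check and are handled only by break-glass procedures.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Plane(Enum):
    BOOTSTRAP = "bootstrap"
    PLATFORM_CONTROL = "platform-control"
    TENANT = "tenant"


@dataclass(frozen=True)
class Resource:
    name: str
    plane: Plane
    tenant: Optional[str] = None  # set only for tenant-plane resources


@dataclass(frozen=True)
class Admin:
    subject: str
    tenant_admin_of: frozenset = frozenset()  # tenant ids this subject administers
    platform_operator: bool = False


def may_manage(admin: Admin, resource: Resource) -> bool:
    """Return True only if the admin's authority matches the resource's plane."""
    if resource.plane is Plane.TENANT:
        # Tenant admins manage resources of their own tenant only.
        return resource.tenant in admin.tenant_admin_of
    if resource.plane is Plane.PLATFORM_CONTROL:
        # Tenant administration alone is never enough here.
        return admin.platform_operator
    # Assumption: bootstrap-plane resources are out of reach of this check.
    return False
```

With this shape, an administrator of `tenant:coulomb` can manage a Coulomb service but is denied on a platform-control resource such as global identity configuration, regardless of how many tenants they administer.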
## Component Responsibilities

| Component | Primary role | Must not become |
| --- | --- | --- |
| `net-kingdom` | canonical security architecture, IAM Profile, SSO/MFA, credential bootstrap decisions | a deployment repo for every stack layer |
| `key-cape` | lightweight IAM implementation of the NetKingdom IAM Profile | a general-purpose IAM platform or authorization engine |
| Keycloak | expanded-mode IAM and optional Keycloak Authorization Services adapter | the canonical model for all platform authorization |
| privacyIDEA | MFA/token authority, especially in lightweight/key-cape mode | a policy decision point for application resources |
| `flex-auth` | authorization control plane, CARING descriptors, policy packages, decision envelopes, audit/explain | an identity provider or backend-specific wrapper |
| Topaz | first delegated authorization runtime/PDP for flex-auth | the platform control plane or identity provider |
| Railiance repos | converged infrastructure, cluster, platform services, enablement, and app deployment | the source of security policy semantics |

## Identity Path

```text
Human/service/agent principal
  |
  v
NetKingdom IAM Profile
  |
  +-- lightweight mode: key-cape
  |     Authelia + LLDAP + privacyIDEA
  |
  +-- expanded mode: Keycloak
        Keycloak + LDAP/Entra federation + MFA integration
```

Applications depend on the IAM Profile, not on the concrete provider. key-cape is the lightweight profile implementation. Keycloak is the expanded-mode profile implementation. privacyIDEA provides MFA/token capabilities where the deployment mode uses it.

Identity answers: who is this actor, how was the actor authenticated, what coarse claims are asserted, and what assurance evidence exists?

Identity does not answer final resource-specific authorization.

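One way to picture "applications depend on the IAM Profile, not on the concrete provider" is a small interface with interchangeable implementations. The sketch below is an assumption throughout: `IdentityAssertion`, `IamProfileProvider`, and the two stand-in classes are hypothetical names, and the stand-ins return canned data rather than calling any real key-cape or Keycloak API. The four fields of the assertion mirror the four questions identity answers.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol, Tuple


@dataclass
class IdentityAssertion:
    subject: str                        # who is this actor
    auth_methods: Tuple[str, ...]       # how the actor was authenticated
    claims: Dict[str, str] = field(default_factory=dict)  # coarse claims
    assurance: Tuple[str, ...] = ()     # assurance evidence, e.g. MFA factors


class IamProfileProvider(Protocol):
    """Hypothetical profile contract that applications code against."""

    def authenticate(self, token: str) -> IdentityAssertion: ...


class LightweightStandIn:
    """Canned stand-in for lightweight mode; not a real key-cape client."""

    def authenticate(self, token: str) -> IdentityAssertion:
        return IdentityAssertion(
            subject=token,  # toy mapping: the token is treated as the subject
            auth_methods=("password", "totp"),
            claims={"mode": "lightweight"},
            assurance=("mfa-token",),
        )


class ExpandedStandIn:
    """Canned stand-in for expanded mode; not a real Keycloak client."""

    def authenticate(self, token: str) -> IdentityAssertion:
        return IdentityAssertion(
            subject=token,
            auth_methods=("oidc", "otp"),
            claims={"mode": "expanded"},
            assurance=("mfa-token",),
        )


def describe(provider: IamProfileProvider, token: str) -> str:
    """Application code sees only the profile, never the provider type."""
    assertion = provider.authenticate(token)
    return f"{assertion.subject} via {'+'.join(assertion.auth_methods)}"
```

Because `describe` accepts anything satisfying the protocol, switching deployment mode changes which provider is constructed, not the application code. Note that nothing in the assertion says whether the actor may touch a given resource; that question belongs to authorization.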
## Authorization Path

```text
Identity claims from IAM Profile
  |
  v
flex-auth
  resource registry
  policy packages
  CARING descriptors
  decision/audit/explain envelope
  |
  +-- standalone evaluator
  +-- Topaz delegated PDP
  +-- optional Keycloak AuthZ adapter
  +-- future OpenFGA/SpiceDB/OPA/Cedar adapters
  |
  v
Protected service enforcement
```

Authorization answers: may this actor perform this action on this resource in this context, and what explanation/audit/CARING metadata supports that answer?

Protected services enforce decisions locally. flex-auth is the canonical policy and decision boundary; delegated PDPs are runtime implementations behind it.

## Recursive Trust Rule

Normal tenant administration must never be sufficient to alter the platform root of trust.

This applies even when the tenant is Coulomb. Coulomb can be a tenant and a reference workload, but platform-root actions require platform control plane authority and appropriate bootstrap/break-glass safeguards.

Examples of platform-root actions:

- changing IAM Profile semantics
- rotating root bootstrap keys
- changing break-glass access
- changing global MFA requirements
- activating authorization policy that governs platform administration
- changing flex-auth/Topaz policy import pipelines
- changing audit retention or tamper-evidence settings

## Tenant Model

Every protected resource should belong to a tenant or to the platform control plane.

Suggested identifiers:

```text
tenant:platform     # platform control plane resources
tenant:coulomb      # first internal/reference tenant
tenant:sandbox:     # sandbox tenants
tenant:customer:    # future customer tenants
```

Tenant membership and platform membership are distinct. A subject may be an administrator in `tenant:coulomb` without being a platform operator.

CARING descriptors should explicitly identify scope and tenant when the access is tenant-scoped. Platform-scoped descriptors should be rare, audited, and usually condition-bound.

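The descriptor guidance above could be checked mechanically before a policy package is accepted. The following is a minimal sketch only: the `Descriptor` fields and the `lint_descriptor` function are hypothetical and do not reflect the actual CARING descriptor format, which this document does not define. The missing-conditions finding is advisory, since platform-scoped descriptors are described as "usually" rather than always condition-bound.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Descriptor:
    # Hypothetical shape; the real CARING descriptor schema may differ.
    subject: str
    action: str
    scope: str                          # "tenant" or "platform"
    tenant: Optional[str] = None        # e.g. "tenant:coulomb"
    conditions: Tuple[str, ...] = ()    # e.g. ("mfa-recent", "dual-control")


def lint_descriptor(d: Descriptor) -> List[str]:
    """Return a list of findings; an empty list means no findings."""
    findings: List[str] = []
    if d.scope == "tenant":
        # Tenant-scoped access must say which tenant it applies to.
        if not d.tenant or not d.tenant.startswith("tenant:"):
            findings.append("tenant-scoped descriptor must name its tenant")
    elif d.scope == "platform":
        # Advisory: platform scope without conditions deserves review.
        if not d.conditions:
            findings.append("platform-scoped descriptor has no conditions")
    else:
        findings.append(f"unknown scope: {d.scope!r}")
    return findings
```

A check like this keeps platform-scoped grants visible: each one either carries conditions or shows up as a finding that a reviewer has to accept explicitly.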
## Bootstrap To Runtime Transition

Production setup should move through explicit trust states:

1. **Bare host trust** - provisioned and verified by Railiance infra.
2. **Cluster trust** - Kubernetes runtime exists and is verified.
3. **Secret trust** - age/SOPS and emergency bundles are established.
4. **Bootstrap identity trust** - local/bootstrap identity can operate enough to install full identity services.
5. **Runtime identity trust** - key-cape or Keycloak becomes the normal IAM Profile issuer.
6. **Runtime authorization trust** - flex-auth and Topaz are initialized with platform and tenant policies.
7. **Tenant onboarding trust** - Coulomb and later tenants register resources and receive scoped authority.

Each transition needs a verification check and a rollback/recovery path.

## Production Topology

For an initial production-capable Coulomb deployment:

```text
railiance-infra
  host baseline, SSH, age keys, emergency material

railiance-cluster
  Kubernetes, ingress, cert-manager, network policy

railiance-platform
  PostgreSQL, object storage, secret management
  key-cape or Keycloak
  privacyIDEA where used
  flex-auth
  Topaz

railiance-apps
  Coulomb services as tenant:coulomb workloads
```

`net-kingdom` owns the architecture and standards. Railiance owns the converged deployment layers. Component repos own their implementation contracts.

## Orchestration Implication

A future orchestration repo may be justified, but only after the state machine is clear. It should not own resources directly. It should own safe sequencing across repos.

Possible responsibilities:

- verify Railiance preconditions
- initialize credential bootstrap
- deploy or validate identity services
- deploy or validate flex-auth and Topaz
- run IAM Profile conformance checks
- run authorization conformance checks
- produce a platform security readiness report

This orchestration layer should build on Railiance capabilities rather than bypassing the Railiance stack boundaries.

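The "safe sequencing" idea can be sketched as a small state machine over the seven trust states, where every transition carries its own verification check and rollback path. This is a sketch under stated assumptions: the state slugs in `TRUST_STATES`, the `Transition` type, and the `advance` function are hypothetical names, and the verify/rollback callables stand in for real checks that would live in the Railiance and component repos.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical slugs for the seven trust states, in their required order.
TRUST_STATES = [
    "bare-host",
    "cluster",
    "secret",
    "bootstrap-identity",
    "runtime-identity",
    "runtime-authorization",
    "tenant-onboarding",
]


@dataclass
class Transition:
    target: str                     # trust state this step establishes
    verify: Callable[[], bool]      # verification check for the step
    rollback: Callable[[], None]    # rollback/recovery path for the step


def advance(transitions: List[Transition]) -> List[str]:
    """Run transitions in order and return the trust states reached.

    Stops at the first failed verification and rolls back only that step.
    """
    reached: List[str] = []
    for expected, step in zip(TRUST_STATES, transitions):
        if step.target != expected:
            raise ValueError(
                f"out-of-order transition: {step.target!r}, expected {expected!r}"
            )
        if not step.verify():
            step.rollback()
            break
        reached.append(step.target)
    return reached
```

The sequencer owns no resources itself: it only refuses to skip a state and refuses to continue past a failed check. The list it returns is one possible input to a platform security readiness report.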
## Open Questions

- Where is the durable audit log stored for platform-root decisions?
- Which actions require dual control or human confirmation?
- How is break-glass use recorded when normal identity is unavailable?
- Which tenant metadata is required before a service can register resources with flex-auth?
- When does the platform switch from key-cape lightweight mode to Keycloak expanded mode?
- Does Topaz run centrally for the platform, per tenant, or per service for the first production deployment?