# NetKingdom Runtime Architecture

**Status:** draft (initial capture for NET-WP-0018-T02)

**Date:** 2026-06-03

**Context:** Documents the *as-deployed* runtime after the first successful bootstrap (0015-0017) + T06-adjacent polish (0019). Not an idealized future architecture. Specific enough to guide a scratch rebuild or rehearsal without rediscovering integration details. Incorporates pragmatic audit paths and known UE integration points/gaps per the persisted assessment.

This is the working system that survived the first bootstrap ceremony and is now the target for automation, validation, guide, and risk assessment in NET-WP-0018.

## Planes Model

(From platform-identity-security-architecture.md baseline)

- **Bootstrap plane**: Establishes initial trust before full platform services. Minimal authority for cluster access, initial identity/secret injection, break-glass recovery, transition to managed runtime. Owned by railiance-infra/cluster + net-kingdom credential bootstrap. Uses SOPS/age for at-rest + offline packets.
- **Platform control plane**: Shared security services (identity, MFA, secrets, policy, audit, authorization). net-kingdom owns canonical architecture/IAM Profile/SSO/MFA/bootstrap decisions; deployed via Railiance stack.
- **Tenant planes**: Workloads (Coulomb as tenant zero/reference). Must not alter platform root trust.

Recursive trust rule: Normal tenant admin (even Coulomb) must never suffice to alter platform root of trust (IAM Profile semantics, break-glass, global MFA, OpenBao root/unseal, flex-auth policy pipelines, audit retention, etc.).

## Identity Stores, MFA Realms, and OIDC Flows

**Lightweight mode (key-cape, current primary for bootstrap/internal):**

- Directory: LLDAP (https://lldap.coulomb.social for admin; internal for Authelia).
- SSO/Proxy: Authelia (over LLDAP).
- MFA/Token: privacyIDEA (self-service enrollment for TOTP; pi-admin for setup/repair; used for assurance on privileged actions).
- OIDC Provider: KeyCape (issuer https://kc.coulomb.social; conforms to NetKingdom IAM Profile v0.2).
  - KeyCape issues tokens with required claims: tenant, principal_type, groups, roles, scope/scp, assurance.
  - Registered clients include: netkingdom-bootstrap-console (for console OIDC login), openbao-admin (for OpenBao OIDC auth).
    - Redirects: http://localhost:8250/oidc/callback, http://127.0.0.1:8250/oidc/callback.
  - Groups/roles for bootstrap: net-kingdom-admins (for platform-admin OpenBao policy), net-kingdom-users (for scoped non-root).
- platform-root / king credential: dedicated LLDAP user (separate from personal accounts like tegwick). Password in operator password safe; TOTP via privacyIDEA; roles include platform-root-custodian, openbao-admin, identity-admin.

**Expanded mode:** Keycloak (for enterprise federation/SAML/Entra, complex realms, delegated admin). Not yet primary for bootstrap.

**Capability progression (C1 lightweight -> C2 MFA/token):**

- C1: Single-factor OIDC SSO over internal directory (key-cape: Authelia + LLDAP).
- C2a (light 2FA): Authelia built-in TOTP/WebAuthn.
- C2b (token authority): privacyIDEA for hardware tokens, many types, self-service, lifecycle.

Applications target the IAM Profile v0.2 contract (`canon/standards/iam-profile_v0.2.md`), not concrete providers.

**Token flows (high level):**

- Human/service -> Authelia/LLDAP or Keycloak -> KeyCape/Keycloak issues IAM Profile token -> claims to flex-auth (for authz) or directly to protected services / OpenBao OIDC.
- For bootstrap console: OIDC login verified to obtain platform-admin via KeyCape -> OpenBao.

## Authelia Handoff

Authelia acts as the SSO proxy/authenticator in lightweight mode, fronting the LLDAP directory + (where enabled) privacyIDEA MFA. It hands off the normalized identity to KeyCape for OIDC issuance. Used for day-to-day logins; email (e.g. bernd.worsch@gmail.com) is notification-only, not an auth source for privileged/root.
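
The required claims listed for KeyCape tokens in this section lend themselves to a small presence check. The following is a minimal sketch, not the console's actual verification code: the function name `missing_iam_profile_claims` is hypothetical, the payload values are made up, and it assumes the token signature and issuer were already verified elsewhere. It also assumes that either `scope` or `scp` satisfies the "scope/scp" requirement.

```python
from typing import Any, Mapping

# Claim names as listed for KeyCape tokens in this document.
REQUIRED_CLAIMS = ("tenant", "principal_type", "groups", "roles", "assurance")
# The document writes "scope/scp"; this sketch assumes either name satisfies it.
SCOPE_ALTERNATIVES = ("scope", "scp")


def missing_iam_profile_claims(claims: Mapping[str, Any]) -> list[str]:
    """Return required claim names absent from an already-verified token payload."""
    missing = [name for name in REQUIRED_CLAIMS if name not in claims]
    if not any(name in claims for name in SCOPE_ALTERNATIVES):
        missing.append("scope/scp")
    return missing


if __name__ == "__main__":
    # Hypothetical payload: every value is illustrative, not real token data.
    example = {
        "tenant": "example-tenant",
        "principal_type": "human",
        "groups": ["net-kingdom-users"],
        "roles": [],
        "scp": ["openid"],
    }
    print(missing_iam_profile_claims(example))  # prints ['assurance']
```

A check like this only confirms that the claims are present; whether their values match the IAM Profile v0.2 semantics is defined by `canon/standards/iam-profile_v0.2.md` and is not modeled here.
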
## OpenBao OIDC Admin Path and Secrets/Credential Path

**OpenBao as runtime secrets authority (post-bootstrap):**

- KV v2 for platform config.
- Dynamic DB creds, K8s auth/workload identity.
- Future object storage STS brokering.
- Audit devices, lease/revocation.
- Delivery: direct clients, External Secrets Operator -> K8s Secrets, CSI mounts.
- Auth: OIDC/JWT against KeyCape (maps claims/groups to policies, e.g. platform-admin for net-kingdom-admins group).
- platform-root can obtain platform-admin policy via KeyCape/MFA (proven in 0015/0017).
- Root token: revoked/dispositioned after init; used only for bootstrap/break-glass. Unseal keys in custody (age/SOPS protected, offline packets, king credential).

**Bootstrap to runtime transition:**

- SOPS/age for initial cluster secrets, emergency bundles, Git at-rest.
- Once OpenBao alive + configured (auth, mounts, policies, audit): switch to it as long-lived authority.
- Bootstrap-era creds/databases/access paths reviewed/rotated/cleaned before production reliance (see cleanup_complete, T03/T04 in 0017).

**Platform root custody (see docs/platform-root-custody.md):**

- Initial setup operator: tegwick / bernd.worsch@gmail.com (notification contact).
- King credential: dedicated, rarely used platform-root identity (break-glass only). Not day-to-day Gitea/email account.
- Temporary single-king custody (with MFA, encrypted offline, password-safe refs) allowed pre-prod; target two-of-three escrow.
- Never store unseal/root/OTP/private keys in Git, State Hub, email, shell history, etc.

## Bootstrap UI / Console State (Control Surface)

Implemented in `tools/security-bootstrap-console/security_bootstrap_console.py` (non-secret only; refuses live OpenBao init or secret collection).

**Current stage (post 0017/0019):** S6 - Reopen under custody.

**Key gates / posture (from metadata + console):**

- King credential kit prepared.
- Custody strategy approved (temporary-single-king).
- OpenBao preflight, init ceremony (attended only), initial config, KeyCape client, OIDC auth, admin login via KeyCape/MFA, root token disposition (revoked), restore drill, cleanup/rotation, platform reopened.
- audit_core_posture: bootstrap risk accepted (production sink not ready); owner, review date (2026-07-02), note recorded. See audit_core_posture_ready() / reason().
- Other: custodian age keys confirmed, mfa enrolled (TOTP via privacyIDEA), oidc_login_verified, no_secret_capture_confirmed, etc.
- .local/security-bootstrap.json holds non-secret flags (updated via console approve/validate flows).

**Available actions (status output + parser):** king-kit, custody-packet, openbao-preflight, handover-checklist, validate-* (t02, cleanup, lifecycle-flow, onboarding-dry-run), custody-roster-template, lifecycle-flow-template, lifecycle-guide, onboarding-dry-run-template, onboarding-dry-run (delegates to orchestrator), onboarding-dry-run-claims, lifecycle-cleanup-dryrun-users, validate-custody-roster, metadata-template, approve-custody-mode, web-ui, etc.

**Web UI:** Served locally (default :8765 or similar); forms for custody approval, responsibility, audit_core flags (production_sink_ready, bootstrap_risk_accepted + owner/review/note), cleanup_complete, platform_reopened. Uses JS to compute gates from metadata.

**Runbooks / payloads:** privacyIDEA realm repair, Key material compromised (taint), generate new unseal keys, emergency lock-down, restore drill, OpenBao token revocation, **User lifecycle dry-run (T06)** (from 0019: references dry-run-nonroot-user.sh, make security-bootstrap-onboarding-dry-run, console subcommands, NET-WP-0019).

**0019 polish additions (T06-adjacent):**

- dry-run-nonroot-user.sh orchestrator (/tmp workspace + EXIT trap cleanup; k8s fallback for LLDAP_ADMIN_PASS, never writing persistent bootstrap/secrets for test users; create --test non-root; verifications (MFA, KeyCape); optional GraphQL lock/offboard; populate + validate evidence.json).
- Console subcommands + make targets for repeatable dry-run, claims verification (infers from LLDAP groups + T01 role binding; warns on platform-root/admins), cleanup by pattern.
- Evidence templates/validators for onboarding dry-run (12+ bools: effective access preview, no secret material recorded, actor_class != king, groups limited to net-kingdom-users, lldap/keycape verified, etc.).
- Integrated into lifecycle-guide (T06 DRY-RUN section) and runbook_payloads for web-ui exposure.
- Safer secret handling in create-user.sh (k8s extract fallback).

**Evidence discipline:** /tmp/netkingdom-*-evidence.json (exact strings + bools); validated by console; non-secret only (refuses secret markers).

## State Hub Relation

- Tracks domain netkingdom (topic a6c6e745-bf54-4465-9340-1534a2be493e).
- Workstreams/tasks (e.g. this NET-WP-0018 id 800f9f16-..., 0019, 0017).
- Progress events (POST /progress/ with workstream/task for what was done; used for tracking impl + feeding retrospectives).
- Decisions (POST /decisions/ for key choices).
- Inbox for cross-agent coordination.
- .custodian-brief.md generated by fix-consistency (reflects file + DB).
- Used for audit correlation in pragmatic layer (events link to actors/decisions).

## k8s / Deployment, DNS, Routes, Ingress, Trust Boundaries

**Namespaces/components (from manifests + usage):**

- sso: LLDAP, privacyIDEA, KeyCape (keycape-config Secret), Authelia?
- openbao: OpenBao (0 pod; bao status via kubectl exec).
- Railiance platform services (DBs, etc.) for stateful backing.

**Ingress / DNS (internal .coulomb.social):**

- LLDAP admin: lldap.coulomb.social
- KeyCape: kc.coulomb.social (OIDC issuer)
- Console OIDC callbacks: localhost/127.0.0.1:8250
- Other platform services via railiance-cluster ingress + cert-manager + NetworkPolicies.

**Trust boundaries / token flows (high level):**

- Bootstrap: local files (SOPS/age), operator password safe, k8s secrets (lldap-credentials etc.), direct kubectl for dry-run safety.
- Runtime: OIDC tokens (IAM Profile claims) -> KeyCape/Keycloak -> flex-auth decisions or OpenBao OIDC (group->policy e.g. net-kingdom-admins -> platform-admin).
- Workload identity: K8s auth to OpenBao for dynamic creds.
- No tenant can reach platform-root paths without explicit platform control plane authority + custody safeguards.
- Break-glass: king credential + unseal shares (custody roster, age protected).

**Operational assumptions:**

- Live k3s/Railiance cluster with ingress/cert-manager/NetworkPolicy.
- Operator has kubectl access to sso/openbao ns + password safe for bootstrap material.
- Non-secret evidence + State Hub for progress; secrets never in Git/console.
- Capability-driven identity mode (lightweight key-cape sufficient for many cases).
- Audit is currently pragmatic/separate during bootstrap (see below).

## Pragmatic Audit Paths (Current Bootstrap)

Per Coordination Notes and assessment gap 7 (audit/event correlation):

- **local-identity/audit.py**: Append-only TSV log at ~/.local-identity/audit.log (columns: TS, command, username, outcome; mode 600; silent on I/O failure). For local-identity CLI + OIDC server events during bootstrap phase.
- **OpenBao audit**: Retained on audit PVC + Audit Core mock wiring only (production tenant-aware durable sink not ready; risk accepted with owner/review note).
- **State Hub + console evidence**: /progress/ events (with workstream/task/decision correlation), /decisions/, non-secret evidence.json (from templates + live data + validators), metadata flags (audit_core_*), runbook payloads. Used for impl tracking and bootstrap ceremony records.
- **Separate from UE**: Current bootstrap (direct LLDAP/Keycloak for platform-root/break-glass/test users in 0015-0019) does not yet route through user-engine AuditRecord/OutboxEvent or claims_enrichment. UE emits its own audit for domain facts when used.
- **Contract requirement**: Must link request/actor/decision/user_engine_audit/outbox_event (audit correlation bundle in boundary contract). Current is pragmatic/separate for bootstrap. Documented here for rebuild guidance. Proper integration (adapters + sinks) is follow-up (see assessment recs + T03/T09).

## UE Integration Points and Known Gaps

(from docs/user-engine-netkingdom-integration-assessment.md)

**Fit (no intent conflicts):**

- net-kingdom: IAM orchestration + authn/coarse claims (IAM Profile) + bootstrap + secrets (OpenBao) + contract governance + meta-orchestration of user-domain facts.
- user-engine: headless user-domain backend (User/Account/ExternalIdentity/Membership with owning_system/source/freshness/delete_semantics, Application+Binding+Catalog, ProfileValue layered, EffectiveProfile+projections (incl. claims_enrichment, audit), ports (IdentityClaimsAdapter, AuthorizationCheckPort, SecretProvider, EventOutbox, AuditWriter), OutboxEvent/AuditRecord). In-mem MVP finished (USER-WPs); local standalone only currently.
- Integration via claims/contracts/adapters (no shared code). UE consumes verified Actor from claims; delegates authz to PDP (flex-auth); exports projections. NK orchestrates boundaries ("owner wins" for membership sync, app onboarding 8-step bindings as separate records, etc.).

**Current points in runtime:**

- Bootstrap/T06 dry-runs + platform users use direct LLDAP/Keycloak (IAM side).
- KeyCape OIDC claims (groups + email) feed OpenBao policy and console verification (0019 helper).
- Claims enrichment not yet via UE projection + cache (direct LLDAP resolution in paths).
- Memberships (net-kingdom-* groups) treated as IAM facts; not yet synced as UE Membership with semantics.
- Audit separate (see above).

**7 Gaps (biggest first; see full assessment for details/recs):**

1. Missing Platform Integration Adapters (UE side or symmetric): IdentityClaimsAdapter (KeyCape claims -> Actor), AuthorizationCheckPort (to flex-auth), SecretProvider (OpenBao), EventOutbox, AuditWriter, etc. Only mocks today. Blocks UE as canonical for user facts.
2. Bootstrap/Platform Users vs. Governed UE Lifecycle (direct LLDAP creates for root/break-glass/"net-kingdom-*" vs. UE Membership + externally_provisioned).
3. Application Onboarding "Application" concept (KeyCape OIDC client/secret vs. UE Application + Binding records; must stay separate).
4. Membership/Group Overlap (LLDAP groups vs. UE Membership scopes + owner/source).
5. Governance/Workplan/Brief Split (UE brief stale May22/domain=netkingdom; 0018/0019 as NK orchestration correct but line must stay crisp).
6. Claims Enrichment Path drift (current direct LLDAP in OIDC/bootstrap paths; must switch to adapter-owned when UE deployed; UE never in token critical path).
7. Other: Audit/event correlation (shared IDs across UE/flex/platform; current bootstrap separate/split to audit-core); tenant platform:root special case; no prod UE deployment in NK flows yet.

**Recommendations (for 0018 context):** Use T02 to document current paths + gaps; T07/T08 as testbed for integration once adapters exist (e.g. update dry-run to exercise UE); T03/T09 to classify UE integration risk. NK keeps boundary/contracts/orchestration; UE owns domain impl. See canon/standards/user-engine-boundary-contract_v0.1.md, docs/user-engine-netkingdom-integration-assessment.md, responsibility-map.md, SCOPE.md, user-engine/INTENT.md + SCOPE.md for full details.
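
To make gap 1 more concrete, the sketch below shows one possible shape for an IdentityClaimsAdapter that turns verified KeyCape claims into an Actor. Everything in it is an assumption for illustration: the `Actor` fields, the class name `KeyCapeClaimsAdapter`, and the `to_actor` method are hypothetical and are not taken from user-engine or the boundary contract. It uses the claim names listed for KeyCape tokens plus the standard OIDC `sub` claim, and it assumes `groups` and `roles` arrive as lists of strings.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Actor:
    """Hypothetical stand-in for the UE Actor; the real fields are owned by user-engine."""

    subject: str
    tenant: str
    principal_type: str
    groups: tuple[str, ...]
    roles: tuple[str, ...]
    assurance: str


class KeyCapeClaimsAdapter:
    """Illustrative adapter: already-verified KeyCape claims -> Actor.

    Signature, issuer, and audience checks are assumed to have happened
    before the claims reach this class.
    """

    def to_actor(self, claims: Mapping[str, Any]) -> Actor:
        try:
            return Actor(
                subject=str(claims["sub"]),  # standard OIDC subject claim
                tenant=str(claims["tenant"]),
                principal_type=str(claims["principal_type"]),
                groups=tuple(claims["groups"]),  # assumed: list of strings
                roles=tuple(claims["roles"]),  # assumed: list of strings
                assurance=str(claims["assurance"]),
            )
        except KeyError as exc:
            # Fail closed: an incomplete token never yields a partial Actor.
            raise ValueError(f"missing required claim: {exc.args[0]}") from exc
```

Keeping the adapter free of directory lookups matches the stated constraint that UE is never in the token critical path: it only reshapes claims that KeyCape already issued.
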
## Dry-Run / User Lifecycle Tooling (0019 Polish Additions)

See NET-WP-0019 and sso-mfa/k8s/lldap/dry-run-nonroot-user.sh:

- Safe repeatable non-root onboarding dry-run (create/verify/lock/offboard) with /tmp hygiene, k8s secret fallback, evidence (lldap_identity_verified, keycape_oidc_claims_verified, effective_access_summary, no_secret_material_recorded, actor_class, groups, lock_offboard_result, 12+ bools), validate via make/console.
- Console: onboarding-dry-run*, claims verification (T05 helper), lifecycle-cleanup-dryrun-users.
- Make targets + runbook entry for web-ui.
- Integrated into lifecycle-guide.
- Proves IAM-lifecycle contract; foundation for future UE-backed version (per assessment).

## Operational Assumptions and Rebuild Notes

- Requires live Railiance/k3s cluster with required addons (ingress, cert-manager, etc.).
- Operator kubectl to target ns + access to password safe for bootstrap material + age keys.
- All privileged flows show effective preview; MFA for privileged; no platform-root grants to non-king.
- Evidence always non-secret; secrets in safe/k8s only.
- For scratch rebuild: follow T05 guide (once complete) + evidence per step + T09 risk assessment. Use 0019 dry-run tooling as model for safe user lifecycle tests. Rehearse in isolated/namespace/scripted first (non-goal: destructive live).
- Audit currently pragmatic for bootstrap (documented above); production correlation is follow-up.

**References:**

- platform-identity-security-architecture.md, responsibility-map.md, SCOPE.md
- docs/user-engine-netkingdom-integration-assessment.md + canon/standards/user-engine-boundary-contract_v0.1.md
- security-bootstrap-*.md family (operator-journey, openbao-ceremony-ux, user-lifecycle, handover-cleanup, king-credential-kit, age-custody, etc.)
- tools/security-bootstrap-console/security_bootstrap_console.py (and Makefile targets)
- sso-mfa/k8s/lldap/{create-user.sh,dry-run-nonroot-user.sh}, keycape/verify-*, privacyidea/check-*
- local-identity/ (audit.py, etc.)
- .local/security-bootstrap.json (current gates)
- NET-WP-0017, 0019 workplans + their evidence
- DECISIONS.md, ADRs (e.g. 0007, 0010), canon/standards/iam-profile_v0.2.md

This document will be updated as T03 retrospective, T05 guide, T06/T08 work, and T09 risk assessment proceed. It is the single source for "what the running system actually is" for rebuild guidance.