net-kingdom/workplans/NET-WP-0015-platform-root-custody-and-openbao-identity-bootstrap.md

id: NET-WP-0015
type: workplan
title: King Credential And OpenBao Identity Bootstrap
domain: netkingdom
repo: net-kingdom
status: finished
owner: codex
topic_slug: netkingdom
created: 2026-05-24
updated: 2026-06-01
depends_on: NK-WP-0006, NK-WP-0012
state_hub_workstream_id: 6b9c25e4-1008-429a-8de6-54361872c0dd

NET-WP-0015 - King Credential And OpenBao Identity Bootstrap

Goal

Define and execute the first safe bridge between low-trust setup operations, a dedicated king credential, NetKingdom identity, and Railiance OpenBao bootstrap.

The revised decision is that tegwick / bernd.worsch@gmail.com is the initial accountable setup operator and notification contact, not the long-term platform root of trust. The actual platform-root authority should move to a separate king credential before OpenBao becomes live secret custody.

Context

Railiance owns OpenBao deployment and operations. NetKingdom owns the identity, custody, and security semantics that say who can administer the platform and how that authority transitions from bootstrap material into normal IAM claims.

The platform is still in MVP/prototype bootstrap. That means early databases, admin accounts, tokens, and access paths must be treated as potentially contaminated by convenience. The platform should be assembled in low-trust mode, then handed over to the king credential, reset/rotated, checked, and reopened under explicit custody.

Scope

In scope:

  • record the setup operator/contact identity;
  • define the separate king credential target;
  • define the temporary single-operator king custody exception;
  • specify target NetKingdom IAM claims for the first admin identity;
  • coordinate the OpenBao initialization prerequisites with Railiance;
  • define the transition from OpenBao root token to scoped admin access; and
  • add follow-up gates for independent escrow, OIDC/JWT admin auth, reset/rotation, scan checks, and restore verification.

Out of scope:

  • storing any secret material in this repo;
  • running bao operator init from an unattended agent session;
  • deploying key-cape, Keycloak, privacyIDEA, or OpenBao itself; and
  • granting tenant administrators platform-root authority.

Tasks

T01 - Record Setup Operator And King Credential Model

id: NET-WP-0015-T01
status: done
priority: high
state_hub_task_id: "60659e25-fed1-478e-b8a3-4bc7b2f3846b"

Record tegwick / bernd.worsch@gmail.com / Gitea tegwick as the initial setup operator and contact. Define the separate king credential as the actual platform-root target.

2026-05-24: Added docs/platform-root-custody.md and updated docs/platform-identity-security-architecture.md plus SCOPE.md.

2026-05-24: Revised the custody model: tegwick is no longer modeled as the platform root of trust. The day-to-day account can assemble and observe the platform, while a dedicated king credential receives final custody after the guided bootstrap path is ready.

T02 - Define King Credential Kit

id: NET-WP-0015-T02
status: done
priority: high
state_hub_task_id: "1a1c45a2-be66-4667-89f8-581f4fe9970b"

Define the first king credential kit: dedicated identity name, local/offline password-safe storage, second factor, recovery-code handling, no email secret transfer, no day-to-day browsing/Git use, and operator instructions clear enough for a non-expert.

2026-05-24: Defined the v1 kit in docs/security-bootstrap-king-credential-kit.md: label platform-root, setup operator/contact tegwick, notification-only email bernd.worsch@gmail.com, local password safe plus offline custody packet, TOTP/WebAuthn/hardware-token second factor, no day-to-day use, and no email or Git secret transfer. Added examples/security-bootstrap/king-credential-metadata.example.json plus console validation for non-secret kit metadata. Custody-mode approval remains blocked under T03.
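
The non-secret metadata check mentioned above could look roughly like the following sketch. The field names and the list of forbidden key fragments are assumptions for illustration; the actual schema is whatever examples/security-bootstrap/king-credential-metadata.example.json defines, and the real console validation may work differently.

```python
# Key-name fragments that suggest secret material (illustrative, not exhaustive).
SECRET_HINTS = ("password", "secret", "seed", "recovery_code", "private_key")


def find_secret_like_keys(metadata, prefix=""):
    """Return dotted paths of keys whose names suggest secret material."""
    hits = []
    for key, value in metadata.items():
        path = f"{prefix}.{key}" if prefix else key
        if any(hint in key.lower() for hint in SECRET_HINTS):
            hits.append(path)
        if isinstance(value, dict):
            hits.extend(find_secret_like_keys(value, path))
    return hits


# Hypothetical metadata shape; the values mirror the v1 kit described above.
kit = {
    "label": "platform-root",
    "setup_operator": "tegwick",
    "notification_email": "bernd.worsch@gmail.com",
    "second_factor": {"kind": "totp"},
}
print(find_secret_like_keys(kit))
```

A name-based check like this only catches obvious mistakes; it does not inspect values, so it complements rather than replaces the rule that no secret material enters the repo.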

T03 - Approve King Custody Mode

id: NET-WP-0015-T03
status: done
priority: high
state_hub_task_id: "56a6266a-4acd-41e6-a395-85e90a5c35c6"

Choose either the preferred independent two-of-three king custody model or an explicit temporary single-operator king credential exception for pre-production bootstrap. Do not run OpenBao initialization until this choice is recorded.

2026-05-24: Added local approval surfaces for this human gate: approve-custody-mode for the CLI and web-ui for the localhost console. Both write non-secret metadata only and keep live OpenBao initialization as a separate attended ceremony. Current recommended approval mode is temporary-single-king; two-of-three-planned records the target state but does not unblock live init.

2026-05-24: Tightened MFA handling after review: a TOTP QR code or setup key must come from the authority that will verify login, not from the local metadata console. Custody approval now requires explicit non-secret confirmation that the factor was enrolled with its real verifier.

2026-05-24: Clarified credential placement in the UI and custody docs: the dedicated king account currently belongs in the lightweight NetKingdom identity path (LLDAP user, Authelia login, privacyIDEA MFA, KeyCape OIDC). OpenBao is the secrets/audit/admin-policy custody service after the ceremony, not the place where the human password or OTP seed lives.

2026-05-24: Expanded the local UI toward a NetKingdom control surface: the bootstrap flow now has action buttons for LLDAP, privacyIDEA, and KeyCape, plus non-secret progress saving for account creation, MFA enrollment, OIDC verification, and custody approval.

2026-05-24: Clarified the LLDAP first-user path in the UI and docs: LLDAP has no registration flow; the operator logs in as bootstrap admin using LLDAP_LDAP_USER_PASS from net-kingdom/LLDAP/admin, then creates the dedicated platform-root or king account and assigns the current lightweight admin group.

2026-05-24: Added explicit non-secret UI confirmations for the account having been created, assigned to net-kingdom-admins, stored in the password safe/offline packet, and later verified through the login path. Automated LLDAP detection is deferred because it would require authenticated access to LLDAP and should be built as an audited integration.

2026-05-24: Improved the KeyCape login-check path: the local bootstrap UI now acts as the demo-app OIDC callback, exposes /oidc/start and /oidc/callback, and adds hover-help text to the external action buttons. The live KeyCape rollout still needs the updated keycape-config Secret applied from decrypted sso-mfa/bootstrap/secrets/ inputs. If the browser flow reaches Authelia but never presents an OTP challenge, KeyCape needs a browser MFA prompt surface before this gate can be marked verified.

2026-05-24: Filed KEY-WP-0003 in the KeyCape repo for the current OIDC verification blocker. The immediate error, "redirect_uri does not match any registered URI", means the local bootstrap callback is not yet registered in live KeyCape. The follow-up KeyCape work also covers the browser OTP challenge needed after Authelia password login.

2026-05-24: Implemented KEY-WP-0003 in source. KeyCape now supports a dedicated netkingdom-bootstrap-console client, split browser/server Authelia URLs, and a browser OTP challenge before issuing the final OIDC code. The local control surface now uses that dedicated client. Live verification remains pending until the updated KeyCape image and regenerated keycape-config Secret are rolled out.

2026-05-24: Rolled the fix to the public Railiance SSO host (kc.coulomb.social, currently resolving to railiance01). The live keycape-config Secret was patched without printing or rotating secret values, the main-1d68639 KeyCape image was direct-imported into k3s, and the deployment's image pull policy was set to IfNotPresent. Public /authorize now accepts netkingdom-bootstrap-console and redirects to https://auth.coulomb.social/.... Follow-up: clean up the Gitea HTTP registry push/pull path so direct image import is no longer needed.

2026-05-24: Fixed the next live login failure before OTP: Authelia rejected KeyCape's token exchange because the upstream keycape client only permits client_secret_basic, while KeyCape was sending client_secret_post. KeyCape commit 56d279a now uses HTTP Basic auth for the upstream token exchange, the image main-56d279a was direct-imported into Railiance k3s, and the live deployment runs that tag.
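
The difference between the two client authentication methods can be sketched as follows. This is not KeyCape's code; it is a minimal Python illustration of client_secret_basic as described in RFC 6749 section 2.3.1, with placeholder client credentials and redirect URI.

```python
import base64
from urllib.parse import quote_plus, urlencode


def basic_auth_header(client_id, client_secret):
    """Build the Authorization header used by client_secret_basic."""
    # Each part is form-urlencoded, joined with ":", then base64-encoded.
    raw = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    return "Basic " + base64.b64encode(raw.encode("utf-8")).decode("ascii")


def token_request(code, redirect_uri, client_id, client_secret):
    """Return (headers, body) for an authorization-code token exchange."""
    headers = {
        "Authorization": basic_auth_header(client_id, client_secret),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # With client_secret_basic the secret stays out of the form body;
    # client_secret_post would add client_id and client_secret here instead.
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    })
    return headers, body


# Placeholder values for illustration only.
headers, body = token_request("abc", "http://localhost/cb", "keycape", "s3cret")
```

An upstream client that only permits client_secret_basic rejects a request whose credentials arrive in the form body, which matches the failure described above.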

2026-05-24: Fixed the follow-up "mfa check" error. Live privacyIDEA validation succeeds in the coulomb realm, while KeyCape had been configured for netkingdom and was also trying to pre-list tokens with an expired or invalid privacyIDEA admin JWT. KeyCape commit 937cb39 adds the bootstrap mode privacyidea.requireForAll, which requires OTP for every authenticated user without depending on token-list admin credentials. The live keycape-config now uses realm: coulomb and requireForAll: true, and Railiance runs image main-937cb39.

2026-05-25: Fixed the subsequent token-exchange "user not found" error. Live LLDAP stores users under ou=people, while KeyCape's default lookup base was ou=users. KeyCape commit 06d20c3 makes the LLDAP OU settings explicit in YAML, live keycape-config now sets userOU: ou=people and groupOU: ou=groups, and Railiance runs image main-06d20c3.

2026-05-25: End-to-end OIDC login verification succeeded for platform-root. The local bootstrap-console callback exchanged the code and showed issuer https://kc.coulomb.social, audience netkingdom-bootstrap-console, subject uid=platform-root,ou=people,dc=netkingdom,dc=local, email bernd.worsch@gmail.com, and group net-kingdom-admins. Local non-secret bootstrap progress now records both MFA enrollment confirmation and OIDC login verification.
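
A check of the verified claims could be sketched like this. The expected values come from the entry above; the claim name groups is an assumption about how KeyCape encodes group membership, and the sketch checks only decoded claims, not the token signature.

```python
EXPECTED = {
    "iss": "https://kc.coulomb.social",
    "aud": "netkingdom-bootstrap-console",
    "sub": "uid=platform-root,ou=people,dc=netkingdom,dc=local",
}
REQUIRED_GROUP = "net-kingdom-admins"


def claim_problems(claims):
    """Return a list of mismatches between decoded claims and expectations."""
    problems = []
    for name, expected in EXPECTED.items():
        actual = claims.get(name)
        # OIDC allows "aud" to be either a single string or a list of strings.
        if name == "aud" and isinstance(actual, list):
            matches = expected in actual
        else:
            matches = actual == expected
        if not matches:
            problems.append(f"{name}: expected {expected!r}, got {actual!r}")
    # "groups" is an assumed claim name for group membership.
    if REQUIRED_GROUP not in claims.get("groups", []):
        problems.append(f"groups: missing {REQUIRED_GROUP!r}")
    return problems


verified = dict(EXPECTED, email="bernd.worsch@gmail.com", groups=["net-kingdom-admins"])
print(claim_problems(verified))
```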

2026-05-25: Reworked the bootstrap-console flow after operator review. The UI now follows the use case top to bottom, hides hardware-token storage unless the selected policy uses hardware tokens, specifies the exact recovery material contents, distinguishes recovery material from the OpenBao custody packet, and turns "no secret capture" into an automatic control-surface boundary gate rather than a user checkbox.

2026-05-25: Corrected the custody/OpenBao ordering in the console: strategy selection now comes before recovery/packet preparation, the custody packet is prepared for the selected strategy before approval, and the OpenBao panel now explains when to run Railiance preflight, init/unseal, post-unseal configuration, root-token disposition, and restore proof. The console still refuses to capture root tokens or unseal shares.

2026-05-25: Restructured the bootstrap UI around the operator mental model: Roles & Responsibilities, Subsystems & Scope, Integration & Tests, and Artefacts & Locations. Role, subsystem, integration, and artefact rows now use the same name, description, subsystem, responsibility, location, and state fields, and console commands are shown as copyable command blocks.

2026-05-25: Refined the new model after operator review: role chips now sit under subsystem labels to keep artefact rows narrow, responsibility editing is inside a dirty-state Save/Cancel foldout, future quorum contact uses the same effective-value prefill as the role display, and command cards now derive blocked, todo, redo, or done status from bootstrap metadata.
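
One way to derive the four card states from metadata is sketched below. The metadata keys (completed, invalidated), the card shape, the card ids, and the precedence of the rules are all assumptions; in particular, the exact meaning of redo in the console is not specified here, so it is modeled as "recorded earlier but since invalidated".

```python
def card_status(card, metadata):
    """Derive a display status for one command card from non-secret metadata."""
    done = set(metadata.get("completed", []))
    stale = set(metadata.get("invalidated", []))
    # A card is blocked while any prerequisite has not been recorded as done.
    if any(req not in done for req in card.get("requires", [])):
        return "blocked"
    # Assumed meaning of "redo": the step was recorded but later invalidated.
    if card["id"] in stale:
        return "redo"
    if card["id"] in done:
        return "done"
    return "todo"


# Hypothetical card ids for illustration.
init_card = {"id": "openbao-init", "requires": ["custody-approved"]}
config_card = {"id": "openbao-initial-config", "requires": ["openbao-init"]}
progress = {"completed": ["custody-approved", "openbao-init"], "invalidated": ["openbao-init"]}
print(card_status(init_card, progress), card_status(config_card, progress))
```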

2026-05-25: Added a Usecases & Runbooks section for trial-output exposure and key-material compromise. The UI now records non-secret compromise response state, separates "init output produced" from "initialized and unsealed", and adds guided command cards for unseal and OpenBao rotate-keys replacement share generation.

2026-05-25: Changed compromised/trial-exposed OpenBao material from a hard block into an explicit taint model. Affected artefacts and downstream command cards are shown with a light red background and retain the source reference, but the operator can still proceed deliberately on a tainted workpath.

2026-05-25: Split OpenBao initial configuration from root-token disposition in the bootstrap console. The initial config command can now be recorded as applied while root-token revocation/escrow remains a separate gate.

2026-05-25: Added an Emergency lock-down runbook for sealing Railiance OpenBao without placing tokens on the command line. Reordered the console into Introduction & Actors, Subsystems & Scopes, Roles & Responsibilities, Integration & Tests, Artefacts & Locations, Usecases & Runbooks, and Terminology & Patterns.

2026-05-25: Added Restore drill runbook action cards so the existing confirmation checkbox has a concrete path: prepare a restricted workspace, create/copy/hash an OpenBao Raft snapshot, encrypt it to the custodian age recipient, complete an isolated restore proof, rerun post-unseal verification, and record only non-secret completion evidence.
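
The hashing step of the restore drill can be sketched as follows. The evidence shape is an assumption, and the temporary file stands in for a real Raft snapshot; the point is that only the file name and digest are recorded, never snapshot contents.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large snapshots are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_evidence(path):
    """Return non-secret completion evidence: file name and digest only."""
    return {"snapshot": os.path.basename(path), "sha256": sha256_file(path)}


# Stand-in for a real snapshot file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example snapshot bytes")
evidence = snapshot_evidence(tmp.name)
os.unlink(tmp.name)
print(evidence["sha256"])
```

Comparing the digest before encryption and after decryption in the isolated restore environment gives a simple integrity check for the copied snapshot.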

2026-05-25: Refined the action/runbook model in the control surface: Integration & Tests now carries stateful runbook tasks and gates, while Usecases & Runbooks contains status-less action cards and neutral runbook templates. Added copyable OpenBao inspection actions for bao audit list, bao secrets list, and bao auth list with local hidden token prompts, removed duplicate OpenBao status/unseal cards from the stateful Integration command list, and restored Artefacts & Locations above Usecases & Runbooks in the workflow.

2026-05-25: Added a five-stage visual stage rail and Final Handover section to close the gap between OpenBao bootstrap and the final operating state. The stage model now moves from S3 to S4 after OpenBao initial configuration, root-token disposition, and restore drill are complete, then to S5 only when the platform is explicitly reopened under custody.
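
The S3 to S5 transitions described in this entry can be sketched as a small function. The metadata key names are hypothetical, and the sketch covers only the stage model as of this entry.

```python
# Gates that must all be complete before leaving S3 (hypothetical key names).
S4_GATES = ("initial_config_applied", "root_token_disposed", "restore_drill_done")


def stage_from_s3(meta):
    """Return the stage reached from S3 given non-secret bootstrap metadata."""
    if not all(meta.get(gate, False) for gate in S4_GATES):
        return "S3"
    # S5 requires an explicit reopening record, not just completed gates.
    if meta.get("reopened_under_custody", False):
        return "S5"
    return "S4"


gates_done = {gate: True for gate in S4_GATES}
print(stage_from_s3(gates_done))
```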

2026-05-25: Corrected the OpenBao rotate-keys action cards after the operator hit "permission denied" on rotation init. The rotation commands now open an interactive pod TTY, prompt there for a root/sudo-capable OpenBao token, keep the token out of the local command line, and then run rotate init, share submission, or cancel.

2026-05-26: Added an explicit rotation-status action and clarified the rotation flow after the operator successfully started rotate-keys and then hit "rotation already in progress" by rerunning init. The UI now says init is a run-once step and that the next step is checking status or submitting existing shares with the nonce until quorum completes.

2026-05-26: Added a Usecases action card for creating the temporary Railiance OpenBao platform-admin token with bao token create -policy=platform-admin -period=24h -orphan. The command prompts for the bootstrap/root token without placing it on the command line and reminds the operator to store the emitted token through the approved secret path.

2026-05-26: Promoted the KeyCape-to-OpenBao admin path into its own stage before cleanup and hardening. The control surface now has S4 Admin Identity Integration with gates for the dedicated KeyCape OpenBao client, OpenBao OIDC/JWT auth configuration, and MFA-backed OpenBao admin login verification; cleanup and reopening move to S5/S6.

2026-05-26: Refined the OpenBao trial-exposure taint model so direct unseal-share taint clears after confirmed unseal-key rotation, and direct initial-root-token taint clears after the exposed OpenBao root token is revoked. Downstream work remains visibly tainted until derived access paths are reviewed and the compromise response is explicitly recorded complete.
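
The clearing rules of the taint model can be sketched as follows, assuming a trial exposure has already been recorded. The field and key names are hypothetical; only the three rules come from the entry above.

```python
from dataclasses import dataclass


@dataclass
class CompromiseResponse:
    """Non-secret response state after a recorded trial exposure."""
    unseal_keys_rotated: bool = False
    exposed_root_token_revoked: bool = False
    derived_paths_reviewed: bool = False
    response_recorded_complete: bool = False


def taint(state):
    """Return which artefact groups are still shown as tainted."""
    return {
        # Direct taint clears as soon as the exposed material is replaced.
        "unseal_shares": not state.unseal_keys_rotated,
        "initial_root_token": not state.exposed_root_token_revoked,
        # Downstream taint needs both the review and the explicit closeout.
        "downstream": not (
            state.derived_paths_reviewed and state.response_recorded_complete
        ),
    }


after_rotation = CompromiseResponse(unseal_keys_rotated=True, exposed_root_token_revoked=True)
print(taint(after_rotation))
```

Keeping downstream taint separate from direct taint means that rotating keys and revoking the token is not silently treated as a complete compromise response.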

2026-05-26: Split Admin Identity Integration into development-owned configuration and operator-owned integration work. The openbao-admin KeyCape client is now code-defined in sso-mfa/k8s/keycape/create-secrets.sh, while the UI action cards only ask the operator to apply live KeyCape config, configure OpenBao with a protected token prompt, and verify MFA-backed login.

2026-05-26: Hardened the KeyCape OpenBao client deployment action after the operator hit a non-executable create-secrets.sh. The action card now runs the script through bash, uses absolute repo paths, and wraps the sequence in a fail-fast heredoc so a failed config generation does not continue into a KeyCape restart or verification.

2026-05-26: Removed the KeyCape OpenBao client action's dependency on decrypted bootstrap secrets after the operator hit the absent sso-mfa/bootstrap/secrets/ directory, which is the correct state when the bundle has not been decrypted. Added a focused live Secret patcher and verifier for the openbao-admin client so this non-secret client addition can be applied without decrypting the full bootstrap secret bundle.

2026-05-26: Fixed the focused KeyCape OpenBao verifier after the live KeyCape image lacked wget. The verifier now checks the live Secret and then uses a short local kubectl port-forward plus Python HTTP request for OIDC discovery, avoiding assumptions about tools installed inside the KeyCape container.
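
The Python side of such a discovery check might look like the sketch below. It is not the actual verifier: the local port is a placeholder, and it assumes KeyCape serves the standard OIDC discovery path /.well-known/openid-configuration behind the port-forward.

```python
import json
from urllib.request import urlopen

# Fields a usable authorization-code provider is expected to publish.
REQUIRED_FIELDS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")


def fetch_discovery(base_url, timeout=5):
    """Fetch the discovery document through the local port-forward."""
    url = base_url.rstrip("/") + "/.well-known/openid-configuration"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def discovery_problems(doc, expected_issuer):
    """Return a list of problems found in a discovery document."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not doc.get(name)]
    if doc.get("issuer") != expected_issuer:
        problems.append(f"issuer is {doc.get('issuer')!r}, expected {expected_issuer!r}")
    return problems


# Usage with a running port-forward (port 8080 is a placeholder):
# doc = fetch_discovery("http://127.0.0.1:8080")
# print(discovery_problems(doc, "https://kc.coulomb.social"))
```

Running the HTTP request from the operator's machine keeps the check independent of whatever tools the container image happens to include.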

2026-05-26: Fixed the OpenBao OIDC auth setup after OpenBao rejected an empty oidc_client_secret even though the current KeyCape openbao-admin client is public PKCE. The UI now points to a short helper script instead of a long nested shell/JSON command, and the helper writes an explicit non-secret compatibility value until KeyCape supports confidential downstream clients.

2026-05-24: Stepped back from ad hoc secret rollout and added the custodian age-key bootstrap model to the control surface. The UI now records the custodian public age recipient, a derived fingerprint, and a non-secret private-key custody reference while refusing to treat the private key as normal metadata. It also detects encrypted bootstrap bundle presence and plaintext sso-mfa/bootstrap/secrets/ exposure. This is the intended foundation for trial-mode, custody-mode, unlock/apply, and later OpenBao handover flows.
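
The recipient handling can be sketched as follows. How the control surface actually derives its fingerprint is not stated here, so the truncated SHA-256 below is an assumption; the prefix checks rely on the age convention that public recipients start with age1 and private keys with AGE-SECRET-KEY-.

```python
import hashlib


def recipient_fingerprint(value):
    """Return a short fingerprint for an age public recipient string."""
    text = value.strip()
    # Private keys must never be stored as ordinary metadata.
    if text.upper().startswith("AGE-SECRET-KEY-"):
        raise ValueError("refusing to record an age private key as metadata")
    if not text.startswith("age1"):
        raise ValueError("value does not look like an age recipient")
    # Assumed derivation: first 16 hex characters of SHA-256 over the string.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


# Placeholder recipient string, not a real key.
placeholder_recipient = "age1" + "q" * 58
print(recipient_fingerprint(placeholder_recipient))
```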

2026-05-26: Closed this custody-approval task after review against the live bootstrap metadata: platform-root is recorded as the king credential, MFA and KeyCape OIDC login are verified, and temporary-single-king custody is explicitly approved for the pre-production OpenBao bootstrap. Remaining hardening and user-onboarding readiness work is tracked in NET-WP-0017.

T04 - Complete Railiance OpenBao Bootstrap Ceremony

id: NET-WP-0015-T04
status: done
priority: high
state_hub_task_id: "2102366e-064b-4071-8b6a-574d9d37d109"

Coordinate with RAIL-PL-WP-0002-T03 to initialize and unseal OpenBao under the king credential model, enable audit and the first mounts/policies, create a non-root platform-admin access path, and revoke or offline-escrow the initial root token.

2026-05-26: Closed the bootstrap ceremony portion after live verification: Railiance OpenBao is initialized, unsealed, and post-unseal verified; initial configuration was applied; the initial OpenBao root token is recorded as revoked; trial unseal shares were rotated; and restore-drill confirmation is recorded in the bootstrap metadata. Declarative audit/durable audit shipping and routine OIDC admin access remain follow-up readiness gates under NET-WP-0017 and RAIL-PL-WP-0002.

T05 - Provision First NetKingdom Admin Identity

id: NET-WP-0015-T05
status: done
priority: high
state_hub_task_id: "d2a81d7b-9964-4bd5-9b8c-ef1324e02cd4"

Provision the first king/admin identity in the selected NetKingdom IAM implementation. The target claims are tenant=platform, principal_type=human or break_glass, MFA-backed assurance, and groups/roles for platform-root, platform-admin, netkingdom-admin, and railiance-platform-admin. tegwick may receive delegated day-to-day admin roles later, but must be revocable without losing root custody.

2026-05-26: Closed for the bootstrap identity scope: the dedicated platform-root user is recorded as created, assigned to net-kingdom-admins, stored outside this repo, enrolled for MFA, and verified through KeyCape OIDC. Richer IAM-profile claims for ordinary user onboarding remain part of the user-onboarding readiness work in NET-WP-0017.

T06 - Bind OpenBao Admin Auth To NetKingdom IAM

id: NET-WP-0015-T06
status: done
priority: medium
state_hub_task_id: "ef97f3cb-9792-4b9d-bd2b-8871d368a50f"

Replace temporary operator tokens with NetKingdom IAM-backed OpenBao admin auth when the issuer and claim mapping are ready. The OpenBao root token must not be the normal admin path.

2026-05-26: The KeyCape openbao-admin client was code-defined, patched into the live keycape-config Secret, rolled out, and verified without requiring decrypted bootstrap secrets. At that point, OpenBao auth/keycape still needed the fixed helper command and the MFA-backed bao login -method=oidc -path=keycape role=platform-admin path still needed verification.

2026-06-01: Added a guided bootstrap runbook action for the live privacyIDEA state-loss case encountered during OpenBao OIDC login testing. The new action recreates the coulomb realm, lldap-coulomb resolver, self-enrollment policy, and phase-one passthrough policy by prompting for pi-admin and LLDAP bind/admin passwords, writing them only to temporary files through repair-realm-live.sh, and running bootstrap-realm.sh plus verify-t06.sh. TOTP enrollment/re-enrollment and the final MFA-backed OpenBao login verification remain operator steps.

2026-06-01: Closed after the platform-root MFA-backed OpenBao OIDC login completed through KeyCape and the resulting token lookup showed platform-admin in both token policy fields. The remaining OpenBao hardening, audit, escrow, reset/rotation, and reopening gates continue under T07/T08 and NET-WP-0017.

2026-06-01: Added OpenBao token revocation to the guided Usecases & Runbooks section. The UI now includes a self-revoke card for the current pod token-helper token and an accessor-based revocation card for disclosed tokens, both keeping OpenBao token values off the local command line.

T07 - Verify Recovery, Audit, And Rotation

id: NET-WP-0015-T07
status: done
priority: medium
state_hub_task_id: "aa40cbb4-36d3-405d-b59d-0c21ae8c9539"

Confirm snapshot/restore drill, durable audit-log handling, root-token disposition, unseal/recovery rotation expectations, and the follow-up owner for adding at least one additional human escrow holder.

2026-05-26: Root-token disposition, unseal-key rotation, post-unseal verification, and restore-drill confirmation are recorded. This task remains open for declarative audit configuration/durable audit shipping, residual taint-response closeout, and the next independent escrow holder.

2026-06-01: Closed for the bootstrap handoff scope. The bootstrap plan has confirmed the available recovery/audit/rotation evidence and, more importantly, now has explicit production-readiness follow-up gates: NET-WP-0017-T02 owns declarative/durable audit, restore evidence, emergency seal/unseal drill evidence, and the next independent escrow holder; NET-WP-0017-T03 owns residual taint closeout. These items are no longer tracked as unfinished bootstrap ceremony work.

T08 - Reset, Rotate, And Reopen Under King Oversight

id: NET-WP-0015-T08
status: done
priority: high
state_hub_task_id: "e6a60dca-547b-4493-a36c-f6b668d1bf52"

After the king credential accepts custody, reset or rotate bootstrap-era database credentials, admin passwords, service tokens, OpenBao tokens, and temporary access paths. Run host/workload checks and reopen the platform only after the new custody state is verified.

2026-06-01: Closed as a bootstrap-plan handoff rather than as a claim that all production cleanup is complete. NET-WP-0017-T03 owns retirement of bootstrap admin paths and residual taint response, NET-WP-0017-T04 owns bootstrap-era credential rotation/reset plus host/workload checks, and NET-WP-0017-T07 owns final review and retirement/archive of superseded bootstrap workplans. NET-WP-0018 will turn those gates into a smoother bootstrap guide, control-surface automation, validations, and rebuild-risk assessment.

Closeout

2026-06-01: NET-WP-0015 is finished. The first safe bridge is in place: the dedicated platform-root identity exists outside day-to-day operator use, custody mode is recorded, OpenBao was initialized and configured under the bootstrap ceremony, the initial root token is not the normal admin path, and routine OpenBao administration now works through NetKingdom/KeyCape OIDC with MFA and the platform-admin policy. Remaining production-readiness work is explicitly tracked in NET-WP-0017; rebuild automation and validation improvements are tracked in NET-WP-0018.

Acceptance Criteria

  • The setup operator and king credential model are recorded without secret values.
  • The custody mode is explicit before OpenBao initialization.
  • OpenBao root-token use is limited to bootstrap or break-glass handling.
  • Routine admin access has a non-root path and a target NetKingdom IAM path.
  • Production readiness has a clear gate for independent escrow, audit, restore, reset/rotation, and reopening under king oversight.