net-kingdom/workplans/NET-WP-0015-platform-root-custody-and-openbao-identity-bootstrap.md

id: NET-WP-0015
type: workplan
title: King Credential And OpenBao Identity Bootstrap
domain: netkingdom
repo: net-kingdom
status: active
owner: codex
topic_slug: netkingdom
created: 2026-05-24
updated: 2026-05-24
depends_on: NK-WP-0006, NK-WP-0012
state_hub_workstream_id: 6b9c25e4-1008-429a-8de6-54361872c0dd

NET-WP-0015 - King Credential And OpenBao Identity Bootstrap

Goal

Define and execute the first safe bridge between low-trust setup operations, a dedicated king credential, NetKingdom identity, and Railiance OpenBao bootstrap.

The revised decision is that tegwick / bernd.worsch@gmail.com is the initial accountable setup operator and notification contact, not the long-term platform root of trust. The actual platform-root authority should move to a separate king credential before OpenBao becomes live secret custody.

Context

Railiance owns OpenBao deployment and operations. NetKingdom owns the identity, custody, and security semantics that say who can administer the platform and how that authority transitions from bootstrap material into normal IAM claims.

The platform is still in MVP/prototype bootstrap. That means early databases, admin accounts, tokens, and access paths must be treated as potentially contaminated by convenience. The platform should be assembled in low-trust mode, then handed over to the king credential, reset/rotated, checked, and reopened under explicit custody.

Scope

In scope:

  • record the setup operator/contact identity;
  • define the separate king credential target;
  • define the temporary single-operator king custody exception;
  • specify target NetKingdom IAM claims for the first admin identity;
  • coordinate the OpenBao initialization prerequisites with Railiance;
  • define the transition from OpenBao root token to scoped admin access; and
  • add follow-up gates for independent escrow, OIDC/JWT admin auth, reset/rotation, scan checks, and restore verification.

Out of scope:

  • storing any secret material in this repo;
  • running bao operator init from an unattended agent session;
  • deploying key-cape, Keycloak, privacyIDEA, or OpenBao itself; and
  • granting tenant administrators platform-root authority.

Tasks

T01 - Record Setup Operator And King Credential Model

id: NET-WP-0015-T01
status: done
priority: high
state_hub_task_id: "60659e25-fed1-478e-b8a3-4bc7b2f3846b"

Record tegwick / bernd.worsch@gmail.com / Gitea tegwick as the initial setup operator and contact. Define the separate king credential as the actual platform-root target.

2026-05-24: Added docs/platform-root-custody.md and updated docs/platform-identity-security-architecture.md plus SCOPE.md.

2026-05-24: Revised the custody model: tegwick is no longer modeled as the platform root of trust. The day-to-day account can assemble and observe the platform, while a dedicated king credential receives final custody after the guided bootstrap path is ready.

T02 - Define King Credential Kit

id: NET-WP-0015-T02
status: done
priority: high
state_hub_task_id: "1a1c45a2-be66-4667-89f8-581f4fe9970b"

Define the first king credential kit: dedicated identity name, local/offline password-safe storage, second factor, recovery-code handling, no email secret transfer, no day-to-day browsing/Git use, and operator instructions clear enough for a non-expert.

2026-05-24: Defined the v1 kit in docs/security-bootstrap-king-credential-kit.md: label platform-root, setup operator/contact tegwick, notification-only email bernd.worsch@gmail.com, local password safe plus offline custody packet, TOTP/WebAuthn/hardware-token second factor, no day-to-day use, and no email or Git secret transfer. Added examples/security-bootstrap/king-credential-metadata.example.json plus console validation for non-secret kit metadata. Custody-mode approval remains blocked under T03.
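
The non-secret metadata check mentioned above could look roughly like the following sketch. The field names (`label`, `second_factor`, `day_to_day_use`) and the list of forbidden key fragments are assumptions made for illustration. They are not the schema of king-credential-metadata.example.json or the console's actual validation.

```python
# Key fragments that suggest secret material. This list is an assumption for the sketch.
FORBIDDEN_KEY_PARTS = ("password", "secret", "seed", "recovery_code", "private_key")

# Second-factor options named in the v1 kit; the string spellings are hypothetical.
ALLOWED_SECOND_FACTORS = {"totp", "webauthn", "hardware-token"}


def validate_kit_metadata(meta: dict) -> list:
    """Return a list of problems; an empty list means the metadata passed every check."""
    problems = []
    for key in meta:
        lowered = key.lower()
        if any(part in lowered for part in FORBIDDEN_KEY_PARTS):
            problems.append(f"field '{key}' looks like secret material")
    if meta.get("label") != "platform-root":
        problems.append("label must be 'platform-root'")
    if meta.get("second_factor") not in ALLOWED_SECOND_FACTORS:
        problems.append(
            "second_factor must be one of: " + ", ".join(sorted(ALLOWED_SECOND_FACTORS))
        )
    if meta.get("day_to_day_use") is not False:
        problems.append("day_to_day_use must be false for the king credential")
    return problems
```

Rejecting on key names is deliberately coarse: a false positive costs a rename, while a secret that slips into metadata may end up in Git.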

T03 - Approve King Custody Mode

id: NET-WP-0015-T03
status: blocked
priority: high
state_hub_task_id: "56a6266a-4acd-41e6-a395-85e90a5c35c6"

Choose either the preferred independent two-of-three king custody model or an explicit temporary single-operator king credential exception for pre-production bootstrap. Do not run OpenBao initialization until this choice is recorded.

2026-05-24: Added local approval surfaces for this human gate: approve-custody-mode for the CLI and web-ui for the localhost console. Both write non-secret metadata only and keep live OpenBao initialization as a separate attended ceremony. Current recommended approval mode is temporary-single-king; two-of-three-planned records the target state but does not unblock live init.
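
As a minimal sketch of the gate described in this entry, the function below treats temporary-single-king as the only recorded mode that unblocks live init. The two mode names come from this workplan. The `approval` dictionary shape and the function name are hypothetical and do not describe the real approve-custody-mode implementation.

```python
# Mode names are taken from the workplan. A fully implemented two-of-three mode
# would presumably also unblock init, but no name for it is recorded here.
KNOWN_MODES = {"temporary-single-king", "two-of-three-planned"}
UNBLOCKING_MODES = {"temporary-single-king"}


def live_init_allowed(approval: dict) -> bool:
    """Decide whether recorded, non-secret approval metadata unblocks live OpenBao init."""
    mode = approval.get("mode")
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown custody mode: {mode!r}")
    # "two-of-three-planned" records the target state only, so it never unblocks.
    return mode in UNBLOCKING_MODES and approval.get("approved") is True
```

The function only answers whether the gate is open. Running the initialization remains a separate attended ceremony.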

2026-05-24: Tightened MFA handling after review: a TOTP QR code or setup key must come from the authority that will verify login, not from the local metadata console. Custody approval now requires explicit non-secret confirmation that the factor was enrolled with its real verifier.

2026-05-24: Clarified credential placement in the UI and custody docs: the dedicated king account currently belongs in the lightweight NetKingdom identity path (LLDAP user, Authelia login, privacyIDEA MFA, KeyCape OIDC). OpenBao is the secrets/audit/admin-policy custody service after the ceremony, not the place where the human password or OTP seed lives.

2026-05-24: Expanded the local UI toward a NetKingdom control surface: the bootstrap flow now has action buttons for LLDAP, privacyIDEA, and KeyCape, plus non-secret progress saving for account creation, MFA enrollment, OIDC verification, and custody approval.

2026-05-24: Clarified the LLDAP first-user path in the UI and docs: LLDAP has no registration flow; the operator logs in as bootstrap admin using LLDAP_LDAP_USER_PASS from net-kingdom/LLDAP/admin, then creates the dedicated platform-root or king account and assigns the current lightweight admin group.

2026-05-24: Added explicit non-secret UI confirmations for the account having been created, assigned to net-kingdom-admins, stored in the password safe/offline packet, and later verified through the login path. Automated LLDAP detection is deferred because it would require authenticated access to LLDAP and should be built as an audited integration.

2026-05-24: Improved the KeyCape login-check path: the local bootstrap UI now acts as the demo-app OIDC callback, exposes /oidc/start and /oidc/callback, and adds hover-help text to the external action buttons. The live KeyCape rollout still needs the updated keycape-config Secret applied from decrypted sso-mfa/bootstrap/secrets/ inputs. If the browser flow reaches Authelia but never presents an OTP challenge, KeyCape needs a browser MFA prompt surface before this gate can be marked verified.

2026-05-24: Filed KEY-WP-0003 in the KeyCape repo for the current OIDC verification blocker. The immediate error, "redirect_uri does not match any registered URI", means the local bootstrap callback is not yet registered in live KeyCape. The follow-up KeyCape work also covers the browser OTP challenge needed after Authelia password login.

2026-05-24: Implemented KEY-WP-0003 in source. KeyCape now supports a dedicated netkingdom-bootstrap-console client, split browser/server Authelia URLs, and a browser OTP challenge before issuing the final OIDC code. The local control surface now uses that dedicated client. Live verification remains pending until the updated KeyCape image and regenerated keycape-config Secret are rolled out.

2026-05-24: Rolled the fix to the public Railiance SSO host (kc.coulomb.social, currently resolving to railiance01). The live keycape-config Secret was patched without printing or rotating secret values, the main-1d68639 KeyCape image was direct-imported into k3s, and the deployment was set to IfNotPresent. Public /authorize now accepts netkingdom-bootstrap-console and redirects to https://auth.coulomb.social/.... Follow-up: clean up the Gitea HTTP registry push/pull path so direct image import is no longer needed.

2026-05-24: Fixed the next live login failure before OTP: Authelia rejected KeyCape's token exchange because the upstream keycape client only permits client_secret_basic, while KeyCape was sending client_secret_post. KeyCape commit 56d279a now uses HTTP Basic auth for the upstream token exchange, the image main-56d279a was direct-imported into Railiance k3s, and the live deployment runs that tag.
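
The difference between the two client authentication methods can be illustrated with a short sketch. With client_secret_basic, the client credentials travel in an HTTP Basic Authorization header (RFC 6749, section 2.3.1) and are left out of the form body. With client_secret_post they would be form fields. This is not KeyCape's code, and the function names are hypothetical.

```python
import base64
from urllib.parse import quote_plus


def basic_auth_header(client_id: str, client_secret: str) -> str:
    # RFC 6749 section 2.3.1: form-urlencode both values, join with ":", then Base64-encode.
    pair = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    return "Basic " + base64.b64encode(pair.encode("utf-8")).decode("ascii")


def token_request(client_id: str, client_secret: str, code: str, redirect_uri: str):
    """Build headers and form fields for an authorization-code token exchange."""
    headers = {
        "Authorization": basic_auth_header(client_id, client_secret),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # With client_secret_basic the secret must not also appear in the form body.
    form = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }
    return headers, form
```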

2026-05-24: Fixed the follow-up mfa check error. Live privacyIDEA validation succeeds in the coulomb realm, while KeyCape had been configured for netkingdom and was also trying to pre-list tokens with an expired or invalid privacyIDEA admin JWT. KeyCape commit 937cb39 adds bootstrap mode privacyidea.requireForAll, which requires OTP for every authenticated user without depending on token-list admin credentials. The live keycape-config now uses realm: coulomb and requireForAll: true, and Railiance runs image main-937cb39.

2026-05-25: Fixed the subsequent token-exchange "user not found" error. Live LLDAP stores users under ou=people, while KeyCape's default lookup base was ou=users. KeyCape commit 06d20c3 makes the LLDAP OU settings explicit in YAML, live keycape-config now sets userOU: ou=people and groupOU: ou=groups, and Railiance runs image main-06d20c3.

2026-05-25: End-to-end OIDC login verification succeeded for platform-root. The local bootstrap-console callback exchanged the code and showed issuer https://kc.coulomb.social, audience netkingdom-bootstrap-console, subject uid=platform-root,ou=people,dc=netkingdom,dc=local, email bernd.worsch@gmail.com, and group net-kingdom-admins. Local non-secret bootstrap progress now records both MFA enrollment confirmation and OIDC login verification.
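
A check of the decoded claims against the values recorded in this entry might look like the sketch below. It assumes the token signature has already been verified elsewhere and that group membership arrives in a `groups` claim. Both points are assumptions, as is the function name.

```python
# Expected values as recorded in the verification entry above.
EXPECTED = {
    "iss": "https://kc.coulomb.social",
    "aud": "netkingdom-bootstrap-console",
    "sub": "uid=platform-root,ou=people,dc=netkingdom,dc=local",
}
REQUIRED_GROUP = "net-kingdom-admins"


def login_check_problems(claims: dict) -> list:
    """Compare already-verified ID token claims with the expected bootstrap values."""
    problems = []
    if claims.get("iss") != EXPECTED["iss"]:
        problems.append("unexpected issuer")
    aud = claims.get("aud")
    # OIDC allows "aud" to be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if EXPECTED["aud"] not in audiences:
        problems.append("bootstrap console is not an audience")
    if claims.get("sub") != EXPECTED["sub"]:
        problems.append("unexpected subject")
    if REQUIRED_GROUP not in claims.get("groups", []):  # claim name is an assumption
        problems.append(f"missing group {REQUIRED_GROUP}")
    return problems
```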

2026-05-25: Reworked the bootstrap-console flow after operator review. The UI now follows the use case top to bottom, hides hardware-token storage unless the selected policy uses hardware tokens, specifies the exact recovery material contents, distinguishes recovery material from the OpenBao custody packet, and turns "no secret capture" into an automatic control-surface boundary gate rather than a user checkbox.

2026-05-25: Corrected the custody/OpenBao ordering in the console: strategy selection now comes before recovery/packet preparation, the custody packet is prepared for the selected strategy before approval, and the OpenBao panel now explains when to run Railiance preflight, init/unseal, post-unseal configuration, root-token disposition, and restore proof. The console still refuses to capture root tokens or unseal shares.

2026-05-25: Restructured the bootstrap UI around the operator mental model: Roles & Responsibilities, Subsystems & Scope, Integration & Tests, and Artefacts & Locations. Role, subsystem, integration, and artefact rows now use the same name, description, subsystem, responsibility, location, and state fields, and console commands are shown as copyable command blocks.

2026-05-25: Refined the new model after operator review: role chips now sit under subsystem labels to keep artefact rows narrow, responsibility editing is inside a dirty-state Save/Cancel foldout, future quorum contact uses the same effective-value prefill as the role display, and command cards now derive blocked, todo, redo, or done status from bootstrap metadata.

2026-05-25: Added a Usecases & Runbooks section for trial-output exposure and key-material compromise. The UI now records non-secret compromise response state, separates "init output produced" from "initialized and unsealed", and adds guided command cards for unseal and OpenBao rotate-keys replacement share generation.

2026-05-25: Changed compromised/trial-exposed OpenBao material from a hard block into an explicit taint model. Affected artefacts and downstream command cards are shown with a light red background and retain the source reference, but the operator can still proceed deliberately on a tainted workpath.

2026-05-25: Split OpenBao initial configuration from root-token disposition in the bootstrap console. The initial config command can now be recorded as applied while root-token revocation/escrow remains a separate gate.

2026-05-25: Added an Emergency lock-down runbook for sealing Railiance OpenBao without placing tokens on the command line. Reordered the console into Introduction & Actors, Subsystems & Scopes, Roles & Responsibilities, Integration & Tests, Artefacts & Locations, Usecases & Runbooks, and Terminology & Patterns.

2026-05-25: Added Restore drill runbook action cards so the existing confirmation checkbox has a concrete path: prepare a restricted workspace, create/copy/hash an OpenBao Raft snapshot, encrypt it to the custodian age recipient, complete an isolated restore proof, rerun post-unseal verification, and record only non-secret completion evidence.
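
The hashing step of the restore drill could be sketched as follows. SHA-256 and the evidence field names are assumptions, since the entry does not name the algorithm or the record format. The record holds only the file name, size, and digest, never snapshot contents.

```python
import hashlib
from pathlib import Path


def snapshot_sha256(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a snapshot file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def evidence_record(path: Path) -> dict:
    """Build non-secret completion evidence for a snapshot (hypothetical field names)."""
    return {
        "snapshot_name": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": snapshot_sha256(path),
    }
```

Taking the hash before encrypting to the custodian age recipient lets the isolated restore proof confirm that the decrypted snapshot matches the one that was captured.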

2026-05-25: Refined the action/runbook model in the control surface: Integration & Tests now carries stateful runbook tasks and gates, while Usecases & Runbooks contains status-less action cards and neutral runbook templates. Added copyable OpenBao inspection actions for bao audit list, bao secrets list, and bao auth list with local hidden token prompts, removed duplicate OpenBao status/unseal cards from the stateful Integration command list, and restored Artefacts & Locations above Usecases & Runbooks in the workflow.

2026-05-25: Added a five-stage visual stage rail and Final Handover section to close the gap between OpenBao bootstrap and the final operating state. The stage model now moves from S3 to S4 after OpenBao initial configuration, root-token disposition, and restore drill are complete, then to S5 only when the platform is explicitly reopened under custody.
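
The stage transitions in this entry can be sketched as a small function over non-secret progress flags. The flag names are hypothetical, and stages before S3 are outside the sketch because the entry does not define them.

```python
# Gates that must all be complete before the stage rail moves from S3 to S4.
S4_GATES = ("openbao_initial_config", "root_token_disposition", "restore_drill")


def current_stage(progress: dict) -> str:
    """Return "S3", "S4", or "S5" from recorded bootstrap progress flags."""
    if not all(progress.get(gate) for gate in S4_GATES):
        return "S3"
    # S5 needs an explicit reopening decision; finished gates alone are not enough.
    if progress.get("reopened_under_custody"):
        return "S5"
    return "S4"
```

Checking the gates first means a stray reopening flag cannot skip the OpenBao handover work.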

2026-05-25: Corrected the OpenBao rotate-keys action cards after the operator hit "permission denied" on rotation init. The rotation commands now open an interactive pod TTY, prompt there for a root/sudo-capable OpenBao token, keep the token out of the local command line, and then run rotate init, share submission, or cancel.

2026-05-26: Added an explicit rotation-status action and clarified the rotation flow after the operator successfully started rotate-keys and then hit "rotation already in progress" by rerunning init. The UI now says init is a run-once step and that the next step is checking status or submitting existing shares with the nonce until quorum completes.

2026-05-26: Added a Usecases action card for creating the temporary Railiance OpenBao platform-admin token with bao token create -policy=platform-admin -period=24h -orphan. The command prompts for the bootstrap/root token without placing it on the command line and reminds the operator to store the emitted token through the approved secret path.
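
One way to keep the token off the command line is sketched below: read it with a hidden prompt and hand it to the CLI through the environment instead of argv. Passing it via the BAO_TOKEN environment variable is an assumption of this sketch, and the action card itself may work differently. The function names are hypothetical, and BAO_ADDR is expected to be set in the calling environment.

```python
import getpass
import os
import subprocess

# Command as recorded in the workplan entry above.
CREATE_ARGV = ["bao", "token", "create", "-policy=platform-admin", "-period=24h", "-orphan"]


def build_invocation(token, base_env=None):
    """Return (argv, env) with the token carried in the environment, never in argv."""
    env = dict(os.environ if base_env is None else base_env)
    env["BAO_TOKEN"] = token  # assumption: the CLI reads its token from this variable
    return list(CREATE_ARGV), env


def create_platform_admin_token():
    # Hidden prompt: the token is not echoed and does not land in shell history.
    token = getpass.getpass("Bootstrap/root token (input hidden): ")
    argv, env = build_invocation(token)
    # The emitted token is printed by the CLI; store it through the approved secret path.
    subprocess.run(argv, env=env, check=True)
```

Environment variables remain readable by processes running as the same user, so this narrows exposure compared with argv but does not eliminate it.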

2026-05-26: Promoted the KeyCape-to-OpenBao admin path into its own stage before cleanup and hardening. The control surface now has S4 Admin Identity Integration with gates for the dedicated KeyCape OpenBao client, OpenBao OIDC/JWT auth configuration, and MFA-backed OpenBao admin login verification; cleanup and reopening move to S5/S6.

2026-05-26: Refined the OpenBao trial-exposure taint model so direct unseal-share taint clears after confirmed unseal-key rotation, and direct initial-root-token taint clears after the exposed OpenBao root token is revoked. Downstream work remains visibly tainted until derived access paths are reviewed and the compromise response is explicitly recorded complete.
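
The clearing rules in this entry can be written down as three predicates. The sketch below uses hypothetical field names for the non-secret compromise-response state and is not the control surface's actual data model.

```python
from dataclasses import dataclass


@dataclass
class CompromiseState:
    unseal_shares_exposed: bool = False
    root_token_exposed: bool = False
    unseal_keys_rotated: bool = False       # confirmed unseal-key rotation
    root_token_revoked: bool = False        # exposed root token revoked
    derived_paths_reviewed: bool = False
    response_recorded_complete: bool = False


def unseal_share_tainted(s: CompromiseState) -> bool:
    # Direct taint on unseal shares clears once rotation is confirmed.
    return s.unseal_shares_exposed and not s.unseal_keys_rotated


def root_token_tainted(s: CompromiseState) -> bool:
    # Direct taint on the initial root token clears once it is revoked.
    return s.root_token_exposed and not s.root_token_revoked


def downstream_tainted(s: CompromiseState) -> bool:
    # Downstream work stays tainted until the review AND the recorded completion are both done.
    exposed = s.unseal_shares_exposed or s.root_token_exposed
    return exposed and not (s.derived_paths_reviewed and s.response_recorded_complete)
```

Keeping the downstream predicate independent of rotation and revocation reflects that fixing the exposed material does not prove the access paths derived from it are clean.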

2026-05-24: Stepped back from ad hoc secret rollout and added the custodian age-key bootstrap model to the control surface. The UI now records the custodian public age recipient, a derived fingerprint, and a non-secret private-key custody reference while refusing to treat the private key as normal metadata. It also detects encrypted bootstrap bundle presence and plaintext sso-mfa/bootstrap/secrets/ exposure. This is the intended foundation for trial-mode, custody-mode, unlock/apply, and later OpenBao handover flows.
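
A rough sketch of the two checks follows: deriving a short fingerprint from the public age recipient, and listing files under the secrets directory that do not look encrypted. The fingerprint scheme (a truncated SHA-256 of the recipient string) and the `.age` suffix convention are assumptions. The entry does not say how the control surface derives the fingerprint or recognises encrypted files.

```python
import hashlib
from pathlib import Path


def recipient_fingerprint(recipient: str) -> str:
    """Derive a short display fingerprint from a public age recipient string."""
    value = recipient.strip()
    # Private age identities start with "AGE-SECRET-KEY-"; refuse them as metadata.
    if value.upper().startswith("AGE-SECRET-KEY-"):
        raise ValueError("refusing to handle a private age key as metadata")
    if not value.startswith("age1"):
        raise ValueError("not an age public recipient")
    return hashlib.sha256(value.encode("ascii")).hexdigest()[:16]


def plaintext_exposed(secrets_dir: Path) -> list:
    """List files under secrets_dir that lack the (assumed) .age suffix."""
    if not secrets_dir.is_dir():
        return []
    return sorted(
        str(p.relative_to(secrets_dir))
        for p in secrets_dir.rglob("*")
        if p.is_file() and p.suffix != ".age"
    )
```

For example, `plaintext_exposed(Path("sso-mfa/bootstrap/secrets"))` returns an empty list when the directory is absent or holds only `.age` files.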

T04 - Complete Railiance OpenBao Bootstrap Ceremony

id: NET-WP-0015-T04
status: blocked
priority: high
state_hub_task_id: "2102366e-064b-4071-8b6a-574d9d37d109"

Coordinate with RAIL-PL-WP-0002-T03 to initialize and unseal OpenBao under the king credential model, enable audit and the first mounts/policies, create a non-root platform-admin access path, and revoke or offline-escrow the initial root token.

T05 - Provision First NetKingdom Admin Identity

id: NET-WP-0015-T05
status: todo
priority: high
state_hub_task_id: "d2a81d7b-9964-4bd5-9b8c-ef1324e02cd4"

Provision the first king/admin identity in the selected NetKingdom IAM implementation. The target claims are tenant=platform, principal_type=human or break_glass, MFA-backed assurance, and groups/roles for platform-root, platform-admin, netkingdom-admin, and railiance-platform-admin. tegwick may receive delegated day-to-day admin roles later, but must be revocable without losing root custody.
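
The target claims can be expressed as a small check, sketched here under stated assumptions: the claim names `tenant`, `principal_type`, and `groups` follow the wording above, `mfa` is a hypothetical boolean standing in for whatever assurance claim the selected IAM implementation emits, and requiring all four groups at once is this sketch's reading of the task.

```python
TARGET_GROUPS = {
    "platform-root",
    "platform-admin",
    "netkingdom-admin",
    "railiance-platform-admin",
}
PRINCIPAL_TYPES = {"human", "break_glass"}


def target_claim_problems(claims: dict) -> list:
    """Report how a claim set differs from the target for the first king/admin identity."""
    problems = []
    if claims.get("tenant") != "platform":
        problems.append("tenant must be 'platform'")
    if claims.get("principal_type") not in PRINCIPAL_TYPES:
        problems.append("principal_type must be 'human' or 'break_glass'")
    if claims.get("mfa") is not True:  # hypothetical assurance flag
        problems.append("identity must carry MFA-backed assurance")
    missing = TARGET_GROUPS - set(claims.get("groups", []))
    if missing:
        problems.append("missing groups: " + ", ".join(sorted(missing)))
    return problems
```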

T06 - Bind OpenBao Admin Auth To NetKingdom IAM

id: NET-WP-0015-T06
status: in_progress
priority: medium
state_hub_task_id: "ef97f3cb-9792-4b9d-bd2b-8871d368a50f"

Replace temporary operator tokens with NetKingdom IAM-backed OpenBao admin auth when the issuer and claim mapping are ready. The OpenBao root token must not be the normal admin path.

T07 - Verify Recovery, Audit, And Rotation

id: NET-WP-0015-T07
status: todo
priority: medium
state_hub_task_id: "aa40cbb4-36d3-405d-b59d-0c21ae8c9539"

Confirm snapshot/restore drill, durable audit-log handling, root-token disposition, unseal/recovery rotation expectations, and the follow-up owner for adding at least one additional human escrow holder.

T08 - Reset, Rotate, And Reopen Under King Oversight

id: NET-WP-0015-T08
status: todo
priority: high
state_hub_task_id: "e6a60dca-547b-4493-a36c-f6b668d1bf52"

After the king credential accepts custody, reset or rotate bootstrap-era database credentials, admin passwords, service tokens, OpenBao tokens, and temporary access paths. Run host/workload checks and reopen the platform only after the new custody state is verified.

Acceptance Criteria

  • The setup operator and king credential model are recorded without secret values.
  • The custody mode is explicit before OpenBao initialization.
  • OpenBao root-token use is limited to bootstrap or break-glass handling.
  • Routine admin access has a non-root path and a target NetKingdom IAM path.
  • Production readiness has a clear gate for independent escrow, audit, restore, reset/rotation, and reopening under king oversight.