---
id: NET-WP-0015
type: workplan
title: "King Credential And OpenBao Identity Bootstrap"
domain: netkingdom
repo: net-kingdom
status: finished
owner: codex
topic_slug: netkingdom
created: "2026-05-24"
updated: "2026-06-01"
depends_on:
- NK-WP-0006
- NK-WP-0012
state_hub_workstream_id: "6b9c25e4-1008-429a-8de6-54361872c0dd"
---
# NET-WP-0015 - King Credential And OpenBao Identity Bootstrap
## Goal
Define and execute the first safe bridge between low-trust setup operations, a
dedicated king credential, NetKingdom identity, and Railiance OpenBao
bootstrap.
The revised decision is that `tegwick` / `bernd.worsch@gmail.com` is the
initial accountable setup operator and notification contact, not the long-term
platform root of trust. The actual platform-root authority should move to a
separate king credential before OpenBao becomes live secret custody.
## Context
Railiance owns OpenBao deployment and operations. NetKingdom owns the identity,
custody, and security semantics that say who can administer the platform and
how that authority transitions from bootstrap material into normal IAM claims.
The platform is still in MVP/prototype bootstrap. That means early databases,
admin accounts, tokens, and access paths must be treated as potentially
contaminated by convenience. The platform should be assembled in low-trust
mode, then handed over to the king credential, reset/rotated, checked, and
reopened under explicit custody.
## Scope
In scope:
- record the setup operator/contact identity;
- define the separate king credential target;
- define the temporary single-operator king custody exception;
- specify target NetKingdom IAM claims for the first admin identity;
- coordinate the OpenBao initialization prerequisites with Railiance;
- define the transition from OpenBao root token to scoped admin access; and
- add follow-up gates for independent escrow, OIDC/JWT admin auth,
reset/rotation, scan checks, and restore verification.
Out of scope:
- storing any secret material in this repo;
- running `bao operator init` from an unattended agent session;
- deploying key-cape, Keycloak, privacyIDEA, or OpenBao itself; and
- granting tenant administrators platform-root authority.
## Tasks
### T01 - Record Setup Operator And King Credential Model
```task
id: NET-WP-0015-T01
status: done
priority: high
state_hub_task_id: "60659e25-fed1-478e-b8a3-4bc7b2f3846b"
```
Record `tegwick` / `bernd.worsch@gmail.com` / Gitea `tegwick` as the initial
setup operator and contact. Define the separate king credential as the actual
platform-root target.
**2026-05-24:** Added `docs/platform-root-custody.md` and updated
`docs/platform-identity-security-architecture.md` plus `SCOPE.md`.
**2026-05-24:** Revised the custody model: `tegwick` is no longer modeled as
the platform root of trust. The day-to-day account can assemble and observe the
platform, while a dedicated king credential receives final custody after the
guided bootstrap path is ready.
### T02 - Define King Credential Kit
```task
id: NET-WP-0015-T02
status: done
priority: high
state_hub_task_id: "1a1c45a2-be66-4667-89f8-581f4fe9970b"
```
Define the first king credential kit: dedicated identity name, local/offline
password-safe storage, second factor, recovery-code handling, no email secret
transfer, no day-to-day browsing/Git use, and operator instructions clear
enough for a non-expert.
**2026-05-24:** Defined the v1 kit in
`docs/security-bootstrap-king-credential-kit.md`: label `platform-root`, setup
operator/contact `tegwick`, notification-only email
`bernd.worsch@gmail.com`, local password safe plus offline custody packet,
TOTP/WebAuthn/hardware-token second factor, no day-to-day use, and no email or
Git secret transfer. Added
`examples/security-bootstrap/king-credential-metadata.example.json` plus
console validation for non-secret kit metadata. Custody-mode approval remains
blocked under T03.
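The "non-secret metadata only" rule for the kit can be illustrated with a small check. This is a minimal sketch, not the console's actual validator: the field names (`label`, `setup_operator`, `second_factor`) and the forbidden-key list are assumptions made for illustration, and only the `platform-root` label and the second-factor options come from the kit description above.

```python
# Hypothetical sketch: reject kit metadata that looks like it carries secrets.
# Field names and the forbidden-key list are illustrative assumptions.
FORBIDDEN_KEY_PARTS = ("password", "secret", "seed", "token", "recovery_code")
ALLOWED_SECOND_FACTORS = {"totp", "webauthn", "hardware-token"}


def validate_kit_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata passed."""
    problems = []
    # Any key whose name suggests secret material is refused outright.
    for key in meta:
        lowered = key.lower()
        if any(part in lowered for part in FORBIDDEN_KEY_PARTS):
            problems.append(f"key '{key}' looks like secret material")
    if meta.get("label") != "platform-root":
        problems.append("label must be 'platform-root'")
    if meta.get("second_factor") not in ALLOWED_SECOND_FACTORS:
        problems.append("second_factor must be totp, webauthn, or hardware-token")
    return problems
```

A key-name filter like this is only a coarse guard; it does not inspect values, so it complements rather than replaces operator discipline.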
### T03 - Approve King Custody Mode
```task
id: NET-WP-0015-T03
status: done
priority: high
state_hub_task_id: "56a6266a-4acd-41e6-a395-85e90a5c35c6"
```
Choose either the preferred independent two-of-three king custody model or an
explicit temporary single-operator king credential exception for pre-production
bootstrap. Do not run OpenBao initialization until this choice is recorded.
**2026-05-24:** Added local approval surfaces for this human gate:
`approve-custody-mode` for the CLI and `web-ui` for the localhost console.
Both write non-secret metadata only and keep live OpenBao initialization as a
separate attended ceremony. Current recommended approval mode is
`temporary-single-king`; `two-of-three-planned` records the target state but
does not unblock live init.
**2026-05-24:** Tightened MFA handling after review: a TOTP QR code or setup
key must come from the authority that will verify login, not from the local
metadata console. Custody approval now requires explicit non-secret
confirmation that the factor was enrolled with its real verifier.
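The two entries above amount to a small gate: only an approved `temporary-single-king` mode unblocks live init, and approval additionally needs the non-secret MFA-verifier confirmation. A minimal sketch, assuming hypothetical metadata keys (`custody_mode`, `custody_approved`, `mfa_enrolled_with_verifier`) that are not taken from the real console:

```python
# Hypothetical sketch of the custody-mode gate. Mode names come from the
# workplan; the metadata keys and function name are illustrative assumptions.
MODES_THAT_UNBLOCK_INIT = {"temporary-single-king"}
KNOWN_MODES = MODES_THAT_UNBLOCK_INIT | {"two-of-three-planned"}


def live_init_unblocked(meta: dict) -> bool:
    """True only when an approved mode that permits live init is recorded."""
    mode = meta.get("custody_mode")
    if mode not in KNOWN_MODES:
        return False
    # Approval requires confirmation that MFA was enrolled with its real verifier.
    approved = bool(meta.get("custody_approved")) and bool(
        meta.get("mfa_enrolled_with_verifier")
    )
    # `two-of-three-planned` records the target state but never unblocks init.
    return approved and mode in MODES_THAT_UNBLOCK_INIT
```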
**2026-05-24:** Clarified credential placement in the UI and custody docs:
the dedicated king account currently belongs in the lightweight NetKingdom
identity path (LLDAP user, Authelia login, privacyIDEA MFA, KeyCape OIDC).
OpenBao is the secrets/audit/admin-policy custody service after the ceremony,
not the place where the human password or OTP seed lives.
**2026-05-24:** Expanded the local UI toward a NetKingdom control surface:
the bootstrap flow now has action buttons for LLDAP, privacyIDEA, and KeyCape,
plus non-secret progress saving for account creation, MFA enrollment, OIDC
verification, and custody approval.
**2026-05-24:** Clarified the LLDAP first-user path in the UI and docs:
LLDAP has no registration flow; the operator logs in as bootstrap `admin`
using `LLDAP_LDAP_USER_PASS` from `net-kingdom/LLDAP/admin`, then creates the
dedicated `platform-root` or `king` account and assigns the current lightweight
admin group.
**2026-05-24:** Added explicit non-secret UI confirmations for the account
having been created, assigned to `net-kingdom-admins`, stored in the password
safe/offline packet, and later verified through the login path. Automated
LLDAP detection is deferred because it would require authenticated access to
LLDAP and should be built as an audited integration.
**2026-05-24:** Improved the KeyCape login-check path: the local bootstrap UI
now acts as the `demo-app` OIDC callback, exposes `/oidc/start` and
`/oidc/callback`, and adds hover-help text to the external action buttons.
The live KeyCape rollout still needs the updated `keycape-config` Secret
applied from decrypted `sso-mfa/bootstrap/secrets/` inputs. If the browser
flow reaches Authelia but never presents an OTP challenge, KeyCape needs a
browser MFA prompt surface before this gate can be marked verified.
**2026-05-24:** Filed `KEY-WP-0003` in the KeyCape repo for the current OIDC
verification blocker. The immediate error
`redirect_uri does not match any registered URI` means the local bootstrap
callback is not yet registered in live KeyCape. The follow-up KeyCape work also
covers the browser OTP challenge needed after Authelia password login.
**2026-05-24:** Implemented `KEY-WP-0003` in source. KeyCape now supports a
dedicated `netkingdom-bootstrap-console` client, split browser/server Authelia
URLs, and a browser OTP challenge before issuing the final OIDC code. The local
control surface now uses that dedicated client. Live verification remains
pending until the updated KeyCape image and regenerated `keycape-config` Secret
are rolled out.
**2026-05-24:** Rolled the fix to the public Railiance SSO host
(`kc.coulomb.social`, currently resolving to `railiance01`). The live
`keycape-config` Secret was patched without printing or rotating secret values,
the `main-1d68639` KeyCape image was direct-imported into k3s, and the
deployment was set to `IfNotPresent`. Public `/authorize` now accepts
`netkingdom-bootstrap-console` and redirects to
`https://auth.coulomb.social/...`. Follow-up: clean up the Gitea HTTP registry
push/pull path so direct image import is no longer needed.
**2026-05-24:** Fixed the next live login failure before OTP: Authelia rejected
KeyCape's token exchange because the upstream `keycape` client only permits
`client_secret_basic`, while KeyCape was sending `client_secret_post`. KeyCape
commit `56d279a` now uses HTTP Basic auth for the upstream token exchange, the
image `main-56d279a` was direct-imported into Railiance k3s, and the live
deployment runs that tag.
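The difference between the two client authentication methods is where the credentials travel. The sketch below is illustrative Python, not KeyCape's code; it builds a `client_secret_basic` request as described in RFC 6749 section 2.3.1, with hypothetical function names.

```python
import base64
from urllib.parse import quote_plus


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the Authorization header value for client_secret_basic."""
    # RFC 6749 section 2.3.1: form-urlencode both parts, then join with ':'.
    userpass = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    encoded = base64.b64encode(userpass.encode("utf-8")).decode("ascii")
    return f"Basic {encoded}"


def token_request(code: str, redirect_uri: str, client_id: str, client_secret: str):
    """Return (headers, form_body) for an authorization-code token exchange."""
    headers = {
        "Authorization": basic_auth_header(client_id, client_secret),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # With client_secret_basic the credentials stay out of the body;
    # client_secret_post would instead add client_id and client_secret here.
    body = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }
    return headers, body
```

An upstream that only permits `client_secret_basic` rejects a request whose credentials arrive in the form body, which matches the failure described above.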
**2026-05-24:** Fixed the follow-up `mfa check error`. Live privacyIDEA
validation succeeds in the `coulomb` realm, while KeyCape had been configured
for `netkingdom` and was also trying to pre-list tokens with an expired or
invalid privacyIDEA admin JWT. KeyCape commit `937cb39` adds bootstrap mode
`privacyidea.requireForAll`, which requires OTP for every authenticated user
without depending on token-list admin credentials. The live `keycape-config`
now uses `realm: coulomb` and `requireForAll: true`, and Railiance runs image
`main-937cb39`.
**2026-05-25:** Fixed the subsequent token-exchange `user not found` error.
Live LLDAP stores users under `ou=people`, while KeyCape's default lookup base
was `ou=users`. KeyCape commit `06d20c3` makes the LLDAP OU settings explicit
in YAML, live `keycape-config` now sets `userOU: ou=people` and
`groupOU: ou=groups`, and Railiance runs image `main-06d20c3`.
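Why the OU setting matters can be seen from how a lookup DN is assembled. A minimal sketch, assuming the base DN `dc=netkingdom,dc=local` seen in the verified subject; the helper name is hypothetical, and real code should also escape special characters in `uid`.

```python
def build_user_dn(uid: str, user_ou: str, base_dn: str) -> str:
    """Join the uid, the user OU, and the base DN into a full LDAP DN.

    Illustrative only: no RFC 4514 escaping is applied to `uid`.
    """
    return f"uid={uid},{user_ou},{base_dn}"


BASE_DN = "dc=netkingdom,dc=local"

# With the old default OU the DN points at an entry that does not exist.
wrong_dn = build_user_dn("platform-root", "ou=users", BASE_DN)
# With the explicit setting the DN matches where LLDAP stores users.
right_dn = build_user_dn("platform-root", "ou=people", BASE_DN)
```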
**2026-05-25:** End-to-end OIDC login verification succeeded for
`platform-root`. The local bootstrap-console callback exchanged the code and
showed issuer `https://kc.coulomb.social`, audience
`netkingdom-bootstrap-console`, subject
`uid=platform-root,ou=people,dc=netkingdom,dc=local`, email
`bernd.worsch@gmail.com`, and group `net-kingdom-admins`. Local non-secret
bootstrap progress now records both MFA enrollment confirmation and OIDC login
verification.
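The verification above can be expressed as a check over the decoded ID-token claims. This is a sketch under assumptions: it presumes the token signature was already validated elsewhere, and it assumes the group membership arrives in a `groups` claim; the expected values are the ones recorded in this entry.

```python
EXPECTED_ISSUER = "https://kc.coulomb.social"
EXPECTED_AUDIENCE = "netkingdom-bootstrap-console"
REQUIRED_GROUP = "net-kingdom-admins"


def check_bootstrap_claims(claims: dict) -> list[str]:
    """Return problems found in already-verified ID-token claims."""
    problems = []
    if claims.get("iss") != EXPECTED_ISSUER:
        problems.append("unexpected issuer")
    # `aud` may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if EXPECTED_AUDIENCE not in audiences:
        problems.append("token not issued for the bootstrap console")
    if not claims.get("sub"):
        problems.append("missing subject")
    # Assumption: group membership is carried in a `groups` claim.
    if REQUIRED_GROUP not in claims.get("groups", []):
        problems.append("subject is not in net-kingdom-admins")
    return problems
```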
**2026-05-25:** Reworked the bootstrap-console flow after operator review. The
UI now follows the use case top to bottom, hides hardware-token storage unless
the selected policy uses hardware tokens, specifies the exact recovery material
contents, distinguishes recovery material from the OpenBao custody packet, and
turns "no secret capture" into an automatic control-surface boundary gate
rather than a user checkbox.
**2026-05-25:** Corrected the custody/OpenBao ordering in the console:
strategy selection now comes before recovery/packet preparation, the custody
packet is prepared for the selected strategy before approval, and the OpenBao
panel now explains when to run Railiance preflight, init/unseal,
post-unseal configuration, root-token disposition, and restore proof. The
console still refuses to capture root tokens or unseal shares.
**2026-05-25:** Restructured the bootstrap UI around the operator mental model:
Roles & Responsibilities, Subsystems & Scope, Integration & Tests, and
Artefacts & Locations. Role, subsystem, integration, and artefact rows now use
the same `name`, `description`, `subsystem`, `responsibility`, `location`, and
`state` fields, and console commands are shown as copyable command blocks.
**2026-05-25:** Refined the new model after operator review: role chips now sit
under subsystem labels to keep artefact rows narrow, responsibility editing is
inside a dirty-state Save/Cancel foldout, future quorum contact uses the same
effective-value prefill as the role display, and command cards now derive
`blocked`, `todo`, `redo`, or `done` status from bootstrap metadata.
**2026-05-25:** Added a Usecases & Runbooks section for trial-output exposure
and key-material compromise. The UI now records non-secret compromise response
state, separates "init output produced" from "initialized and unsealed", and
adds guided command cards for unseal and OpenBao `rotate-keys` replacement
share generation.
**2026-05-25:** Changed compromised/trial-exposed OpenBao material from a hard
block into an explicit taint model. Affected artefacts and downstream command
cards are shown with a light red background and retain the source reference, but
the operator can still proceed deliberately on a tainted workpath.
**2026-05-25:** Split OpenBao initial configuration from root-token disposition
in the bootstrap console. The initial config command can now be recorded as
applied while root-token revocation/escrow remains a separate gate.
**2026-05-25:** Added an Emergency lock-down runbook for sealing Railiance
OpenBao without placing tokens on the command line. Reordered the console into
Introduction & Actors, Subsystems & Scopes, Roles & Responsibilities,
Integration & Tests, Artefacts & Locations, Usecases & Runbooks, and
Terminology & Patterns.
**2026-05-25:** Added Restore drill runbook action cards so the existing
confirmation checkbox has a concrete path: prepare a restricted workspace,
create/copy/hash an OpenBao Raft snapshot, encrypt it to the custodian age
recipient, complete an isolated restore proof, rerun post-unseal verification,
and record only non-secret completion evidence.
**2026-05-25:** Refined the action/runbook model in the control surface:
Integration & Tests now carries stateful runbook tasks and gates, while
Usecases & Runbooks contains status-less action cards and neutral runbook
templates. Added copyable OpenBao inspection actions for `bao audit list`,
`bao secrets list`, and `bao auth list` with local hidden token prompts,
removed duplicate OpenBao status/unseal cards from the stateful Integration
command list, and restored Artefacts & Locations above Usecases & Runbooks in
the workflow.
**2026-05-25:** Added a five-stage visual stage rail and Final Handover
section to close the gap between OpenBao bootstrap and the final operating
state. The stage model now moves from S3 to S4 after OpenBao initial
configuration, root-token disposition, and restore drill are complete, then to
S5 only when the platform is explicitly reopened under custody.
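The S3 to S5 transitions in this entry can be sketched as a pure function over bootstrap metadata. The metadata key names are hypothetical, and only the three stages named here are modeled; the stage numbering as of this entry is used.

```python
# Hypothetical metadata keys for the three gates that close S3.
S4_GATES = (
    "openbao_initial_config_applied",
    "root_token_disposed",
    "restore_drill_complete",
)


def current_stage(meta: dict) -> str:
    """Derive the stage from non-secret completion flags (S3 to S5 only)."""
    # S3 holds until all three OpenBao gates are recorded complete.
    if not all(meta.get(gate) for gate in S4_GATES):
        return "S3"
    # S5 requires an explicit reopening under custody, not just finished gates.
    if not meta.get("reopened_under_custody"):
        return "S4"
    return "S5"
```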
**2026-05-25:** Corrected the OpenBao rotate-keys action cards after the
operator hit `permission denied` on rotation init. The rotation commands now
open an interactive pod TTY, prompt there for a root/sudo-capable OpenBao
token, keep the token out of the local command line, and then run rotate init,
share submission, or cancel.
**2026-05-26:** Added an explicit rotation-status action and clarified the
rotation flow after the operator successfully started rotate-keys and then hit
`rotation already in progress` by rerunning init. The UI now says init is a
run-once step and that the next step is checking status or submitting existing
shares with the nonce until quorum completes.
**2026-05-26:** Added a Usecases action card for creating the temporary
Railiance OpenBao `platform-admin` token with
`bao token create -policy=platform-admin -period=24h -orphan`. The command
prompts for the bootstrap/root token without placing it on the command line
and reminds the operator to store the emitted token through the approved secret
path.
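One way to keep the bootstrap token off the command line is a hidden prompt plus an environment variable. The sketch below is not the action card itself, which runs inside the pod; it assumes a local `bao` binary and that the CLI reads the token from `BAO_TOKEN`. The function names are hypothetical.

```python
import getpass
import os
import subprocess


def build_token_create_command() -> list[str]:
    """Return the argv for the token creation; it never contains a token."""
    return ["bao", "token", "create", "-policy=platform-admin", "-period=24h", "-orphan"]


def create_platform_admin_token() -> None:
    """Prompt for the bootstrap/root token and run the command.

    Assumption: the `bao` CLI picks up the token from BAO_TOKEN.
    """
    # getpass hides the input and keeps it out of shell history.
    bootstrap_token = getpass.getpass("Bootstrap/root token (input hidden): ")
    # The token is passed through the child environment, not through argv.
    env = dict(os.environ, BAO_TOKEN=bootstrap_token)
    subprocess.run(build_token_create_command(), env=env, check=True)
```

The emitted `platform-admin` token still appears on standard output, so the reminder to store it through the approved secret path applies unchanged.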
**2026-05-26:** Promoted the KeyCape-to-OpenBao admin path into its own stage
before cleanup and hardening. The control surface now has S4 Admin Identity
Integration with gates for the dedicated KeyCape OpenBao client, OpenBao
OIDC/JWT auth configuration, and MFA-backed OpenBao admin login verification;
cleanup and reopening move to S5/S6.
**2026-05-26:** Refined the OpenBao trial-exposure taint model so direct
unseal-share taint clears after confirmed unseal-key rotation, and direct
initial-root-token taint clears after the exposed OpenBao root token is
revoked. Downstream work remains visibly tainted until derived access paths
are reviewed and the compromise response is explicitly recorded complete.
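The clearing rules in this entry can be sketched as two small functions. The metadata keys are hypothetical; the logic follows the entry: direct taint clears on rotation or revocation, while downstream taint stays until review and explicit closeout.

```python
def direct_taint(meta: dict) -> dict:
    """Return which directly exposed artefacts are still tainted."""
    return {
        # Unseal-share taint clears once unseal-key rotation is confirmed.
        "unseal_shares": bool(meta.get("unseal_shares_exposed"))
        and not meta.get("unseal_keys_rotated"),
        # Root-token taint clears once the exposed token is revoked.
        "initial_root_token": bool(meta.get("root_token_exposed"))
        and not meta.get("root_token_revoked"),
    }


def downstream_tainted(meta: dict) -> bool:
    """Downstream work stays tainted until review and closeout are recorded."""
    exposed = bool(meta.get("unseal_shares_exposed") or meta.get("root_token_exposed"))
    cleared = bool(
        meta.get("derived_paths_reviewed") and meta.get("compromise_response_complete")
    )
    return exposed and not cleared
```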
**2026-05-26:** Split Admin Identity Integration into development-owned
configuration and operator-owned integration work. The `openbao-admin` KeyCape
client is now code-defined in `sso-mfa/k8s/keycape/create-secrets.sh`, while
the UI action cards only ask the operator to apply live KeyCape config,
configure OpenBao with a protected token prompt, and verify MFA-backed login.
**2026-05-26:** Hardened the KeyCape OpenBao client deployment action after the
operator hit a non-executable `create-secrets.sh`. The action card now runs the
script through `bash`, uses absolute repo paths, and wraps the sequence in a
fail-fast heredoc so a failed config generation does not continue into a
KeyCape restart or verification.
**2026-05-26:** Removed the KeyCape OpenBao client action's dependency on
decrypted bootstrap secrets after the operator hit the (correctly) absent
`sso-mfa/bootstrap/secrets/` directory. Added a focused live Secret patcher and
verifier for the `openbao-admin` client so this non-secret client addition can
be applied without decrypting the full bootstrap secret bundle.
**2026-05-26:** Fixed the focused KeyCape OpenBao verifier after the live
KeyCape image lacked `wget`. The verifier now checks the live Secret and then
uses a short local `kubectl port-forward` plus Python HTTP request for OIDC
discovery, avoiding assumptions about tools installed inside the KeyCape
container.
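The Python side of such a check could look like the sketch below. It is not the verifier from the repo: it assumes a `kubectl port-forward` is already running in another process, that the local port is whatever the caller chooses, and that the standard OIDC discovery path is served. Function names are hypothetical.

```python
import json
import urllib.request


def fetch_discovery(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch the OIDC discovery document through the local port-forward."""
    url = base_url.rstrip("/") + "/.well-known/openid-configuration"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def check_discovery(doc: dict, expected_issuer: str) -> list[str]:
    """Return problems found in a discovery document."""
    problems = []
    if doc.get("issuer") != expected_issuer:
        problems.append("issuer mismatch")
    # Endpoints an authorization-code client needs to complete a login.
    for field in ("authorization_endpoint", "token_endpoint", "jwks_uri"):
        if not doc.get(field):
            problems.append(f"missing {field}")
    return problems
```

Keeping the fetch and the check separate lets the check run against a saved document, so it does not depend on tools inside the container or on a live connection.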
**2026-05-26:** Fixed the OpenBao OIDC auth setup after OpenBao rejected an
empty `oidc_client_secret` even though the current KeyCape `openbao-admin`
client is public PKCE. The UI now points to a short helper script instead of a
long nested shell/JSON command, and the helper writes an explicit non-secret
compatibility value until KeyCape supports confidential downstream clients.
**2026-05-24:** Stepped back from ad hoc secret rollout and added the
custodian age-key bootstrap model to the control surface. The UI now records
the custodian public age recipient, a derived fingerprint, and a non-secret
private-key custody reference while refusing to treat the private key as normal
metadata. It also detects encrypted bootstrap bundle presence and plaintext
`sso-mfa/bootstrap/secrets/` exposure. This is the intended foundation for
trial-mode, custody-mode, unlock/apply, and later OpenBao handover flows.
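A sketch of the public-versus-private distinction follows. It relies on the age text encodings (public recipients start with `age1`, private identities with `AGE-SECRET-KEY-`); the fingerprint scheme, a truncated SHA-256 of the recipient string, is an assumption for illustration and not necessarily what the control surface derives.

```python
import hashlib


def classify_age_value(value: str) -> str:
    """Classify a string as an age private key, public recipient, or unknown."""
    v = value.strip()
    if v.upper().startswith("AGE-SECRET-KEY-"):
        return "private-key"
    if v.startswith("age1"):
        return "public-recipient"
    return "unknown"


def recipient_fingerprint(recipient: str) -> str:
    """Derive a short non-secret fingerprint from a public recipient.

    Assumption: truncated SHA-256 hex; the real derivation may differ.
    """
    if classify_age_value(recipient) != "public-recipient":
        raise ValueError("refusing to fingerprint anything but a public age recipient")
    digest = hashlib.sha256(recipient.strip().encode("ascii")).hexdigest()
    return digest[:16]
```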
**2026-05-26:** Closed this custody-approval task after review against the
live bootstrap metadata: `platform-root` is recorded as the king credential,
MFA and KeyCape OIDC login are verified, and `temporary-single-king` custody is
explicitly approved for the pre-production OpenBao bootstrap. Remaining
hardening and user-onboarding readiness work is tracked in `NET-WP-0017`.
### T04 - Complete Railiance OpenBao Bootstrap Ceremony
```task
id: NET-WP-0015-T04
status: done
priority: high
state_hub_task_id: "2102366e-064b-4071-8b6a-574d9d37d109"
```
Coordinate with `RAIL-PL-WP-0002-T03` to initialize and unseal OpenBao under
the king credential model, enable audit and the first mounts/policies, create a
non-root `platform-admin` access path, and revoke or offline-escrow the initial
root token.
**2026-05-26:** Closed the bootstrap ceremony portion after live verification:
Railiance OpenBao is initialized, unsealed, and post-unseal verified; initial
configuration was applied; the initial OpenBao root token is recorded as
revoked; trial unseal shares were rotated; and restore-drill confirmation is
recorded in the bootstrap metadata. Declarative audit/durable audit shipping
and routine OIDC admin access remain follow-up readiness gates under
`NET-WP-0017` and `RAIL-PL-WP-0002`.
### T05 - Provision First NetKingdom Admin Identity
```task
id: NET-WP-0015-T05
status: done
priority: high
state_hub_task_id: "d2a81d7b-9964-4bd5-9b8c-ef1324e02cd4"
```
Provision the first king/admin identity in the selected NetKingdom IAM
implementation. The target claims are `tenant=platform`,
`principal_type=human` or `break_glass`, MFA-backed assurance, and groups/roles
for `platform-root`, `platform-admin`, `netkingdom-admin`, and
`railiance-platform-admin`. `tegwick` may receive delegated day-to-day admin
roles later, but must be revocable without losing root custody.
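The target claims can be written down as a check, which makes the acceptance condition concrete. This is a sketch under assumptions: `tenant` and `principal_type` are the names used above, while carrying groups/roles in a `groups` claim and expressing MFA assurance as `mfa` in the standard `amr` claim are illustrative choices.

```python
REQUIRED_ROLES = {
    "platform-root",
    "platform-admin",
    "netkingdom-admin",
    "railiance-platform-admin",
}
ALLOWED_PRINCIPAL_TYPES = {"human", "break_glass"}


def check_king_claims(claims: dict) -> list[str]:
    """Return problems in the claims of the first king/admin identity."""
    problems = []
    if claims.get("tenant") != "platform":
        problems.append("tenant must be 'platform'")
    if claims.get("principal_type") not in ALLOWED_PRINCIPAL_TYPES:
        problems.append("principal_type must be human or break_glass")
    # Assumption: MFA-backed assurance is signalled via the `amr` claim.
    if "mfa" not in claims.get("amr", []):
        problems.append("MFA-backed assurance not asserted")
    missing = REQUIRED_ROLES - set(claims.get("groups", []))
    if missing:
        problems.append("missing groups/roles: " + ", ".join(sorted(missing)))
    return problems
```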
**2026-05-26:** Closed for the bootstrap identity scope: the dedicated
`platform-root` user is recorded as created, assigned to
`net-kingdom-admins`, stored outside this repo, enrolled for MFA, and verified
through KeyCape OIDC. Richer IAM-profile claims for ordinary user onboarding
remain part of the user-onboarding readiness work in `NET-WP-0017`.
### T06 - Bind OpenBao Admin Auth To NetKingdom IAM
```task
id: NET-WP-0015-T06
status: done
priority: medium
state_hub_task_id: "ef97f3cb-9792-4b9d-bd2b-8871d368a50f"
```
Replace temporary operator tokens with NetKingdom IAM-backed OpenBao admin
auth when the issuer and claim mapping are ready. The OpenBao root token must
not be the normal admin path.
**2026-05-26:** The KeyCape `openbao-admin` client was code-defined, patched
into the live `keycape-config` Secret, rolled out, and verified without
requiring decrypted bootstrap secrets. At that point, OpenBao `auth/keycape`
still needed the fixed helper command and the MFA-backed
`bao login -method=oidc -path=keycape role=platform-admin` path still needed
verification.
**2026-06-01:** Added a guided bootstrap runbook action for the live
privacyIDEA state-loss case encountered during OpenBao OIDC login testing. The
new action recreates the `coulomb` realm, `lldap-coulomb` resolver,
self-enrollment policy, and phase-one passthrough policy by prompting for
`pi-admin` and LLDAP bind/admin passwords, writing them only to temporary
files through `repair-realm-live.sh`, and running `bootstrap-realm.sh` plus
`verify-t06.sh`. TOTP enrollment/re-enrollment and the final MFA-backed
OpenBao login verification remain operator steps.
**2026-06-01:** Closed after the `platform-root` MFA-backed OpenBao OIDC login
completed through KeyCape and the resulting token lookup showed
`platform-admin` in both token policy fields. The remaining OpenBao hardening,
audit, escrow, reset/rotation, and reopening gates continue under T07/T08 and
`NET-WP-0017`.
**2026-06-01:** Added OpenBao token revocation to the guided
Usecases & Runbooks section. The UI now includes a self-revoke card for the
current pod token-helper token and an accessor-based revocation card for
disclosed tokens, both keeping OpenBao token values off the local command line.
### T07 - Verify Recovery, Audit, And Rotation
```task
id: NET-WP-0015-T07
status: done
priority: medium
state_hub_task_id: "aa40cbb4-36d3-405d-b59d-0c21ae8c9539"
```
Confirm snapshot/restore drill, durable audit-log handling, root-token
disposition, unseal/recovery rotation expectations, and the follow-up owner
for adding at least one additional human escrow holder.
**2026-05-26:** Root-token disposition, unseal-key rotation, post-unseal
verification, and restore-drill confirmation are recorded. This task remains
open for declarative audit configuration/durable audit shipping, residual
taint-response closeout, and the next independent escrow holder.
**2026-06-01:** Closed for the bootstrap handoff scope. The bootstrap plan has
confirmed the available recovery/audit/rotation evidence and, more
importantly, now has explicit production-readiness follow-up gates:
`NET-WP-0017-T02` owns declarative/durable audit, restore evidence,
emergency seal/unseal drill evidence, and the next independent escrow holder;
`NET-WP-0017-T03` owns residual taint closeout. These items are no longer
tracked as unfinished bootstrap ceremony work.
### T08 - Reset, Rotate, And Reopen Under King Oversight
```task
id: NET-WP-0015-T08
status: done
priority: high
state_hub_task_id: "e6a60dca-547b-4493-a36c-f6b668d1bf52"
```
After the king credential accepts custody, reset or rotate bootstrap-era
database credentials, admin passwords, service tokens, OpenBao tokens, and
temporary access paths. Run host/workload checks and reopen the platform only
after the new custody state is verified.
**2026-06-01:** Closed as a bootstrap-plan handoff rather than as a claim that
all production cleanup is complete. `NET-WP-0017-T03` owns retirement of
bootstrap admin paths and residual taint response, `NET-WP-0017-T04` owns
bootstrap-era credential rotation/reset plus host/workload checks, and
`NET-WP-0017-T07` owns final review and retirement/archive of superseded
bootstrap workplans. `NET-WP-0018` will turn those gates into a smoother
bootstrap guide, control-surface automation, validations, and rebuild-risk
assessment.
## Closeout
**2026-06-01:** `NET-WP-0015` is finished. The first safe bridge is in place:
the dedicated `platform-root` identity exists outside day-to-day operator use,
custody mode is recorded, OpenBao was initialized and configured under the
bootstrap ceremony, the initial root token is not the normal admin path, and
routine OpenBao administration now works through NetKingdom/KeyCape OIDC with
MFA and the `platform-admin` policy. Remaining production-readiness work is
explicitly tracked in `NET-WP-0017`; rebuild automation and validation
improvements are tracked in `NET-WP-0018`.
## Acceptance Criteria
- The setup operator and king credential model are recorded without secret
values.
- The custody mode is explicit before OpenBao initialization.
- OpenBao root-token use is limited to bootstrap or break-glass handling.
- Routine admin access has a non-root path and a target NetKingdom IAM path.
- Production readiness has a clear gate for independent escrow, audit, restore,
reset/rotation, and reopening under king oversight.