net-kingdom/workplans/NET-WP-0017-it-security-readiness-for-user-onboarding.md

id: NET-WP-0017
type: workplan
title: IT Security Readiness For User Onboarding
domain: netkingdom
repo: net-kingdom
status: active
owner: codex
topic_slug: netkingdom
created: 2026-05-26
updated: 2026-06-01
depends_on: [NET-WP-0015, NET-WP-0016, RAIL-PL-WP-0002]
state_hub_workstream_id: 385de708-fd59-4bab-a4f4-28c1c476b3ea

NET-WP-0017 - IT Security Readiness For User Onboarding

Goal

Finish the remaining NetKingdom and Railiance security setup needed before ordinary platform users, tenant admins, or fabric admins are onboarded.

NET-WP-0015 established the king credential, OpenBao bootstrap ceremony, and guided control surface. This workplan is the narrower finish-line plan: routine admin access must use NetKingdom identity, bootstrap-era material must be retired or explicitly accepted, audit/recovery posture must be credible, and a first non-root onboarding dry run must prove the lifecycle model.

Current Evidence

  • platform-root exists in LLDAP, belongs to net-kingdom-admins, has MFA, and completed KeyCape OIDC login.
  • Railiance OpenBao is initialized, unsealed, and post-unseal verified.
  • OpenBao initial configuration was applied; platform/ KV and Kubernetes auth exist.
  • The initial OpenBao root token is recorded as revoked.
  • Trial unseal shares were rotated.
  • The KeyCape openbao-admin client is live and verified, including the public https://kc.coulomb.social route and certificate.
  • OpenBao OIDC auth configuration is applied; MFA-backed OpenBao admin login completed successfully and the resulting token lookup showed the platform-admin policy for platform-root.
  • Declarative local OpenBao audit and authenticated audit visibility are complete; enterprise durable tenant-aware audit retention has been split into the standalone audit-core product. Residual taint closeout, cleanup/rotation, and the first ordinary-user onboarding dry run are still pending.

Tasks

T01 - Finish OIDC-Backed OpenBao Admin Login

id: NET-WP-0017-T01
status: done
priority: high
state_hub_task_id: "9b087bbd-631b-4316-b94d-a8265a05b065"

Run the fixed OpenBao OIDC helper, record the non-secret completion flag, then verify platform-root can complete:

bao login -method=oidc -path=keycape role=platform-admin

The verification must prove the resulting OpenBao token has the intended platform-admin policy without relying on the initial root token or a manually minted temporary operator token.
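
The policy check described above can be sketched as a small parser over token lookup JSON. This is a minimal sketch, not the Railiance or NetKingdom helper: the function name check_admin_token is hypothetical, and the data.policies and data.meta layout is an assumption based on the Vault-compatible JSON that bao token lookup -format=json is expected to emit. It reports findings only and never echoes the token itself.

```python
import json


def check_admin_token(lookup_json, expected_policy="platform-admin",
                      expected_role="platform-admin",
                      expected_username="platform-root"):
    """Return a list of problems in a token lookup response; empty means it passed.

    Assumed input shape: {"data": {"policies": [...], "meta": {...}}}.
    """
    data = json.loads(lookup_json).get("data") or {}
    policies = data.get("policies") or []
    meta = data.get("meta") or {}

    problems = []
    if expected_policy not in policies:
        problems.append("missing policy: " + expected_policy)
    if "root" in policies:
        # The proof must not rely on root authority.
        problems.append("token carries the root policy")
    if meta.get("username") != expected_username:
        problems.append("unexpected username metadata")
    if meta.get("role") != expected_role:
        problems.append("unexpected role metadata")
    return problems
```

An empty list means the token shows platform-admin for platform-root without root authority; any other result lists what to investigate before closing the gate.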

2026-05-29: DNS and ACME issuance for kc.coulomb.social are healthy: cert-manager issued kc-tls, and sso-mfa/k8s/keycape/verify-openbao-client.sh passes against the live KeyCape route. configure-openbao-oidc.sh has applied the OpenBao auth/keycape OIDC configuration and platform-admin role. The remaining T01 gate is the human browser login with MFA and a token lookup that shows the expected OpenBao platform-admin policy.

2026-06-01: Added a guided console recovery action for the observed privacyIDEA state-loss blocker: if the live instance lacks the coulomb realm, LLDAP resolver, or self-service policies, the operator can run "Repair privacyIDEA realm and self-service" from Usecases & Runbooks. The action does not store secrets; it calls repair-realm-live.sh, prompts the operator for values at run time, creates temporary env files for bootstrap-realm.sh, removes them on exit, and then runs verify-t06.sh. After repair, platform-root TOTP enrollment/re-enrollment and the MFA-backed bao login proof are still required.

2026-06-01: Fixed the follow-up OpenBao OIDC token-exchange error "user not found", caused by live keycape-config drift: the Secret had lost the non-secret LLDAP lookup fields userOU: ou=people and groupOU: ou=groups. The KeyCape live patch helper now enforces those fields alongside the openbao-admin client, the live Secret was patched, KeyCape was restarted, and verify-openbao-client.sh passes again.

2026-06-01: Deployed a KeyCape runtime lookup fix for the remaining "user not found" token-exchange failure after config drift was ruled out. The LDAP adapter now treats provisioning metadata validation failures as runtime warnings instead of blocking token issuance for an otherwise resolved LLDAP user. The patched image main-runtime-lookup-0601 is live and verify-openbao-client.sh passes after rollout.

2026-06-01: Deployed the follow-up KeyCape OIDC nonce fix after OpenBao rejected the exchanged ID token with "invalid id_token nonce". KeyCape now persists the original authorization nonce through pending state and the authorization-code session, then emits it in the ID token. The patched image main-nonce-0601 is live, reports 1/1 ready, and verify-openbao-client.sh passes after rollout.

2026-06-01: Fixed the next OpenBao role configuration failure, "error converting claim 'groups' to string". KeyCape correctly emits groups as an array for groups_claim; OpenBao only failed because the role also copied that array through scalar claim_mappings. The helper now leaves groups in groups_claim/bound_claims and maps only scalar email and preferred_username metadata.
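
The corrected role shape can be illustrated with a small payload builder. This is a sketch under stated assumptions, not the actual configure-openbao-oidc.sh logic: build_oidc_role is a hypothetical name, the user_claim value and the binding to net-kingdom-admins are assumptions, and the payload is deliberately incomplete (for example, redirect URIs are omitted). It shows only the rule from the entry above: the array-valued groups claim is bound through groups_claim/bound_claims and never copied through claim_mappings.

```python
def build_oidc_role(bound_groups, scalar_claims):
    """Build a partial OIDC role payload that keeps array claims out of claim_mappings.

    scalar_claims maps an ID-token claim name to a token metadata key.
    """
    if "groups" in scalar_claims:
        # groups is emitted as an array; claim_mappings only handles scalar values.
        raise ValueError("groups is an array claim; bind it via groups_claim/bound_claims")
    return {
        "role_type": "oidc",
        "user_claim": "preferred_username",        # assumption, not confirmed by the workplan
        "groups_claim": "groups",
        "bound_claims": {"groups": list(bound_groups)},
        "claim_mappings": dict(scalar_claims),     # scalar metadata only
    }


role = build_oidc_role(
    ["net-kingdom-admins"],                         # assumed bound group
    {"email": "email", "preferred_username": "username"},
)
```

Rejecting a groups entry in claim_mappings at build time surfaces the mistake before the role is written, rather than at login.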

2026-06-01: The operator reached the OpenBao success page, "Signed in via your OIDC provider", after reapplying the corrected role. The follow-up terminal proof showed token_policies/policies containing platform-admin, token_meta_role: platform-admin, and token_meta_username: platform-root. T01 is closed; the pasted short-lived token should be treated as disclosed and revoked or allowed to expire after the check.

T02 - Close OpenBao Audit And Recovery Production Gates

id: NET-WP-0017-T02
status: in_progress
priority: high
state_hub_task_id: "909944bd-843a-4a63-8c87-536cea052a88"

Resolve the remaining OpenBao production-trust gates:

  • configure audit declaratively if API-managed audit remains rejected;
  • record the interim Audit Core interface used before enterprise durable audit retention is implemented;
  • hand off durable tenant-aware audit shipping beyond the audit PVC to audit-core;
  • retain non-secret restore-drill evidence and repeat the drill if any material changed;
  • record emergency seal/unseal drill evidence; and
  • identify the next independent escrow holder for moving beyond temporary single-king custody.

2026-06-01: Started the OpenBao audit/recovery closeout. Railiance source now has a declarative OpenBao file-audit stanza in helm/openbao-values.yaml, and its initial-config helper now verifies bao audit list instead of trying to create audit devices through the API. The Railiance post-unseal verifier also warns when /openbao/audit/openbao-audit.log is missing or empty. Live non-secret checks still show OpenBao healthy and unsealed with Bound data/audit PVCs, but the live Helm values do not yet include the declarative audit stanza and the audit directory is empty. Do not move production secrets into OpenBao until a planned Helm rollout is performed with unseal shares available, file/ audit is visible, an audit log is written, durable audit shipping beyond the PVC is selected, and restore/emergency drill evidence plus a next escrow holder are recorded.

2026-06-01: Completed the attended live rollout of the Railiance declarative file-audit configuration. The Helm release was upgraded, the OnDelete StatefulSet pod was deliberately recycled, the operator unsealed the new pod, and make openbao-verify-post-unseal now reports OpenBao 2.5.4, Sealed: false, an audit directory, and a non-empty /openbao/audit/openbao-audit.log. The Railiance source now pins the live OpenBao image tag to 2.5.4 after the chart upgrade advanced the runtime from 2.5.3; a follow-up Helm revision 3 applied the explicit tag while the pod remained ready. T02 remains open for the authenticated bao audit list proof, durable audit shipping beyond the audit PVC, restore-drill evidence, emergency seal/unseal drill evidence, and the next independent escrow holder.

2026-06-01: Added a Railiance evidence-only helper for the authenticated OpenBao proof: make openbao-verify-authenticated prompts for an approved OpenBao token without echoing it and verifies file/ audit visibility, platform/ secrets, kubernetes/ auth, keycape/ auth, and a non-empty audit log without mutating OpenBao configuration. The helper can also reuse a still-valid pod token helper with OPENBAO_VERIFY_AUTH_ARGS=--use-token-helper, avoiding token movement through the local shell. It is ready to run with the MFA-backed platform-root/platform-admin path. Durable audit shipping remains open; the audit PVC is not a durable sink and non-secret evidence hashes or State Hub notes are not substitutes for retained audit log custody.

2026-06-01: Completed the authenticated OpenBao proof through the MFA-backed KeyCape path without printing token material. A fresh bao login -no-print -method=oidc -path=keycape role=platform-admin browser flow cached the pod token helper, then make openbao-verify-authenticated OPENBAO_VERIFY_AUTH_ARGS=--use-token-helper passed. Evidence: OpenBao is unsealed on 2.5.4, file/ audit is visible, platform/ secrets are visible, kubernetes/ and keycape/ auth methods are visible, and the audit log grew from 7969 bytes to 23330 bytes during the check. The cached verifier token was then revoked with bao token revoke -self. T02 remains open for durable audit shipping beyond the audit PVC, restore-drill evidence, emergency seal/unseal drill evidence, and the next independent escrow holder.

2026-06-01: Split enterprise audit retention out of this task and into the new standalone /home/worsch/audit-core repo. audit-core now has INTENT.md, a product requirements definition, and a minimal replaceable mock backend that writes JSONL audit events to /tmp/audit-core/audit-YYYYMMDDTHH.jsonl and cleans up files older than seven days. A smoke event for the OpenBao authenticated readiness proof was written through the mock interface, and audit-core tests pass. This mock backend is acceptable for bootstrap/development wiring and NetKingdom UI integration, but it is not durable audit custody and must not be presented as enterprise retention. NET-WP-0017-T02 now treats the full tenant-aware durable audit fabric as an audit-core follow-up rather than an OpenBao bootstrap subtask. Remaining T02 gates are restore-drill evidence, emergency seal/unseal drill evidence, the next independent escrow holder, and an explicit risk note if ordinary onboarding proceeds before the production Audit Core sink exists.
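
The mock backend behaviour described above (hourly JSONL files plus seven-day cleanup) can be sketched as follows. This is an illustrative sketch, not the audit-core implementation: write_audit_event and its parameters are hypothetical, and both the UTC hour stamp and the use of file modification time for cleanup are assumptions. Like the mock it illustrates, it is not durable audit custody.

```python
import json
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

RETENTION_SECONDS = 7 * 24 * 3600  # seven days, as stated in the workplan


def write_audit_event(event: dict, root: Path = Path("/tmp/audit-core"),
                      now: Optional[float] = None) -> Path:
    """Append one event as a JSON line to the current hourly file, then prune old files."""
    now = time.time() if now is None else now
    root.mkdir(parents=True, exist_ok=True)

    # File name pattern audit-YYYYMMDDTHH.jsonl (UTC is an assumption).
    stamp = datetime.fromtimestamp(now, tz=timezone.utc).strftime("%Y%m%dT%H")
    path = root / f"audit-{stamp}.jsonl"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")

    # Remove audit files whose modification time is older than the retention window.
    for old in root.glob("audit-*.jsonl"):
        if now - old.stat().st_mtime > RETENTION_SECONDS:
            old.unlink()
    return path
```

Events passed in should carry only non-secret fields, since anything written here lands in plaintext under /tmp.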

2026-06-01: Tightened the restore-drill evidence gate. The local bootstrap metadata currently says restore_drill_passed: true, but that checkbox alone does not preserve enough non-secret evidence for review. Railiance now has a restore evidence JSON template and a validator, make openbao-validate-restore-evidence, that checks for snapshot hashes, encrypted-snapshot hash/location, isolated restore completion, unseal/status/test-secret verification, isolated environment destruction, and no_secret_material_recorded. The NetKingdom control surface now includes a "Validate restore drill evidence" runbook card. T02 should not count the restore gate as closed until a real non-secret evidence file from the prior or repeated drill passes that validator.
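
The kind of check such a validator performs can be sketched as below. This is not the Railiance validator: validate_restore_evidence and every field name except no_secret_material_recorded are hypothetical stand-ins for the categories listed above, and the SHA-256 format for hashes is an assumption.

```python
import re

SHA256_HEX = re.compile(r"^[0-9a-f]{64}$")

# Hypothetical field names; only no_secret_material_recorded appears in the workplan.
HASH_FIELDS = ("snapshot_sha256", "encrypted_snapshot_sha256")
TRUE_FIELDS = (
    "isolated_restore_completed",
    "unseal_verified",
    "status_verified",
    "test_secret_verified",
    "isolated_environment_destroyed",
    "no_secret_material_recorded",
)


def validate_restore_evidence(evidence: dict) -> list:
    """Return a list of problems with a restore evidence record; empty means valid."""
    problems = []
    for field in HASH_FIELDS:
        if not SHA256_HEX.match(str(evidence.get(field, ""))):
            problems.append(f"{field}: expected a 64-character lowercase hex digest")
    if not evidence.get("encrypted_snapshot_location"):
        problems.append("encrypted_snapshot_location: missing")
    for field in TRUE_FIELDS:
        # A bare checkbox elsewhere is not enough; each step must be asserted explicitly.
        if evidence.get(field) is not True:
            problems.append(f"{field}: expected true")
    return problems
```

A record containing only restore_drill_passed: true fails every check here, which is the point of the tightened gate.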

T03 - Close Trial Taint And Retire Bootstrap Admin Paths

id: NET-WP-0017-T03
status: todo
priority: high
state_hub_task_id: "a6cd4325-8f3b-46bb-b810-ca816c35cb29"

Review all access paths created during the trial exposure. Record the compromise response as complete only after the operator has rotated, revoked, reset, or explicitly accepted the residual risk for each of the following:

  • temporary OpenBao platform-admin tokens;
  • bootstrap/root-token-derived paths;
  • early LLDAP/Authelia/KeyCape admin credentials;
  • local plaintext secret workspaces;
  • bootstrap service tokens; and
  • any copied command output or local shell history that may contain secret values.
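
The closeout rule for the list above can be sketched as a disposition check. This is an illustrative sketch only: the identifiers in TRIAL_PATHS and the function open_taint_items are hypothetical labels for the six items listed, and the record stores a disposition word per item, never a secret value.

```python
ALLOWED_DISPOSITIONS = {"rotated", "revoked", "reset", "risk_accepted"}

# Hypothetical identifiers for the six trial-era access paths listed above.
TRIAL_PATHS = (
    "temporary_openbao_platform_admin_tokens",
    "bootstrap_root_token_derived_paths",
    "early_lldap_authelia_keycape_admin_credentials",
    "local_plaintext_secret_workspaces",
    "bootstrap_service_tokens",
    "copied_output_and_shell_history",
)


def open_taint_items(dispositions: dict) -> list:
    """Return the trial access paths that still lack an allowed disposition."""
    return [
        path for path in TRIAL_PATHS
        if dispositions.get(path) not in ALLOWED_DISPOSITIONS
    ]
```

The compromise response would be recorded as complete only when this returns an empty list.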

T04 - Harden Bootstrap Infrastructure Before User Onboarding

id: NET-WP-0017-T04
status: todo
priority: high
state_hub_task_id: "12c31f76-68f4-4d2b-853a-f3185cfc761c"

Complete the minimum hardening before ordinary users are onboarded:

  • restrict direct administrative access to LLDAP and privacyIDEA to approved operator networks or tunnels;
  • verify no privileged login path bypasses MFA for platform-admin authority;
  • rotate or reset bootstrap-era database, admin, and service credentials that were created before custody was established;
  • confirm host/workload checks and vulnerability scans have been run, or are explicitly deferred with a named owner and date; and
  • update the bootstrap console state to cleanup_complete only when these checks are recorded.

T05 - Implement First User Lifecycle Operator Flow

id: NET-WP-0017-T05
status: todo
priority: high
state_hub_task_id: "aec3ac45-18be-4b04-a863-0c8c70693739"

Turn the documented user lifecycle UX into the first practical operator flow for:

  • onboarding a scoped non-root user;
  • temporarily locking that user;
  • permanently offboarding that user;
  • reviewing credentials and MFA state; and
  • creating a fabric/tenant admin without platform-root authority.

The flow can begin as console/UI action cards, but it must show effective access before saving and must not expose secrets.
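
The "show effective access before saving" requirement can be sketched as a preview step that runs before any write. This is a sketch under loud assumptions: the GROUP_GRANTS table, the tenant-a group names, and both function names are hypothetical, and only the link between net-kingdom-admins and platform-admin comes from this workplan. The preview contains group and grant names only, no credentials.

```python
# Hypothetical mapping from directory groups to effective grants.
GROUP_GRANTS = {
    "net-kingdom-admins": {"platform-admin"},
    "tenant-a-admins": {"tenant-a:admin"},
    "tenant-a-users": {"tenant-a:use"},
}


def preview_effective_access(username: str, groups: list) -> dict:
    """Compute what a user would be able to do, without saving anything."""
    grants = set()
    unknown = []
    for group in groups:
        if group in GROUP_GRANTS:
            grants |= GROUP_GRANTS[group]
        else:
            unknown.append(group)
    return {
        "username": username,
        "grants": sorted(grants),
        "unknown_groups": unknown,
        "has_platform_admin": "platform-admin" in grants,
    }


def may_save(preview: dict) -> bool:
    """Block the save if access is unresolved or would confer platform-admin."""
    return not preview["unknown_groups"] and not preview["has_platform_admin"]
```

Separating the preview from the save lets an action card display the computed grants and require confirmation before anything changes.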

T06 - Run A Non-Root Onboarding Dry Run

id: NET-WP-0017-T06
status: todo
priority: high
state_hub_task_id: "c149b2f0-c9ee-4c95-a1df-b25ed0d20579"

Create a test or first real non-root user using the new lifecycle flow. Verify:

  • LLDAP identity and groups;
  • MFA enrollment through privacyIDEA;
  • KeyCape OIDC claims;
  • expected application or platform scope;
  • no platform-root or OpenBao root authority;
  • lock/offboard path can be exercised or simulated; and
  • non-secret audit/progress evidence is recorded.

This is the final gate before declaring the platform ready for normal user onboarding.
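
The claim and authority checks in the verification list above can be sketched as a function over decoded ID-token claims. This is a minimal sketch, not an existing helper: verify_non_root_claims is a hypothetical name, and treating membership in net-kingdom-admins as the marker of platform-root authority is an assumption drawn from the current evidence. The groups array and the preferred_username claim are the ones named in the T01 notes.

```python
def verify_non_root_claims(claims: dict, expected_groups: list,
                           admin_group: str = "net-kingdom-admins") -> list:
    """Return a list of problems with a dry-run user's OIDC claims; empty means OK."""
    groups = claims.get("groups")
    if not isinstance(groups, list):
        return ["groups claim is missing or not an array"]

    problems = []
    if admin_group in groups:
        problems.append("user carries the platform admin group")
    if claims.get("preferred_username") == "platform-root":
        problems.append("dry run must not use the platform-root identity")
    missing = sorted(set(expected_groups) - set(groups))
    if missing:
        problems.append("missing expected groups: " + ", ".join(missing))
    return problems
```

Only the pass/fail findings need to be recorded as evidence; the token and raw claims can stay out of the record.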

T07 - Review And Retire Superseded Bootstrap Workplans

id: NET-WP-0017-T07
status: todo
priority: medium
state_hub_task_id: "e9ceafb2-14c0-4352-9ac7-e31628feb045"

After T01-T06 are complete, review NET-WP-0015, NET-WP-0016, RAIL-PL-WP-0002, and older NetKingdom credential/bootstrap workplans. Mark completed work as finished or archived, and leave open only longer-horizon items such as multi-custodian upgrade, enterprise federation, dynamic database credentials, object-storage STS vending, and application onboarding contracts.

Acceptance Criteria

  • Routine OpenBao administration works through NetKingdom/KeyCape OIDC and MFA.
  • The initial root token and temporary OpenBao admin tokens are not normal operating paths.
  • Audit, recovery, emergency seal, and restore evidence are recorded without secret values.
  • Bootstrap-era privileged credentials have been rotated, reset, revoked, or explicitly accepted as residual risk.
  • A non-root user onboarding dry run succeeds and proves lock/offboard/review paths.
  • The bootstrap console can honestly move beyond Admin Identity Integration into cleanup and reopening.